© 2025 Direct Cursus Technology L.L.C.

In this article:

  • HTTP request
  • Body parameters
  • Response
  • TextAnnotation
  • Block
  • Polygon
  • Vertex
  • Line
  • Word
  • TextSegments
  • DetectedLanguage
  • Entity
  • Table
  • TableCell

Vision OCR API, REST: TextRecognition.Recognize

Written by Yandex Cloud
Updated on November 26, 2024

Sends an image for text recognition.

HTTP request

POST https://ocr.api.cloud.yandex.net/ocr/v1/recognizeText

Body parameters

{
  // Includes only one of the fields `content`
  "content": "string",
  // end of the list of possible fields
  "mimeType": "string",
  "languageCodes": [
    "string"
  ],
  "model": "string"
}

| Field | Type | Description |
| --- | --- | --- |
| content | string (bytes) | Bytes with data. Includes only one of the fields content. |
| mimeType | string | MIME type of the file to analyze. Restrictions: supported file formats are JPEG, PNG, and PDF; for the maximum file size, see the documentation; image size must not exceed 20 megapixels (length x width); a PDF file must not contain more than 1 page. |
| languageCodes[] | string | List of languages in which to recognize text. Specified in ISO 639-1 format (for example, ru). |
| model | string | Model to use for text detection. |
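
As a rough illustration of the body parameters above, the following sketch assembles the JSON body and prepares the POST request without sending it. It assumes that `content`, being a bytes field, is passed as base64 text in JSON. The helper names `build_body` and `build_request` are hypothetical, and the `Authorization` value, the `mimeType` value, and the `model` name are placeholders to be taken from the service documentation.

```python
import base64
import json
import urllib.request

OCR_URL = "https://ocr.api.cloud.yandex.net/ocr/v1/recognizeText"


def build_body(file_bytes, mime_type, language_codes, model):
    """Build the request body (hypothetical helper).

    Assumption: the bytes field `content` is sent as base64 text.
    """
    return {
        "content": base64.b64encode(file_bytes).decode("ascii"),
        "mimeType": mime_type,
        "languageCodes": list(language_codes),
        "model": model,
    }


def build_request(body, auth_header):
    """Prepare, but do not send, the POST request (hypothetical helper).

    `auth_header` is a placeholder; its exact format is an assumption here.
    """
    return urllib.request.Request(
        OCR_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )
```

The prepared request could then be passed to `urllib.request.urlopen` once real credentials and parameter values are filled in.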

Response

HTTP Code: 200 - OK

{
  "textAnnotation": {
    "width": "string",
    "height": "string",
    "blocks": [
      {
        "boundingBox": {
          "vertices": [
            {
              "x": "string",
              "y": "string"
            }
          ]
        },
        "lines": [
          {
            "boundingBox": {
              "vertices": [
                {
                  "x": "string",
                  "y": "string"
                }
              ]
            },
            "text": "string",
            "words": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": "string",
                      "y": "string"
                    }
                  ]
                },
                "text": "string",
                "entityIndex": "string",
                "textSegments": [
                  {
                    "startIndex": "string",
                    "length": "string"
                  }
                ]
              }
            ],
            "textSegments": [
              {
                "startIndex": "string",
                "length": "string"
              }
            ],
            "orientation": "string"
          }
        ],
        "languages": [
          {
            "languageCode": "string"
          }
        ],
        "textSegments": [
          {
            "startIndex": "string",
            "length": "string"
          }
        ]
      }
    ],
    "entities": [
      {
        "name": "string",
        "text": "string"
      }
    ],
    "tables": [
      {
        "boundingBox": {
          "vertices": [
            {
              "x": "string",
              "y": "string"
            }
          ]
        },
        "rowCount": "string",
        "columnCount": "string",
        "cells": [
          {
            "boundingBox": {
              "vertices": [
                {
                  "x": "string",
                  "y": "string"
                }
              ]
            },
            "rowIndex": "string",
            "columnIndex": "string",
            "columnSpan": "string",
            "rowSpan": "string",
            "text": "string",
            "textSegments": [
              {
                "startIndex": "string",
                "length": "string"
              }
            ]
          }
        ]
      }
    ],
    "fullText": "string",
    "rotate": "string"
  },
  "page": "string"
}

| Field | Type | Description |
| --- | --- | --- |
| textAnnotation | TextAnnotation | Recognized text blocks on the page or text from entities. |
| page | string (int64) | Page number in the PDF file. |
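
Because int64 fields are returned as JSON strings, a client usually converts them before use. The sketch below reads the top-level response fields. The function name `summarize_response` is hypothetical, and falling back to 0 or an empty string for a missing field is an assumption, not documented behavior.

```python
import json


def summarize_response(raw_json):
    """Return page number, page size, and full text from a response body."""
    response = json.loads(raw_json)
    annotation = response.get("textAnnotation", {})
    return {
        # int64 fields arrive as strings, so convert them explicitly.
        # Assumption: a missing field is treated as 0 (or "" for text).
        "page": int(response.get("page", "0")),
        "width": int(annotation.get("width", "0")),
        "height": int(annotation.get("height", "0")),
        "full_text": annotation.get("fullText", ""),
    }
```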

TextAnnotation

| Field | Type | Description |
| --- | --- | --- |
| width | string (int64) | Page width in pixels. |
| height | string (int64) | Page height in pixels. |
| blocks[] | Block | Recognized text blocks on this page. |
| entities[] | Entity | Recognized entities. |
| tables[] | Table | Recognized tables. |
| fullText | string | Full text recognized from the image. |
| rotate | enum (Angle) | Angle of image rotation. Possible values: ANGLE_UNSPECIFIED, ANGLE_0, ANGLE_90, ANGLE_180, ANGLE_270. |
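
The Angle values can be mapped to numbers with a small lookup table. This sketch assumes the numeric suffix of each value is the rotation in degrees. The direction of rotation is not stated here, and the name `rotation_degrees` is hypothetical.

```python
# Assumption: the suffix of each enum value is the angle in degrees.
ANGLE_DEGREES = {
    "ANGLE_0": 0,
    "ANGLE_90": 90,
    "ANGLE_180": 180,
    "ANGLE_270": 270,
}


def rotation_degrees(angle_name):
    """Return the angle in degrees, or None for ANGLE_UNSPECIFIED or unknown values."""
    return ANGLE_DEGREES.get(angle_name)
```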

Block

| Field | Type | Description |
| --- | --- | --- |
| boundingBox | Polygon | Area on the page where the text block is located. |
| lines[] | Line | Recognized lines in this block. |
| languages[] | DetectedLanguage | A list of detected languages. |
| textSegments[] | TextSegments | Position of the block in the fullText string. |

Polygon

| Field | Type | Description |
| --- | --- | --- |
| vertices[] | Vertex | The bounding polygon vertices. |

Vertex

| Field | Type | Description |
| --- | --- | --- |
| x | string (int64) | X coordinate in pixels. |
| y | string (int64) | Y coordinate in pixels. |
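
A Polygon is often reduced to an axis-aligned rectangle, for example to crop a region. The sketch below does this from the vertex list. The name `bounding_rect` is hypothetical, and calling the smallest y value "top" assumes the origin is the top-left corner of the page, which is not stated here.

```python
def bounding_rect(polygon):
    """Return (left, top, right, bottom) for a Polygon dict.

    Vertex coordinates are int64 values encoded as strings.
    Raises ValueError if the polygon has no vertices.
    """
    xs = [int(v["x"]) for v in polygon["vertices"]]
    ys = [int(v["y"]) for v in polygon["vertices"]]
    return min(xs), min(ys), max(xs), max(ys)
```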

Line

| Field | Type | Description |
| --- | --- | --- |
| boundingBox | Polygon | Area on the page where the line is located. |
| text | string | Recognized text. |
| words[] | Word | Recognized words. |
| textSegments[] | TextSegments | Position of the line in the fullText string. |
| orientation | enum (Angle) | Angle of line rotation. Possible values: ANGLE_UNSPECIFIED, ANGLE_0, ANGLE_90, ANGLE_180, ANGLE_270. |

Word

| Field | Type | Description |
| --- | --- | --- |
| boundingBox | Polygon | Area on the page where the word is located. |
| text | string | Recognized word value. |
| entityIndex | string (int64) | Index of the recognized word in the entities array. |
| textSegments[] | TextSegments | Position of the word in the fullText string. |
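
Blocks contain lines, and lines contain words, so reading every word means walking three nested arrays. A minimal sketch of that traversal follows. The name `iter_words` is hypothetical, and treating a missing array as empty is an assumption.

```python
def iter_words(text_annotation):
    """Yield (block_index, line_index, word_text) for every recognized word."""
    for block_index, block in enumerate(text_annotation.get("blocks", [])):
        for line_index, line in enumerate(block.get("lines", [])):
            for word in line.get("words", []):
                yield block_index, line_index, word.get("text", "")
```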

TextSegments

| Field | Type | Description |
| --- | --- | --- |
| startIndex | string (int64) | Start character position in the fullText string. |
| length | string (int64) | Text segment length. |
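
A text segment can be resolved back to its substring of the full text. The sketch below assumes `startIndex` is a zero-based character offset and `length` is counted in characters, with both matching Python string indexing. Neither detail is confirmed here, and the name `segment_text` is hypothetical.

```python
def segment_text(full_text, segments):
    """Return the substring of full_text covered by each TextSegments entry."""
    parts = []
    for segment in segments:
        # Both values are int64 fields encoded as strings.
        start = int(segment["startIndex"])
        length = int(segment["length"])
        parts.append(full_text[start:start + length])
    return parts
```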

DetectedLanguage

| Field | Type | Description |
| --- | --- | --- |
| languageCode | string | Detected language code. |

Entity

| Field | Type | Description |
| --- | --- | --- |
| name | string | Entity name. |
| text | string | Recognized entity text. |

Table

| Field | Type | Description |
| --- | --- | --- |
| boundingBox | Polygon | Area on the page where the table is located. |
| rowCount | string (int64) | Number of rows in the table. |
| columnCount | string (int64) | Number of columns in the table. |
| cells[] | TableCell | Table cells. |

TableCell

| Field | Type | Description |
| --- | --- | --- |
| boundingBox | Polygon | Area on the page where the table cell is located. |
| rowIndex | string (int64) | Row index. |
| columnIndex | string (int64) | Column index. |
| columnSpan | string (int64) | Column span. |
| rowSpan | string (int64) | Row span. |
| text | string | Text in the cell. |
| textSegments[] | TextSegments | Position of the table cell in the fullText string. |
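
The Table and TableCell fields are enough to rebuild a simple grid of cell texts. The sketch below assumes `rowIndex` and `columnIndex` are zero-based, which is not stated here. It ignores `rowSpan` and `columnSpan`, so a merged cell fills only its top-left position. The name `table_to_grid` is hypothetical.

```python
def table_to_grid(table):
    """Return a rowCount x columnCount list of lists holding cell texts."""
    rows = int(table["rowCount"])
    cols = int(table["columnCount"])
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for cell in table.get("cells", []):
        # Assumption: indices are zero-based; a missing index is treated as 0.
        row = int(cell.get("rowIndex", "0"))
        col = int(cell.get("columnIndex", "0"))
        grid[row][col] = cell.get("text", "")
    return grid
```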
