© 2025 Direct Cursus Technology L.L.C.
In this article:

  • gRPC request
  • RecognizeTextRequest
  • RecognizeTextResponse
  • TextAnnotation
  • Block
  • Polygon
  • Vertex
  • Line
  • Word
  • TextSegments
  • DetectedLanguage
  • Entity
  • Table
  • TableCell

Vision OCR API, gRPC: TextRecognitionService.Recognize

Written by
Yandex Cloud
Updated at November 26, 2024

Sends an image for text recognition.

gRPC request

rpc Recognize (RecognizeTextRequest) returns (stream RecognizeTextResponse)

RecognizeTextRequest

{
  // Includes only one of the fields `content`
  "content": "bytes",
  // end of the list of possible fields
  "mime_type": "string",
  "language_codes": [
    "string"
  ],
  "model": "string"
}

| Field | Type | Description |
| --- | --- | --- |
| content | bytes | Bytes with data. Includes only one of the fields: content. |
| mime_type | string | MIME type of the file passed in content. The file must meet the restrictions listed below this table. |
| language_codes[] | string | List of languages to recognize text in, specified in ISO 639-1 format (for example, ru). |
| model | string | Model to use for text detection. |

Restrictions on the file:

  • Supported file formats: JPEG, PNG, PDF.
  • Maximum file size: see documentation.
  • Image size should not exceed 20 megapixels (length x width).
  • The number of pages in a PDF file should not exceed 1.
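
As an illustration, the request fields and restrictions above can be checked on the client before a request is sent. This is a minimal sketch, not the SDK's API: `build_recognize_request` is a hypothetical helper, the plain dict it returns stands in for the generated RecognizeTextRequest message, and the exact strings the service accepts in mime_type are an assumption (the format names from the restrictions list are passed through unchanged). The language check accepts only two-letter lowercase codes, which may be stricter than the service itself.

```python
import re

# Formats named in the restrictions list. Assumption: the exact mime_type
# strings the service expects are not spelled out in this section.
SUPPORTED_FORMATS = {"JPEG", "PNG", "PDF"}
ISO_639_1 = re.compile(r"[a-z]{2}")


def build_recognize_request(content, file_format, language_codes, model=""):
    """Return a dict shaped like RecognizeTextRequest (hypothetical helper)."""
    if not isinstance(content, (bytes, bytearray)) or not content:
        raise ValueError("content must be non-empty bytes")
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file format: {file_format!r}")
    bad = [code for code in language_codes if not ISO_639_1.fullmatch(code)]
    if bad:
        raise ValueError(f"not ISO 639-1 codes: {bad}")
    return {
        "content": bytes(content),
        "mime_type": file_format,
        "language_codes": list(language_codes),
        "model": model,  # placeholder default, not a documented model name
    }


# Placeholder bytes, not a real image.
request = build_recognize_request(b"\x89PNG...", "PNG", ["ru", "en"])
```

Validating locally only catches obvious mistakes early; the service remains the authority on what it accepts.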

RecognizeTextResponse

{
  "text_annotation": {
    "width": "int64",
    "height": "int64",
    "blocks": [
      {
        "bounding_box": {
          "vertices": [
            {
              "x": "int64",
              "y": "int64"
            }
          ]
        },
        "lines": [
          {
            "bounding_box": {
              "vertices": [
                {
                  "x": "int64",
                  "y": "int64"
                }
              ]
            },
            "text": "string",
            "words": [
              {
                "bounding_box": {
                  "vertices": [
                    {
                      "x": "int64",
                      "y": "int64"
                    }
                  ]
                },
                "text": "string",
                "entity_index": "int64",
                "text_segments": [
                  {
                    "start_index": "int64",
                    "length": "int64"
                  }
                ]
              }
            ],
            "text_segments": [
              {
                "start_index": "int64",
                "length": "int64"
              }
            ],
            "orientation": "Angle"
          }
        ],
        "languages": [
          {
            "language_code": "string"
          }
        ],
        "text_segments": [
          {
            "start_index": "int64",
            "length": "int64"
          }
        ]
      }
    ],
    "entities": [
      {
        "name": "string",
        "text": "string"
      }
    ],
    "tables": [
      {
        "bounding_box": {
          "vertices": [
            {
              "x": "int64",
              "y": "int64"
            }
          ]
        },
        "row_count": "int64",
        "column_count": "int64",
        "cells": [
          {
            "bounding_box": {
              "vertices": [
                {
                  "x": "int64",
                  "y": "int64"
                }
              ]
            },
            "row_index": "int64",
            "column_index": "int64",
            "column_span": "int64",
            "row_span": "int64",
            "text": "string",
            "text_segments": [
              {
                "start_index": "int64",
                "length": "int64"
              }
            ]
          }
        ]
      }
    ],
    "full_text": "string",
    "rotate": "Angle"
  },
  "page": "int64"
}

| Field | Type | Description |
| --- | --- | --- |
| text_annotation | TextAnnotation | Recognized text blocks on the page or text from entities. |
| page | int64 | Page number in the PDF file. |
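
Because Recognize returns a stream of RecognizeTextResponse messages, a client reads them one at a time. The sketch below assumes each response has already been converted to a dict shaped like the JSON above; `collect_pages` is a hypothetical helper, and a plain iterator stands in for the gRPC stream. Calling `int()` on the page value covers both native integers and the string form that int64 takes in JSON.

```python
def collect_pages(responses):
    """Map page number -> full_text for each streamed response."""
    pages = {}
    for response in responses:
        annotation = response.get("text_annotation", {})
        pages[int(response.get("page", 0))] = annotation.get("full_text", "")
    return pages


# Stand-in for the gRPC stream: any iterable of response-shaped dicts.
stream = iter([
    {
        "text_annotation": {"width": "100", "height": "50", "full_text": "Hello\n"},
        "page": "0",
    },
])
pages = collect_pages(stream)
```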

TextAnnotation

| Field | Type | Description |
| --- | --- | --- |
| width | int64 | Page width in pixels. |
| height | int64 | Page height in pixels. |
| blocks[] | Block | Recognized text blocks on this page. |
| entities[] | Entity | Recognized entities. |
| tables[] | Table | Recognized tables. |
| full_text | string | Full text recognized from the image. |
| rotate | enum Angle | Angle of image rotation. One of: ANGLE_UNSPECIFIED, ANGLE_0, ANGLE_90, ANGLE_180, ANGLE_270. |
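
A client that needs a numeric rotation can map the Angle value names to degrees. This is only a sketch: `angle_to_degrees` is a hypothetical helper, the mapping is read off the value names themselves, and the direction of rotation is not stated in this section.

```python
# Degrees implied by each Angle value name; ANGLE_UNSPECIFIED has no angle.
ANGLE_DEGREES = {
    "ANGLE_0": 0,
    "ANGLE_90": 90,
    "ANGLE_180": 180,
    "ANGLE_270": 270,
}


def angle_to_degrees(angle_name):
    """Return the rotation in degrees, or None if unspecified or unknown."""
    return ANGLE_DEGREES.get(angle_name)


degrees = angle_to_degrees("ANGLE_90")
```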

Block

| Field | Type | Description |
| --- | --- | --- |
| bounding_box | Polygon | Area on the page where the text block is located. |
| lines[] | Line | Recognized lines in this block. |
| languages[] | DetectedLanguage | List of detected languages. |
| text_segments[] | TextSegments | Block position in the full_text string. |

Polygon

| Field | Type | Description |
| --- | --- | --- |
| vertices[] | Vertex | The bounding polygon vertices. |

Vertex

| Field | Type | Description |
| --- | --- | --- |
| x | int64 | X coordinate in pixels. |
| y | int64 | Y coordinate in pixels. |
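
A common use of Polygon and Vertex is to reduce a bounding polygon to an axis-aligned rectangle, for example to crop a region. The following is a minimal sketch assuming the polygon is available as a dict shaped like the JSON above; `bounding_rect` is a hypothetical helper. It makes no assumption about vertex order or about where the coordinate origin lies.

```python
def bounding_rect(polygon):
    """Return (min_x, min_y, max_x, max_y) enclosing all polygon vertices."""
    # int() accepts both native integers and int64 values serialized as strings.
    xs = [int(vertex["x"]) for vertex in polygon["vertices"]]
    ys = [int(vertex["y"]) for vertex in polygon["vertices"]]
    return min(xs), min(ys), max(xs), max(ys)


polygon = {
    "vertices": [
        {"x": "10", "y": "20"},
        {"x": "10", "y": "60"},
        {"x": "110", "y": "60"},
        {"x": "110", "y": "20"},
    ]
}
rect = bounding_rect(polygon)
```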

Line

| Field | Type | Description |
| --- | --- | --- |
| bounding_box | Polygon | Area on the page where the line is located. |
| text | string | Recognized text. |
| words[] | Word | Recognized words. |
| text_segments[] | TextSegments | Line position in the full_text string. |
| orientation | enum Angle | Angle of line rotation. One of: ANGLE_UNSPECIFIED, ANGLE_0, ANGLE_90, ANGLE_180, ANGLE_270. |
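
The annotation is nested: blocks contain lines, and lines contain words. The sketch below walks that hierarchy, assuming the TextAnnotation has been converted to a dict shaped like the response JSON; `iter_words` is a hypothetical helper, not part of any SDK.

```python
def iter_words(annotation):
    """Yield (block_index, line_index, word_text) for every recognized word."""
    for block_index, block in enumerate(annotation.get("blocks", [])):
        for line_index, line in enumerate(block.get("lines", [])):
            for word in line.get("words", []):
                yield block_index, line_index, word["text"]


annotation = {
    "blocks": [
        {
            "lines": [
                {
                    "text": "Hello world",
                    "words": [{"text": "Hello"}, {"text": "world"}],
                }
            ]
        }
    ]
}
words = list(iter_words(annotation))
```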

Word

| Field | Type | Description |
| --- | --- | --- |
| bounding_box | Polygon | Area on the page where the word is located. |
| text | string | Recognized word value. |
| entity_index | int64 | ID of the recognized word in the entities array. |
| text_segments[] | TextSegments | Word position in the full_text string. |

TextSegments

| Field | Type | Description |
| --- | --- | --- |
| start_index | int64 | Start character position in the full_text string. |
| length | int64 | Text segment length. |
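
Each text_segments entry points back into full_text, so the text of a block, line, word, or cell can be recovered by slicing. This is a minimal sketch under two assumptions that this section does not confirm: start_index and length are counted in characters as Python indexes a str, and multiple segments are concatenated in the order given. `segment_text` is a hypothetical helper.

```python
def segment_text(full_text, segments):
    """Join the slices of full_text described by a list of TextSegments."""
    parts = []
    for segment in segments:
        start = int(segment["start_index"])
        length = int(segment["length"])
        parts.append(full_text[start:start + length])
    return "".join(parts)


full_text = "Invoice 42\nTotal: 100\n"
word = segment_text(full_text, [{"start_index": "8", "length": "2"}])
```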

DetectedLanguage

| Field | Type | Description |
| --- | --- | --- |
| language_code | string | Detected language code. |

Entity

| Field | Type | Description |
| --- | --- | --- |
| name | string | Entity name. |
| text | string | Recognized entity text. |

Table

| Field | Type | Description |
| --- | --- | --- |
| bounding_box | Polygon | Area on the page where the table is located. |
| row_count | int64 | Number of rows in the table. |
| column_count | int64 | Number of columns in the table. |
| cells[] | TableCell | Table cells. |

TableCell

| Field | Type | Description |
| --- | --- | --- |
| bounding_box | Polygon | Area on the page where the table cell is located. |
| row_index | int64 | Row index. |
| column_index | int64 | Column index. |
| column_span | int64 | Column span. |
| row_span | int64 | Row span. |
| text | string | Text in the cell. |
| text_segments[] | TextSegments | Table cell position in the full_text string. |
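
The Table and TableCell fields carry enough information to rebuild a two-dimensional grid. The sketch below is one way to do it, under assumptions this section does not confirm: row_index and column_index are zero-based, and a missing or zero span means the cell covers a single row or column. `table_to_grid` is a hypothetical helper; a spanning cell's text is repeated in every grid position it covers.

```python
def table_to_grid(table):
    """Return a row_count x column_count list of lists filled with cell text."""
    rows = int(table["row_count"])
    cols = int(table["column_count"])
    grid = [[None] * cols for _ in range(rows)]
    for cell in table.get("cells", []):
        row = int(cell["row_index"])
        col = int(cell["column_index"])
        # Assumption: a missing or zero span is treated as a span of one.
        row_span = max(int(cell.get("row_span", 1)), 1)
        col_span = max(int(cell.get("column_span", 1)), 1)
        # Clamp to the declared table size so a bad span cannot overflow.
        for r in range(row, min(row + row_span, rows)):
            for c in range(col, min(col + col_span, cols)):
                grid[r][c] = cell["text"]
    return grid


table = {
    "row_count": "2",
    "column_count": "2",
    "cells": [
        {"row_index": "0", "column_index": "0", "column_span": "2", "row_span": "1", "text": "Item"},
        {"row_index": "1", "column_index": "0", "text": "A"},
        {"row_index": "1", "column_index": "1", "text": "B"},
    ],
}
grid = table_to_grid(table)
```

Positions that no cell covers stay None, which makes gaps in the recognized table easy to spot.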
