Vision OCR API, gRPC: TextRecognitionService.Recognize
Sends an image for text recognition.
gRPC request
rpc Recognize (RecognizeTextRequest) returns (stream RecognizeTextResponse)
RecognizeTextRequest
{
  // Includes only one of the fields `content`
  "content": "bytes",
  // end of the list of possible fields
  "mime_type": "string",
  "language_codes": [
    "string"
  ],
  "model": "string"
}
| Field | Description |
| --- | --- |
| content | `bytes` Bytes with data. Includes only one of the fields `content`. |
| mime_type | `string` Specification of the MIME type of the content. |
| language_codes[] | `string` List of languages to recognize text in. |
| model | `string` Model to use for text detection. |
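The request fields above can be assembled as in the following minimal sketch. It builds a plain dictionary shaped like `RecognizeTextRequest` rather than a generated protobuf message; the helper name `build_recognize_request`, the `image/jpeg` MIME type, and the `en` language code are illustrative assumptions, while `markdown` is one of the model names mentioned in this reference.

```python
from typing import Any, Dict, List


def build_recognize_request(
    content: bytes,
    mime_type: str,
    language_codes: List[str],
    model: str,
) -> Dict[str, Any]:
    """Build a dict with the same field names as RecognizeTextRequest.

    Hypothetical helper: a real client would fill the generated
    protobuf message instead of a dict.
    """
    if not content:
        raise ValueError("content must contain the image bytes")
    return {
        "content": content,
        "mime_type": mime_type,
        "language_codes": list(language_codes),
        "model": model,
    }


# Placeholder bytes stand in for a real image file.
request = build_recognize_request(b"\xff\xd8\xff", "image/jpeg", ["en"], "markdown")
```

With generated gRPC stubs, the same four values would typically be passed as keyword arguments to the request message constructor.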
RecognizeTextResponse
{
  "text_annotation": {
    "width": "int64",
    "height": "int64",
    "blocks": [
      {
        "bounding_box": {
          "vertices": [
            {
              "x": "int64",
              "y": "int64"
            }
          ]
        },
        "lines": [
          {
            "bounding_box": {
              "vertices": [
                {
                  "x": "int64",
                  "y": "int64"
                }
              ]
            },
            "text": "string",
            "words": [
              {
                "bounding_box": {
                  "vertices": [
                    {
                      "x": "int64",
                      "y": "int64"
                    }
                  ]
                },
                "text": "string",
                "entity_index": "int64",
                "text_segments": [
                  {
                    "start_index": "int64",
                    "length": "int64"
                  }
                ]
              }
            ],
            "text_segments": [
              {
                "start_index": "int64",
                "length": "int64"
              }
            ],
            "orientation": "Angle"
          }
        ],
        "languages": [
          {
            "language_code": "string"
          }
        ],
        "text_segments": [
          {
            "start_index": "int64",
            "length": "int64"
          }
        ],
        "layout_type": "LayoutType"
      }
    ],
    "entities": [
      {
        "name": "string",
        "text": "string"
      }
    ],
    "tables": [
      {
        "bounding_box": {
          "vertices": [
            {
              "x": "int64",
              "y": "int64"
            }
          ]
        },
        "row_count": "int64",
        "column_count": "int64",
        "cells": [
          {
            "bounding_box": {
              "vertices": [
                {
                  "x": "int64",
                  "y": "int64"
                }
              ]
            },
            "row_index": "int64",
            "column_index": "int64",
            "column_span": "int64",
            "row_span": "int64",
            "text": "string",
            "text_segments": [
              {
                "start_index": "int64",
                "length": "int64"
              }
            ]
          }
        ]
      }
    ],
    "full_text": "string",
    "rotate": "Angle",
    "markdown": "string",
    "pictures": [
      {
        "bounding_box": {
          "vertices": [
            {
              "x": "int64",
              "y": "int64"
            }
          ]
        },
        "score": "double"
      }
    ]
  },
  "page": "int64"
}
| Field | Description |
| --- | --- |
| text_annotation | Recognized text blocks on the page, or text from entities. |
| page | `int64` Page number in the PDF file. |
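Because the RPC returns a stream of `RecognizeTextResponse` messages, each carrying its own `page` value, a client usually iterates over the stream and groups results by page. The sketch below assumes each response has already been converted to a plain dictionary with the field names shown above; `collect_pages` is a hypothetical helper, and the sample page numbers are arbitrary. The `int()` coercion is there because the proto3 JSON mapping encodes `int64` values as strings.

```python
from typing import Any, Dict, Iterable


def collect_pages(responses: Iterable[Dict[str, Any]]) -> Dict[int, str]:
    """Map each page number to the full text recognized on that page."""
    pages: Dict[int, str] = {}
    for response in responses:
        # A missing "page" key is treated as 0, the proto3 default.
        page = int(response.get("page", 0))
        annotation = response.get("text_annotation", {})
        pages[page] = annotation.get("full_text", "")
    return pages


# A generator stands in for the gRPC response stream.
sample_stream = iter([
    {"text_annotation": {"full_text": "Invoice"}, "page": "1"},
    {"text_annotation": {"full_text": "Terms"}, "page": "2"},
])
pages = collect_pages(sample_stream)
```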
TextAnnotation
| Field | Description |
| --- | --- |
| width | `int64` Page width in pixels. |
| height | `int64` Page height in pixels. |
| blocks[] | Recognized text blocks on this page. |
| entities[] | Recognized entities. |
| tables[] | Recognized tables. |
| full_text | `string` Full text recognized from the image. |
| rotate | `enum Angle` Angle of image rotation. |
| markdown | `string` Full Markdown text (without embedded pictures) recognized from the image. Available only with the `markdown` and `math-markdown` models. |
| pictures[] | List of picture locations in the image. |
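Since `markdown` is filled only by some models while `full_text` is the general text field, a client may want a fallback between the two. The following is a minimal sketch under the assumption that the annotation is available as a plain dictionary and that an absent or empty `markdown` value means the model did not produce Markdown; `preferred_text` is a hypothetical helper.

```python
from typing import Any, Dict


def preferred_text(annotation: Dict[str, Any]) -> str:
    """Return the Markdown output when present, otherwise full_text."""
    markdown = annotation.get("markdown", "")
    return markdown if markdown else annotation.get("full_text", "")


plain_annotation = {"full_text": "Total: 42"}
rich_annotation = {"full_text": "Total: 42", "markdown": "**Total:** 42"}
```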
Block
| Field | Description |
| --- | --- |
| bounding_box | Area on the page where the text block is located. |
| lines[] | Recognized lines in this block. |
| languages[] | A list of detected languages. |
| text_segments[] | Block position in the `full_text` string. |
| layout_type | `enum LayoutType` Block layout type. |
Polygon
| Field | Description |
| --- | --- |
| vertices[] | The bounding polygon vertices. |
Vertex
| Field | Description |
| --- | --- |
| x | `int64` X coordinate in pixels. |
| y | `int64` Y coordinate in pixels. |
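A `bounding_box` is a polygon of pixel vertices, so cropping or drawing often starts by reducing it to an axis-aligned rectangle. The sketch below assumes the polygon is a plain dictionary with a `vertices` list as in the JSON above; `bounding_rect` is a hypothetical helper, and missing coordinates are treated as 0, the proto3 default.

```python
from typing import Any, Dict, Tuple


def bounding_rect(polygon: Dict[str, Any]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) enclosing all polygon vertices."""
    vertices = polygon.get("vertices", [])
    if not vertices:
        raise ValueError("polygon has no vertices")
    # int() handles int64 values that arrive as strings in JSON form.
    xs = [int(v.get("x", 0)) for v in vertices]
    ys = [int(v.get("y", 0)) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


sample_polygon = {
    "vertices": [
        {"x": "10", "y": "20"},
        {"x": "110", "y": "20"},
        {"x": "110", "y": "60"},
        {"x": "10", "y": "60"},
    ]
}
rect = bounding_rect(sample_polygon)
```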
Line
| Field | Description |
| --- | --- |
| bounding_box | Area on the page where the line is located. |
| text | `string` Recognized text. |
| words[] | Recognized words. |
| text_segments[] | Line position in the `full_text` string. |
| orientation | `enum Angle` Angle of line rotation. |
Word
| Field | Description |
| --- | --- |
| bounding_box | Area on the page where the word is located. |
| text | `string` Recognized word value. |
| entity_index | `int64` ID of the recognized word in the `entities` array. |
| text_segments[] | Word position in the `full_text` string. |
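The annotation nests words inside lines inside blocks. A short sketch of walking that hierarchy, assuming the annotation is a plain dictionary shaped like the JSON above (`iter_words` is a hypothetical helper, and the sample data is invented for illustration):

```python
from typing import Any, Dict, Iterator, Tuple


def iter_words(annotation: Dict[str, Any]) -> Iterator[Tuple[int, int, str]]:
    """Yield (block number, line number, word text) in document order."""
    for block_no, block in enumerate(annotation.get("blocks", [])):
        for line_no, line in enumerate(block.get("lines", [])):
            for word in line.get("words", []):
                yield block_no, line_no, word.get("text", "")


sample_annotation = {
    "blocks": [
        {
            "lines": [
                {"text": "Hello world", "words": [{"text": "Hello"}, {"text": "world"}]},
                {"text": "Bye", "words": [{"text": "Bye"}]},
            ]
        }
    ]
}
words = list(iter_words(sample_annotation))
```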
TextSegments
| Field | Description |
| --- | --- |
| start_index | `int64` Start character position in the `full_text` string. |
| length | `int64` Text segment length. |
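Each text segment points back into `full_text` by start position and length. The sketch below recovers the referenced text, assuming that positions count characters in the same way Python string indexing does; the reference does not state the unit, so verify this against real responses. `segment_text` is a hypothetical helper.

```python
from typing import Any, Dict, List


def segment_text(full_text: str, segments: List[Dict[str, Any]]) -> str:
    """Concatenate the full_text slices referenced by the segments."""
    parts = []
    for segment in segments:
        start = int(segment.get("start_index", 0))
        length = int(segment.get("length", 0))
        parts.append(full_text[start:start + length])
    return "".join(parts)


sample_full_text = "Hello world"
sample_segments = [{"start_index": "6", "length": "5"}]
extracted = segment_text(sample_full_text, sample_segments)
```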
DetectedLanguage
| Field | Description |
| --- | --- |
| language_code | `string` Detected language code. |
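Detected languages are reported per block, so a page-level summary has to be aggregated by the client. A minimal sketch, assuming a plain-dictionary annotation; `language_histogram` is a hypothetical helper and the language codes in the sample are illustrative:

```python
from collections import Counter
from typing import Any, Dict


def language_histogram(annotation: Dict[str, Any]) -> Counter:
    """Count how many blocks report each detected language code."""
    counts: Counter = Counter()
    for block in annotation.get("blocks", []):
        for language in block.get("languages", []):
            code = language.get("language_code")
            if code:
                counts[code] += 1
    return counts


sample_blocks = {
    "blocks": [
        {"languages": [{"language_code": "en"}]},
        {"languages": [{"language_code": "en"}, {"language_code": "de"}]},
    ]
}
histogram = language_histogram(sample_blocks)
```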
Entity
| Field | Description |
| --- | --- |
| name | `string` Entity name. |
| text | `string` Recognized entity text. |
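A word's `entity_index` refers to a position in the `entities` array. The sketch below resolves that reference defensively. How a word with no entity is marked is not stated in this reference, so the code assumes that a missing key or an out-of-range value means "no entity"; `entity_for_word` and the sample entity name are hypothetical.

```python
from typing import Any, Dict, List, Optional


def entity_for_word(
    word: Dict[str, Any], entities: List[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Return the entity a word points to, or None if there is none."""
    raw_index = word.get("entity_index")
    if raw_index is None:
        return None
    index = int(raw_index)
    # The explicit range check also rejects negative values, which
    # Python would otherwise treat as indexing from the end.
    if 0 <= index < len(entities):
        return entities[index]
    return None


sample_entities = [{"name": "date", "text": "2024-01-31"}]
sample_word = {"text": "2024-01-31", "entity_index": "0"}
matched = entity_for_word(sample_word, sample_entities)
```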
Table
| Field | Description |
| --- | --- |
| bounding_box | Area on the page where the table is located. |
| row_count | `int64` Number of rows in the table. |
| column_count | `int64` Number of columns in the table. |
| cells[] | Table cells. |
TableCell
| Field | Description |
| --- | --- |
| bounding_box | Area on the page where the table cell is located. |
| row_index | `int64` Row index. |
| column_index | `int64` Column index. |
| column_span | `int64` Column span. |
| row_span | `int64` Row span. |
| text | `string` Text in the cell. |
| text_segments[] | Table cell position in the `full_text` string. |
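The table and cell fields are enough to rebuild a two-dimensional grid. The sketch below assumes zero-based `row_index` and `column_index` values and treats a missing or zero span as 1; neither convention is stated in this reference. A spanning cell's text is repeated in every grid position it covers, which is one possible design choice. `table_to_grid` is a hypothetical helper.

```python
from typing import Any, Dict, List


def table_to_grid(table: Dict[str, Any]) -> List[List[str]]:
    """Place each cell's text into a row_count x column_count grid."""
    rows = int(table.get("row_count", 0))
    cols = int(table.get("column_count", 0))
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for cell in table.get("cells", []):
        row = int(cell.get("row_index", 0))
        col = int(cell.get("column_index", 0))
        row_span = max(1, int(cell.get("row_span", 1)))
        col_span = max(1, int(cell.get("column_span", 1)))
        # Clamp to the grid so malformed spans cannot raise IndexError.
        for r in range(row, min(row + row_span, rows)):
            for c in range(col, min(col + col_span, cols)):
                grid[r][c] = cell.get("text", "")
    return grid


sample_table = {
    "row_count": 2,
    "column_count": 2,
    "cells": [
        {"row_index": 0, "column_index": 0, "column_span": 2, "row_span": 1, "text": "Header"},
        {"row_index": 1, "column_index": 0, "column_span": 1, "row_span": 1, "text": "a"},
        {"row_index": 1, "column_index": 1, "column_span": 1, "row_span": 1, "text": "b"},
    ],
}
grid = table_to_grid(sample_table)
```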
Picture
| Field | Description |
| --- | --- |
| bounding_box | Area on the page where the picture is located. |
| score | `double` Confidence score of the picture location. |
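Picture detections can be filtered by their confidence score before further processing. The sketch below assumes that a higher `score` means higher confidence and uses an arbitrary threshold of 0.5; the score range is not stated in this reference. `confident_pictures` is a hypothetical helper.

```python
from typing import Any, Dict, List


def confident_pictures(
    pictures: List[Dict[str, Any]], min_score: float = 0.5
) -> List[Dict[str, Any]]:
    """Keep only pictures whose score is at least min_score."""
    return [p for p in pictures if float(p.get("score", 0.0)) >= min_score]


sample_pictures = [
    {"bounding_box": {"vertices": []}, "score": 0.9},
    {"bounding_box": {"vertices": []}, "score": 0.2},
]
kept = confident_pictures(sample_pictures)
```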