Vision OCR API, gRPC: TextRecognitionAsyncService
Updated on October 3, 2024.
A set of methods for managing operations for asynchronous API requests.
Call | Description |
---|---|
Recognize | Sends an image for asynchronous text recognition. |
GetRecognition | Gets the recognition results. |
Calls TextRecognitionAsyncService
Recognize
Sends an image for asynchronous text recognition.
rpc Recognize (RecognizeTextRequest) returns (operation.Operation)
Response of Operation:
Operation.response: google.protobuf.Empty
RecognizeTextRequest
Field | Description |
---|---|
source | oneof: content |
content | bytes Bytes with data |
mime_type | string Specification of the MIME type of the content. |
language_codes[] | string List of languages to recognize in the text. Specified in ISO 639-1 format (for example, ru). |
model | string Model to use for text detection. The maximum string length in characters is 50. |
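As a rough illustration of the fields above, the following sketch assembles them as a plain Python dictionary and checks the documented 50-character limit on `model`. The helper name `build_recognize_request` and the dictionary layout are assumptions made for this example; a real client would populate the generated `RecognizeTextRequest` message instead.

```python
from typing import Any, Dict, List

MAX_MODEL_LENGTH = 50  # documented maximum length of the `model` string


def build_recognize_request(
    content: bytes,
    mime_type: str,
    language_codes: List[str],
    model: str,
) -> Dict[str, Any]:
    """Hypothetical helper: mirror the RecognizeTextRequest fields as a dict."""
    if not content:
        raise ValueError("content must contain the image bytes")
    if len(model) > MAX_MODEL_LENGTH:
        raise ValueError("model must be at most 50 characters long")
    return {
        "content": content,                      # the `source` oneof
        "mime_type": mime_type,
        "language_codes": list(language_codes),  # ISO 639-1 codes
        "model": model,
    }
```

The values passed for `mime_type` and `model` are up to the caller; this sketch only enforces the length limit stated in the table.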
Operation
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
created_at | google.protobuf.Timestamp Creation timestamp. |
created_by | string ID of the user or service account who initiated the operation. |
modified_at | google.protobuf.Timestamp The time when the Operation resource was last modified. |
done | bool If the value is false, the operation is still in progress. If true, the operation is completed, and either error or response is available. |
metadata | google.protobuf.Any Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
result | oneof: error or response The operation result. If done == false and no failure was detected, neither error nor response is set. If done == false and a failure was detected, error is set. If done == true, exactly one of error or response is set. |
error | google.rpc.Status The error result of the operation in case of failure or cancellation. |
response | google.protobuf.Any The operation result if the operation finished successfully. |
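The rules for `done`, `error`, and `response` can be sketched as a small classifier. `OperationView` is a hypothetical stand-in that carries only the fields needed here; it is not the generated `Operation` class, and the state names are chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class OperationView:
    """Hypothetical stand-in for the Operation fields used in this sketch."""
    id: str
    done: bool = False
    error: Optional[Any] = None
    response: Optional[Any] = None


def operation_state(op: OperationView) -> str:
    """Classify an operation using the documented done/result rules."""
    if op.error is not None:
        # Per the table, error may be set while done is still false.
        return "failed"
    if not op.done:
        return "in_progress"
    # done is true and no error is set, so response holds the result.
    return "succeeded"
```

A polling loop would typically keep re-reading the operation while the state is `in_progress` and call GetRecognition once it is `succeeded`.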
GetRecognition
Gets the recognition results.
rpc GetRecognition (GetRecognitionRequest) returns (stream RecognizeTextResponse)
GetRecognitionRequest
Field | Description |
---|---|
operation_id | string Required. Operation ID of async recognition request. The maximum string length in characters is 50. |
RecognizeTextResponse
Field | Description |
---|---|
text_annotation | TextAnnotation Recognized text blocks in page or text from entities. |
page | int64 Page number in PDF file. |
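Because GetRecognition returns a stream of `RecognizeTextResponse` messages, a client usually iterates over the stream and groups the results by `page`. The sketch below uses plain dictionaries in place of the generated messages, and `collect_pages` is a hypothetical helper name.

```python
from typing import Dict, Iterable


def collect_pages(responses: Iterable[dict]) -> Dict[int, str]:
    """Map each page number to the full_text of its text_annotation."""
    pages: Dict[int, str] = {}
    for response in responses:
        # `page` is a top-level field; `full_text` lives in text_annotation.
        pages[response["page"]] = response["text_annotation"]["full_text"]
    return pages
```

Any iterable works as input, so the same function could consume a real gRPC response iterator after converting each message to this dictionary shape.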
TextAnnotation
Field | Description |
---|---|
width | int64 Page width in pixels. |
height | int64 Page height in pixels. |
blocks[] | Block Recognized text blocks in this page. |
entities[] | Entity Recognized entities. |
tables[] | Table Recognized tables. |
full_text | string Full text recognized from image. |
rotate | enum Angle Angle of image rotation. |
Block
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the text block is located. |
lines[] | Line Recognized lines in this block. |
languages[] | DetectedLanguage A list of detected languages. |
text_segments[] | TextSegments Position of the block in the full_text string. |
DetectedLanguage
Field | Description |
---|---|
language_code | string Detected language code. |
Polygon
Field | Description |
---|---|
vertices[] | Vertex The bounding polygon vertices. |
Vertex
Field | Description |
---|---|
x | int64 X coordinate in pixels. |
y | int64 Y coordinate in pixels. |
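A `Polygon` is a list of `Vertex` points in pixel coordinates, so a common first step is reducing it to an axis-aligned rectangle. The sketch below represents each vertex as an `(x, y)` tuple; `bounding_rect` is a hypothetical helper and not part of the API.

```python
from typing import List, Tuple


def bounding_rect(vertices: List[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) enclosing all polygon vertices."""
    if not vertices:
        raise ValueError("polygon has no vertices")
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)
```

Taking the minimum and maximum of each coordinate keeps the result correct regardless of vertex order.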
Line
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the line is located. |
text | string Recognized text. |
words[] | Word Recognized words. |
text_segments[] | TextSegments Position of the line in the full_text string. |
orientation | enum Angle Angle of line rotation. |
Word
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the word is located. |
text | string Recognized word value. |
entity_index | int64 ID of the recognized word in the entities array. |
text_segments[] | TextSegments Position of the word in the full_text string. |
TextSegments
Field | Description |
---|---|
start_index | int64 Start character position in the full_text string. |
length | int64 Text segment length. |
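Each `TextSegments` entry points back into `full_text`, so the text of a block, line, word, or cell can be recovered by slicing. This sketch assumes that `start_index` and `length` count characters in the same way Python indexes strings, which the reference does not state explicitly; `segment_text` is a hypothetical helper.

```python
from typing import Iterable, Tuple


def segment_text(full_text: str, segments: Iterable[Tuple[int, int]]) -> str:
    """Join the slices of full_text described by (start_index, length) pairs."""
    return "".join(
        full_text[start:start + length] for start, length in segments
    )
```

For example, with `full_text = "Hello world"`, the segment `(6, 5)` selects `"world"`.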
Entity
Field | Description |
---|---|
name | string Entity name. |
text | string Recognized entity text. |
Table
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the table is located. |
row_count | int64 Number of rows in table. |
column_count | int64 Number of columns in table. |
cells[] | TableCell Table cells. |
TableCell
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the table cell is located. |
row_index | int64 Row index. |
column_index | int64 Column index. |
column_span | int64 Column span. |
row_span | int64 Row span. |
text | string Text in cell. |
text_segments[] | TextSegments Position of the table cell in the full_text string. |
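The `Table` and `TableCell` fields are enough to rebuild a two-dimensional grid. The sketch below uses dictionaries in place of the generated messages and makes two assumptions the reference does not confirm: row and column indexes are zero-based, and a missing or zero span means the cell occupies a single slot. `build_grid` is a hypothetical helper.

```python
from typing import Dict, List, Optional


def build_grid(
    row_count: int, column_count: int, cells: List[Dict]
) -> List[List[Optional[str]]]:
    """Place each cell's text into every grid slot covered by its spans."""
    grid: List[List[Optional[str]]] = [
        [None] * column_count for _ in range(row_count)
    ]
    for cell in cells:
        # Assumption: treat a missing or zero span as a span of one.
        row_span = max(cell.get("row_span", 1), 1)
        column_span = max(cell.get("column_span", 1), 1)
        row_start = cell["row_index"]
        column_start = cell["column_index"]
        for r in range(row_start, row_start + row_span):
            for c in range(column_start, column_start + column_span):
                # Ignore slots that fall outside the declared table size.
                if r < row_count and c < column_count:
                    grid[r][c] = cell["text"]
    return grid
```

Slots that no cell covers stay `None`, which makes gaps in the recognized table easy to spot.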