Asynchronous recognition API v2
To use the API v2, you will need:
- Yandex Object Storage bucket to which you will upload your audio file for recognition.
- Service account with the ai.speechkit-stt.user and storage.uploader roles needed for accessing SpeechKit and Object Storage.
- IAM token or API key for authentication.
For more information on getting started, see How to asynchronously recognize pre-recorded audio.
Warning
Please note that you can only recognize audio files asynchronously under a service account. Do not use any other accounts in Yandex Cloud for that.
The asynchronous recognition service for the API v2 is located at: transcribe.api.cloud.yandex.net/speech/stt/v2/longRunningRecognize
Sending a file for recognition
Parameters in the request body
The request body structure is as follows:
{
"config": {
"specification": {
"languageCode": "string",
"model": "string",
"profanityFilter": boolean,
"literature_text": boolean,
"audioEncoding": "string",
"sampleRateHertz": integer,
"audioChannelCount": integer,
"rawResults": boolean
}
},
"audio": {
"uri": "string"
}
}
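Parameter | Type |
---|---|
config | object |
config.specification | object |
config.specification.languageCode | string |
config.specification.model | string |
config.specification.profanityFilter | boolean |
config.specification.literature_text | boolean |
config.specification.audioEncoding | string |
config.specification.sampleRateHertz | integer (int64) |
config.specification.audioChannelCount | integer (int64) |
config.specification.rawResults | boolean |
audio.uri | string |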
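The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not a reference client: the https:// scheme, the Bearer and Api-Key authorization schemes, the helper names, and the sample field values (ru-RU, LINEAR16_PCM, 48000, and the placeholder storage link) are assumptions that do not come from this page.

```python
import json
import urllib.request

# Endpoint taken from the text above; the https:// scheme is an assumption.
STT_URL = "https://transcribe.api.cloud.yandex.net/speech/stt/v2/longRunningRecognize"


def build_request_body(uri, language_code, **spec):
    """Return the request body as a dict; extra specification fields go in **spec."""
    specification = {"languageCode": language_code}
    specification.update(spec)
    return {"config": {"specification": specification}, "audio": {"uri": uri}}


def auth_header(credential, is_api_key=False):
    """Build the Authorization header value (the scheme names are an assumption)."""
    scheme = "Api-Key" if is_api_key else "Bearer"
    return f"{scheme} {credential}"


def send_for_recognition(body, credential, is_api_key=False):
    """POST the body and return the parsed Operation object."""
    request = urllib.request.Request(
        STT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": auth_header(credential, is_api_key),
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder link and sample values, for illustration only.
body = build_request_body(
    "https://storage.yandexcloud.net/<bucket>/<file>",
    "ru-RU",
    audioEncoding="LINEAR16_PCM",
    sampleRateHertz=48000,
    audioChannelCount=1,
)
print(json.dumps(body, indent=2))
```

Calling send_for_recognition(body, "<IAM token>") would submit the file; the id field of the returned object is the operation ID used in the next step.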
Response
If your request is written correctly, the service returns the Operation object with the recognition operation ID (id):
{
"done": false,
"id": "e03sup6d5h1q********",
"createdAt": "2019-04-21T22:49:29Z",
"createdBy": "ajes08feato8********",
"modifiedAt": "2019-04-21T22:49:29Z"
}
Use this ID at the next step.
Getting recognition results
To check the operation status and get the recognition results, submit a request at operation.api.cloud.yandex.net.
Monitor the recognition results using the obtained ID. The number of result monitoring requests is limited, so take the recognition speed into account when scheduling them: it takes about 10 seconds to recognize 1 minute of a single-channel audio file.
Warning
Recognition results are stored on the server for 3 days. During this period, you can request the recognition results using the obtained ID.
Path parameters
Parameter | Description |
---|---|
operationId | Operation ID received when sending the recognition request |
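A polling loop that respects the request limit might look like the sketch below. It waits for the estimated recognition time (about 10 seconds per minute of single-channel audio) before the first check. The /operations/ path segment, the function names, the polling interval, and the retry cap are assumptions made for illustration; only the host name and the operationId parameter come from this page.

```python
import json
import time
import urllib.request

# Assumed URL layout: the host comes from the text above, the /operations/ path does not.
OPERATION_URL = "https://operation.api.cloud.yandex.net/operations/{operation_id}"


def estimate_wait_seconds(audio_seconds):
    """Rough recognition time for single-channel audio: about 10 s per minute."""
    return audio_seconds / 60 * 10


def fetch_operation(operation_id, authorization):
    """GET the Operation object; `authorization` is the full header value."""
    request = urllib.request.Request(
        OPERATION_URL.format(operation_id=operation_id),
        headers={"Authorization": authorization},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(operation_id, authorization, audio_seconds,
                    interval=10.0, max_polls=30,
                    fetch=fetch_operation, sleep=time.sleep):
    """Wait the estimated time first, then poll until `done` is true."""
    sleep(estimate_wait_seconds(audio_seconds))
    for _ in range(max_polls):
        operation = fetch(operation_id, authorization)
        if operation.get("done"):
            return operation
        sleep(interval)
    raise TimeoutError(f"operation {operation_id} not done after {max_polls} polls")
```

The fetch and sleep parameters are injectable so the loop can be exercised without network access or real delays.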
Response
The Operation object is returned in response to your request. Response example:
{
"done": true,
"response": {
"@type": "type.googleapis.com/yandex.cloud.ai.stt.v2.LongRunningRecognitionResponse",
"chunks": [
{
"alternatives": [
{
"words": [
{
"startTime": "0.879999999s",
"endTime": "1.159999992s",
"word": "when",
"confidence": 1
},
{
"startTime": "1.219999995s",
"endTime": "1.539999988s",
"word": "writing",
"confidence": 1
},
...
],
"text": "when writing The Hobbit, Tolkien referred to the Norse mythology of the Old English poem Beowulf",
"confidence": 1
}
],
"channelTag": "1"
},
...
]
},
"id": "e03sup6d5h1q********",
"createdAt": "2019-04-21T22:49:29Z",
"createdBy": "ajes08feato8********",
"modifiedAt": "2019-04-21T22:49:36Z"
}
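Parameter | Type |
---|---|
done | boolean |
response | object |
response.@type | string |
response.chunks[] | array |
response.chunks[].alternatives[] | array |
response.chunks[].alternatives[].words[] | array |
response.chunks[].alternatives[].words[].startTime | string |
response.chunks[].alternatives[].words[].endTime | string |
response.chunks[].alternatives[].words[].word | string |
response.chunks[].alternatives[].words[].confidence | integer (int64) |
response.chunks[].alternatives[].text | string |
response.chunks[].alternatives[].confidence | integer (int64) |
response.chunks[].channelTag | string |
id | string |
createdAt | google.protobuf.Timestamp |
createdBy | string |
modifiedAt | google.protobuf.Timestamp |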
For more information about the response format and codes, see Response format.
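Once done is true, the response can be unpacked along the lines of the following sketch. It groups recognized text by channelTag and converts the startTime and endTime strings to seconds. The function names are hypothetical, and using only the first entry of alternatives is an assumption about which hypothesis to prefer; the sample data is a shortened copy of the example response above.

```python
def parse_seconds(value):
    """Convert a duration string such as "0.879999999s" to a float of seconds."""
    return float(value.rstrip("s"))


def transcript_by_channel(operation):
    """Join the first alternative of each chunk, grouped by channelTag."""
    channels = {}
    for chunk in operation.get("response", {}).get("chunks", []):
        alternatives = chunk.get("alternatives", [])
        if not alternatives:
            continue
        tag = chunk.get("channelTag", "")
        channels.setdefault(tag, []).append(alternatives[0]["text"])
    return {tag: " ".join(parts) for tag, parts in channels.items()}


def word_timings(operation):
    """Yield (word, start_seconds, end_seconds) from each chunk's first alternative."""
    for chunk in operation.get("response", {}).get("chunks", []):
        alternatives = chunk.get("alternatives", [])
        if not alternatives:
            continue
        for item in alternatives[0].get("words", []):
            yield (item["word"],
                   parse_seconds(item["startTime"]),
                   parse_seconds(item["endTime"]))


# Shortened copy of the example response, used as sample input.
sample = {
    "done": True,
    "response": {
        "chunks": [
            {
                "alternatives": [
                    {
                        "words": [
                            {"startTime": "0.879999999s", "endTime": "1.159999992s",
                             "word": "when", "confidence": 1},
                            {"startTime": "1.219999995s", "endTime": "1.539999988s",
                             "word": "writing", "confidence": 1},
                        ],
                        "text": "when writing",
                        "confidence": 1,
                    }
                ],
                "channelTag": "1",
            }
        ]
    },
}

print(transcript_by_channel(sample))
for word, start, end in word_timings(sample):
    print(word, start, end)
```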