Synchronous Recognition API
Updated at August 29, 2024
With the synchronous recognition API, you can transcribe prepared audio files that meet the following limits:
- Maximum file size: 1 MB
- Maximum duration: 30 seconds
- Maximum number of audio channels: 1
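A minimal sketch of a client-side check against the limits listed above, so that an oversized file can be rejected before it is sent. The function name `check_limits` and the constants are hypothetical, and reading "1 MB" as 1,000,000 bytes is an assumption (the service may count it as 1,048,576 bytes).

```python
from typing import List

# Limits taken from the list above.
# Assumption: "1 MB" is read as 10**6 bytes; the service may mean 2**20.
MAX_SIZE_BYTES = 1_000_000
MAX_DURATION_S = 30.0
MAX_CHANNELS = 1


def check_limits(size_bytes: int, duration_s: float, channels: int) -> List[str]:
    """Return a list of limit violations; an empty list means the audio fits."""
    problems = []
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file size {size_bytes} B exceeds {MAX_SIZE_BYTES} B")
    if duration_s > MAX_DURATION_S:
        problems.append(f"duration {duration_s} s exceeds {MAX_DURATION_S} s")
    if channels > MAX_CHANNELS:
        problems.append(f"{channels} channels exceed the limit of {MAX_CHANNELS}")
    return problems
```

The caller is expected to obtain the size, duration, and channel count from its own audio tooling; this sketch only compares them with the documented limits.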
The synchronous recognition service is located at: stt.api.cloud.yandex.net/speech/v1/stt:recognize
Query parameters
Parameter | Type | Description |
---|---|---|
lang | string | Language to perform recognition for. See the list of available languages in the model description. The default value is ru-RU (Russian). |
topic | string | Language model to use for recognition. The more closely the model matches, the better the recognition result. You can specify only one model per request. Acceptable values depend on the selected language. The default value is general. |
profanityFilter | boolean | Controls the profanity filter applied to recognized speech. Acceptable values include: |
rawResults | boolean | Flag that toggles spelling out numbers. true: spell out. false (default): write as numbers. |
format | string | Format of the submitted audio. Acceptable values include: |
sampleRateHertz | string | Sampling rate of the submitted audio. Used if format is set to lpcm. Acceptable values include: |
folderId | string | ID of a folder that you have access to. Required for authorization with a user account (see Authentication with the SpeechKit API). Do not use this field if you make the request on behalf of a service account. The maximum string length is 50 characters. |
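The query parameters above could be assembled into a request URL roughly as follows. This is a sketch under stated assumptions: the `https://` scheme, the helper name `build_url`, and the lowercase `true` encoding for the boolean flag are not given in this section.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint from this page; the https:// scheme is an assumption.
ENDPOINT = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_url(lang: str = "ru-RU", topic: str = "general",
              folder_id: Optional[str] = None, raw_results: bool = False) -> str:
    """Build the recognition URL from the documented query parameters."""
    params = {"lang": lang, "topic": topic}
    if raw_results:
        # Assumption: boolean flags are sent as lowercase "true"/"false".
        params["rawResults"] = "true"
    if folder_id is not None:
        # folderId applies only when authorizing with a user account.
        if len(folder_id) > 50:
            raise ValueError("folderId must not exceed 50 characters")
        params["folderId"] = folder_id
    return f"{ENDPOINT}?{urlencode(params)}"
```

Calling `build_url()` with no arguments spells out the documented defaults for `lang` and `topic`; when making a request on behalf of a service account, `folder_id` is left unset.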
Parameters in the request body
The request body has to contain the binary content of an audio file.
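As an illustration, a request carrying the raw audio bytes could be prepared with Python's standard library as sketched below. The POST method and the `Authorization: Bearer <token>` header are assumptions not stated in this section (see Authentication with the SpeechKit API for the actual scheme), and `build_request` is a hypothetical helper name.

```python
import urllib.request


def build_request(url: str, audio: bytes, iam_token: str) -> urllib.request.Request:
    """Prepare (but do not send) a request whose body is the raw audio bytes."""
    return urllib.request.Request(
        url,
        data=audio,      # binary content of the audio file, sent as-is
        method="POST",   # assumption: the endpoint expects POST
        # Assumption: token-based authorization via a Bearer header.
        headers={"Authorization": f"Bearer {iam_token}"},
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)`, which returns a response object whose `read()` method yields the response body.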
Response
The recognized text is returned in the result field of the response.
{
"result": <recognized_text>
}
For more information about the response format and codes, see Response format.
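A short sketch of reading the recognized text out of a response body shaped like the one above. The helper name `extract_result` is hypothetical, and error responses are not handled here because their format is described elsewhere.

```python
import json


def extract_result(body: bytes) -> str:
    """Parse the JSON response body and return the recognized text."""
    payload = json.loads(body)
    if "result" not in payload:
        # Error responses use a different shape; see Response format.
        raise ValueError("response has no 'result' field")
    return payload["result"]
```

For a body such as `b'{"result": "hello world"}'` (a made-up sample), the function returns the string stored under `result`.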