Supported languages and recognition models
A recognition model is a model that is trained to recognize speech in a specific language. The models are trained on datasets generated by Yandex services and applications. This allows us to continually improve speech recognition quality.
The main supported model for each type of recognition is the `general` model. It recognizes speech on any topic in a given language, including short and long utterances, names, addresses, dates, and numbers.
Version tags
Three versions of the `general` model are available at the same time. You can select the desired version by tag:
- `general`: The main version of the model.
- `general:rc`: The release candidate version, which you can test.
- `general:deprecated`: The previous version of the model.
Note
Versions available under the `general:deprecated` tag stop being supported when new models are released. SpeechKit guarantees two weeks of support for the previous version after the version under the `general` tag is updated. You can find the list of updates in Yandex SpeechKit release notes: Speech recognition.
In addition, the `deferred-general` tag is available for asynchronous recognition. Learn more about asynchronous recognition modes.
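The tags listed in this section can be checked on the client side before a request is built. The following is a minimal sketch, not part of the SpeechKit API: the helper name `check_model_tag` and its `asynchronous` flag are hypothetical, and treating `deferred-general` as valid only for asynchronous recognition is an assumption drawn from the paragraph above.

```python
# Tags taken from this section. Nothing here calls SpeechKit itself.
GENERAL_TAGS = {"general", "general:rc", "general:deprecated"}

# Assumption: this tag is accepted only in asynchronous recognition.
ASYNC_ONLY_TAGS = {"deferred-general"}


def check_model_tag(tag: str, asynchronous: bool = False) -> str:
    """Return the tag unchanged if it is listed in this section.

    Raises ValueError for an unknown tag, or for an asynchronous-only
    tag used outside asynchronous recognition.
    """
    if tag in GENERAL_TAGS:
        return tag
    if tag in ASYNC_ONLY_TAGS:
        if not asynchronous:
            raise ValueError(f"{tag!r} is available for asynchronous recognition only")
        return tag
    raise ValueError(f"unknown model tag: {tag!r}")
```

For example, `check_model_tag("general:rc")` returns the tag as is, while `check_model_tag("deferred-general")` raises `ValueError` unless `asynchronous=True` is passed. Because `general:deprecated` is supported for a limited time after each update, a check like this may be a convenient place to log a warning when that tag is in use.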
Supported recognition languages
Specify the language in ISO 639-1 format, using one of the following codes:
Code | Language |
---|---|
auto | Automatic language recognition |
de-DE | German |
en-US | English |
es-ES | Spanish |
fi-FI | Finnish |
fr-FR | French |
he-HE | Hebrew |
it-IT | Italian |
kk-KZ | Kazakh |
nl-NL | Dutch |
pl-PL | Polish |
pt-PT | Portuguese |
pt-BR | Brazilian Portuguese |
ru-RU | Russian (default) |
sv-SE | Swedish |
tr-TR | Turkish |
uz-UZ | Uzbek (Latin script) |
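
The codes in the table can be kept as a lookup on the client side so that an unsupported value is caught before a request is sent. The following is a minimal sketch under stated assumptions: the names `SUPPORTED_LANGUAGES`, `DEFAULT_LANGUAGE`, and `resolve_language` are hypothetical and not part of the SpeechKit API, and the exact, case-sensitive match is an assumption.

```python
from typing import Optional

# Codes and names copied from the table above.
SUPPORTED_LANGUAGES = {
    "auto": "Automatic language recognition",
    "de-DE": "German",
    "en-US": "English",
    "es-ES": "Spanish",
    "fi-FI": "Finnish",
    "fr-FR": "French",
    "he-HE": "Hebrew",
    "it-IT": "Italian",
    "kk-KZ": "Kazakh",
    "nl-NL": "Dutch",
    "pl-PL": "Polish",
    "pt-PT": "Portuguese",
    "pt-BR": "Brazilian Portuguese",
    "ru-RU": "Russian",
    "sv-SE": "Swedish",
    "tr-TR": "Turkish",
    "uz-UZ": "Uzbek (Latin script)",
}

# The table marks Russian as the default language.
DEFAULT_LANGUAGE = "ru-RU"


def resolve_language(code: Optional[str] = None) -> str:
    """Return a language code from the table, or the default if none is given."""
    if code is None:
        return DEFAULT_LANGUAGE
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return code
```

Calling `resolve_language()` with no argument falls back to `ru-RU`, and a value that is not in the table, such as `en-GB`, raises `ValueError`.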
Recognition accuracy
In Yandex DataSphere, you can assess the recognition quality of a SpeechKit model yourself using your own data.