© 2025 Direct Cursus Technology L.L.C.

In this article:

  • Version tags
  • Supported recognition languages
  • Automatic language detection
  • Recognition accuracy
  • Use cases

Supported languages and recognition models

Written by Yandex Cloud
Improved by amatol
Updated on April 30, 2025

A recognition model is a model trained to recognize speech in a particular language. The models are trained on datasets generated by Yandex services and applications. This allows us to continually improve speech recognition quality.

The main supported model for each recognition type is the general model. It recognizes speech on any topic in a given language, including short and long utterances, names, addresses, dates, and numbers.

Version tags

Up to three versions of the general model can be available at the same time. Select the one you need by its tag:

  • general: The main version of the model.
  • general:rc: The release candidate version available for testing.
  • general:deprecated: The previous version of the model.

Note

Support for the general:deprecated version ends as new models are released. SpeechKit guarantees two weeks of support for the previous version after the general tag version is updated. For the list of updates, see Yandex SpeechKit release notes: Speech recognition.

You can also use the deferred-general tag for asynchronous recognition with the API v2. Learn more about asynchronous recognition modes.
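As an illustration of the tag syntax above, here is a minimal sketch of a helper that builds a model name from a version tag. The helper name model_with_tag and the client-side validation are assumptions made for this example; they are not part of the SpeechKit API.

```python
# Hypothetical helper for illustration only; not part of any SpeechKit SDK.
# The tag names come from the list above; "" stands for the main version.
KNOWN_TAGS = {"", "rc", "deprecated"}


def model_with_tag(tag: str = "") -> str:
    """Return the general model name for the given version tag."""
    tag = tag.strip()
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown version tag: {tag!r}")
    # The tag is appended to the model name after a colon.
    return f"general:{tag}" if tag else "general"
```

The resulting string is what you would pass as the model name in a recognition request; where exactly it goes depends on the API version, so check the API reference.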

Supported recognition languages

Use a recognition language code from the table below. Code values are case-insensitive.

Code     Language
auto     Automatic language recognition
de-DE    German
en-US    English
es-ES    Spanish
fi-FI    Finnish
fr-FR    French
he-IL    Hebrew
it-IT    Italian
kk-KZ    Kazakh
nl-NL    Dutch
pl-PL    Polish
pt-PT    Portuguese
pt-BR    Brazilian Portuguese
ru-RU    Russian (default)
sv-SE    Swedish
tr-TR    Turkish
uz-UZ    Uzbek (Latin script)
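Because the codes are case-insensitive, the service does not require any particular casing. If you want to validate user input on the client before sending a request, a sketch like the following could be used. The function name normalize_language_code is hypothetical, and the code list is copied from the table above, so it needs updating whenever the table changes.

```python
# Client-side validation sketch; the service itself accepts any casing.
SUPPORTED_CODES = (
    "auto", "de-DE", "en-US", "es-ES", "fi-FI", "fr-FR", "he-IL", "it-IT",
    "kk-KZ", "nl-NL", "pl-PL", "pt-PT", "pt-BR", "ru-RU", "sv-SE", "tr-TR",
    "uz-UZ",
)

# Map the lowercase form of each code to its spelling in the table.
_CANONICAL = {code.lower(): code for code in SUPPORTED_CODES}


def normalize_language_code(code: str) -> str:
    """Return the table spelling of a code, ignoring case and outer spaces."""
    key = code.strip().lower()
    if key not in _CANONICAL:
        raise ValueError(f"unsupported language code: {code!r}")
    return _CANONICAL[key]
```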

Automatic language detection

During speech recognition, SpeechKit automatically detects the language of each sentence.

To configure automatic language detection, set the language_code parameter of the LanguageRestrictionOptions message to auto:

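Python 3
# Generated gRPC stubs for the API v3 (import path assumed; adjust to your setup).
import yandex.cloud.ai.stt.v3.stt_pb2 as stt_pb2

# Allow only the listed languages; 'auto' enables automatic detection.
language_restriction = stt_pb2.LanguageRestrictionOptions(
    restriction_type=stt_pb2.LanguageRestrictionOptions.WHITELIST,
    language_code=['auto'],
)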

Along with recognition results, the service returns language labels containing the language code and probability of its correct detection:

language_code: "ru-RU" probability: 0.91582357883453369
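To show how such labels might be consumed, here is a minimal sketch that picks the most probable language from a list of labels. The LanguageLabel class is a hypothetical stand-in that mirrors only the two fields shown above, and the 0.5 threshold is an arbitrary assumption, not a value recommended by SpeechKit.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LanguageLabel:
    """Hypothetical stand-in for a language label returned by the service."""
    language_code: str
    probability: float


def most_likely_language(
    labels: Iterable[LanguageLabel], min_probability: float = 0.5
) -> Optional[str]:
    """Return the code of the most probable label, or None if too uncertain."""
    labels = list(labels)
    if not labels:
        return None
    best = max(labels, key=lambda label: label.probability)
    # Treat low-confidence detections as "unknown" (threshold is an assumption).
    return best.language_code if best.probability >= min_probability else None
```

Returning None for low-confidence labels lets the caller fall back to a default language instead of trusting an uncertain detection.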

If a sentence contains words in different languages, the language may be detected incorrectly. To improve results, provide a list of expected languages as a hint for the model. For example:

Python 3
...
      language_code=['auto', 'en-US', 'es-ES', 'fr-FR']
...
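If the expected languages come from configuration or user input, the list for language_code can be assembled programmatically. The sketch below is one way to do it; the function name language_hints is hypothetical, and placing auto first simply follows the example above rather than a documented requirement.

```python
from typing import Iterable, List


def language_hints(expected: Iterable[str]) -> List[str]:
    """Build a language_code list: 'auto' followed by unique expected codes."""
    codes = ["auto"]
    for code in expected:
        # Skip a repeated 'auto' and any code already in the list.
        if code.lower() != "auto" and code not in codes:
            codes.append(code)
    return codes
```

For example, language_hints(['en-US', 'es-ES', 'fr-FR']) produces the same list as in the snippet above.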

Note

Language detection and language labels are available only in the gRPC API v3.

Examples

Text in audio: Xiaomi is a Chinese brand
Transcript: shumi is a chinese brand

Text in audio: Привет is hi in Russian
Transcript: privet is hi in russian

Text in audio: Men koʻchada sayr qilishni va muzqaymoq isteʼmol qilishni yaxshi koʻraman, I like to take a walk outside and have some ice cream
Transcript: Men koʻchada sayr qilishni va muzqaymoq isteʼmol qilishni yaxshi koʻraman, I like to take a walk outside and have some ice cream

Recognition accuracy

You can assess the recognition quality of a SpeechKit model on your own data in Yandex DataSphere.

Use cases

  • Developing a Telegram bot for text recognition in images, audio synthesis and recognition
  • Streaming speech recognition with auto language detection in the API v3
  • Audio file streaming recognition using the API v3
  • Microphone speech streaming recognition using the API v3

See also

  • Supported audio formats
  • Streaming speech recognition
  • Synchronous audio recognition
  • Asynchronous recognition
  • Extending a speech recognition model
  • Streaming speech recognition with auto language detection in the API v3
