Yandex project
© 2025 Yandex.Cloud LLC

In this article:

  • Getting started
  • Create an application for synchronous speech recognition

Synchronous speech recognition using the Python SDK

Written by
Yandex Cloud
Updated at March 28, 2025

Below, we provide an example of synchronous speech recognition from an audio file using the SpeechKit Python SDK. This example uses the following parameters:

  • Recognition model: general
  • Language: Russian

To use the Python SDK, the yandex-speechkit package is required.
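
Before running the example, it can help to confirm that the package is visible to the interpreter you plan to use. The following is a minimal sketch that relies only on the standard library. The `sdk_status` helper is a hypothetical name introduced for illustration. The `speechkit` import name and the `yandex-speechkit` package name are taken from this article.

```python
import importlib.metadata
import importlib.util


def sdk_status(module_name="speechkit", dist_name="yandex-speechkit"):
    """Return (installed, version) for the SDK; version is None if unknown."""
    # find_spec returns None when the top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        return False, None
    try:
        return True, importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        # The module is importable, but no distribution metadata exists under this name.
        return True, None


if __name__ == "__main__":
    installed, version = sdk_status()
    print(f"speechkit installed: {installed}, version: {version}")
```

If the check reports that the package is missing, install it as described in the steps below.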

Authentication is performed under a service account using an API key or IAM token. Learn more about authentication in the SpeechKit API.
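
To avoid hard-coding the API key in source files, you could read it from an environment variable. This is a sketch under assumptions: the `SPEECHKIT_API_KEY` variable name and both helper functions are hypothetical and not part of the SDK. Only `configure_credentials` and `creds.YandexCredentials(api_key=...)` come from the example in this article.

```python
import os


def load_api_key(env_var="SPEECHKIT_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    # The variable name is an assumption made for this sketch.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key


def configure_from_env(env_var="SPEECHKIT_API_KEY"):
    """Configure the SDK with the API key taken from the environment."""
    # Imported lazily so that load_api_key works even without the SDK installed.
    from speechkit import configure_credentials, creds

    configure_credentials(
        yandex_credentials=creds.YandexCredentials(api_key=load_api_key(env_var))
    )
```

Calling `configure_from_env()` once at startup would then replace the inline `api_key='<API_key>'` value.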

Getting started

  1. Create a service account and assign the ai.speechkit-stt.user role to it.
  2. Get an API key for the service account and save it.
  3. Download a sample audio file for recognition or generate your own.

Create an application for synchronous speech recognition

Python 3
  1. Install the yandex-speechkit package using the pip package manager:

    pip install yandex-speechkit
    

    The installation was tested on Python 3.9. For the minimum allowed Python version, see the SDK website.

    If a grpcio-tools package version conflict occurs, see Resolving version conflicts during the installation of Python SDK.

  2. Create a file named test.py and add the following code to it:

    from argparse import ArgumentParser
    from speechkit import model_repository, configure_credentials, creds
    from speechkit.stt import AudioProcessingType
    
    # Authentication with an API key.
    configure_credentials(
       yandex_credentials=creds.YandexCredentials(
          api_key='<API_key>'
       )
    )
    
    def recognize(audio):
       model = model_repository.recognition_model()
    
       # Specify the recognition settings.
       model.model = 'general'
       model.language = 'ru-RU'
       model.audio_processing_type = AudioProcessingType.Full
    
       # Recognizing speech in the specified audio file and outputting the results to the console.
       result = model.transcribe_file(audio)
       for c, res in enumerate(result):
          print('=' * 80)
          print(f'channel: {c}\n\nraw_text:\n{res.raw_text}\n\nnorm_text:\n{res.normalized_text}\n')
          if res.has_utterances():
             print('utterances:')
             for utterance in res.utterances:
                print(utterance)
    
    if __name__ == '__main__':
       parser = ArgumentParser()
       parser.add_argument('--audio', type=str, help='audio path', required=True)
    
       args = parser.parse_args()
    
       recognize(args.audio)
    

    Where:

    • api_key: Service account API key.

    • audio: Audio recording file path.

    • model: Recognition model.

    • language: Recognition language.

    • audio_processing_type: Audio processing method.

      The Python SDK does not support streaming or asynchronous recognition, but you can simulate both by setting the audio_processing_type parameter in test.py to one of the following values:

      • AudioProcessingType.Stream for streaming recognition.
      • AudioProcessingType.Full for asynchronous recognition.
  3. Run the created file:

    python3 test.py --audio speech.pcm
    

    Where --audio is the path to the audio file for recognition.

    The output contains the recognized speech:

    channel: 0
    
    raw_text:
    i'm yandex speechkit i can turn any text into speech now you can too
    
    norm_text:
    I'm Yandex SpeechKit, I can turn any text into speech, now you can too.
    
    utterances:
    - I'm Yandex SpeechKit, I can turn any text into speech, now you can too. [0.419, 6.379]
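
    If you need the results as data rather than console output, the same fields that test.py prints can be collected into dictionaries. This is a minimal sketch: `collect_results` and `FakeResult` are hypothetical names, and `FakeResult` only imitates the attributes used in test.py (`raw_text`, `normalized_text`, `utterances`, `has_utterances()`) so that the code runs without the SDK.

```python
def collect_results(result):
    """Convert per-channel recognition results into plain dictionaries."""
    collected = []
    for channel, res in enumerate(result):
        collected.append({
            "channel": channel,
            "raw_text": res.raw_text,
            "normalized_text": res.normalized_text,
            # Utterances are converted to strings, as test.py does when printing them.
            "utterances": [str(u) for u in res.utterances] if res.has_utterances() else [],
        })
    return collected


class FakeResult:
    """Hypothetical stand-in for an SDK result, used only to demonstrate the function."""

    def __init__(self, raw_text, normalized_text, utterances):
        self.raw_text = raw_text
        self.normalized_text = normalized_text
        self.utterances = utterances

    def has_utterances(self):
        return bool(self.utterances)


if __name__ == "__main__":
    demo = [FakeResult("hello world", "Hello, world.", ["- Hello, world. [0.1, 1.2]"])]
    print(collect_results(demo))
```

    In test.py, you could pass the value returned by `model.transcribe_file(audio)` to `collect_results` instead of printing each field.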
    

See also

  • Python SDK SpeechKit
  • Example of using the API v1 for synchronous recognition
