© 2025 Direct Cursus Technology L.L.C.

Speech recognition using Playground

Written by
Yandex Cloud
Updated on December 12, 2025

To recognize speech from an audio file via the SpeechKit Playground:

Management console
  1. In the management console, select the folder you are going to use to work with SpeechKit.

  2. Go to SpeechKit.

  3. In the left-hand panel, select SpeechKit Playground.

  4. Navigate to the Speech recognition tab.

  5. Under Recognition parameters, configure the following settings:

    • Language: Select the language or leave Automatic.
    • Text normalization: Presents dates and times in numerical format, converts numbers from text to digits, and provides access to additional settings.
    • Profanity filter: Masks profanity.
    • Literature text: Adds capital letters and punctuation marks.
    • Speaker recognition: Attributes each recognized phrase to a particular speaker.
    • Grouping speaker phrases: Divides phrases into two groups by speaker.
  6. Click Select file or drag the audio file to the loading area.

    Tip

    Convert the file to a supported audio format beforehand: MP3, WAV, or OGG with the OPUS audio codec. Maximum file size: 60 MB.

  7. Optionally, configure Classifiers. A classifier finds phrases of a given category in the text, e.g., greetings, negative or obscene language. Classifiers work only for Russian.

  8. Optionally, configure Result processing to post-process the recognition results with an LLM:

    • Model: Select a model for processing. The processing cost depends on the model you select.
    • Instructions:
      • Enter a prompt in the input field or select a ready-made one.
      • Result format: Specify your preferred recognition result format.
      • Add instructions: Add another instruction. You can add up to five instructions in total.
  9. Click Start recognition to start speech recognition for the audio file.

    Recognition may take from a few seconds to a few minutes depending on the audio file size.

  10. Click View code to get the request code for Python REST or Python gRPC.
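
The tip in step 6 names two constraints on the uploaded file: a supported container (MP3, WAV, or OGG) and a maximum size of 60 MB. A minimal sketch of a local pre-upload check might look like the one below. The function name `check_audio_file` is hypothetical, and treating 60 MB as 60 × 1024 × 1024 bytes is an assumption, since the guide does not say which definition of a megabyte applies. The check looks only at the file extension, so it cannot confirm that an OGG file actually uses the OPUS codec.

```python
from pathlib import Path

# Extensions matching the formats listed in the tip (MP3, WAV, OGG).
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg"}

# Assumption: "60 MB" is read as 60 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 60 * 1024 * 1024


def check_audio_file(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the file looks OK.

    Hypothetical helper: it checks only the extension and the size on disk,
    not the codec inside the container.
    """
    file = Path(path)
    if not file.is_file():
        return [f"{path}: file not found"]

    problems = []
    if file.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"{path}: unsupported extension '{file.suffix}'")
    if file.stat().st_size > MAX_SIZE_BYTES:
        problems.append(f"{path}: larger than {MAX_SIZE_BYTES} bytes")
    return problems
```

For example, `check_audio_file("call.flac")` on an existing file would report an unsupported extension, which signals that the file should be converted before it is dragged to the loading area.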

SpeechKit Playground features basic speech recognition options. For more flexible recognition settings, use the API.
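
As a rough illustration of what an API call can look like, the sketch below builds (but does not send) a request to the synchronous recognition endpoint of the SpeechKit API v1. It is not necessarily the code that View code generates, and the Playground may rely on a different API version. The helper name `build_recognition_request` is hypothetical, API key authorization and OGG/OPUS input are assumptions, and the synchronous endpoint accepts only short audio, so longer files need a different recognition method.

```python
import urllib.parse
import urllib.request

# Synchronous recognition endpoint of the SpeechKit API v1.
STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_recognition_request(
    audio: bytes,
    api_key: str,
    lang: str = "en-US",
    profanity_filter: bool = False,
) -> urllib.request.Request:
    """Build a POST request carrying raw audio bytes in the body.

    Hypothetical helper. Assumes the audio is OGG with the OPUS codec and
    that a service account API key is used for authorization.
    """
    query = urllib.parse.urlencode(
        {
            "lang": lang,
            "format": "oggopus",
            "profanityFilter": "true" if profanity_filter else "false",
        }
    )
    return urllib.request.Request(
        f"{STT_URL}?{query}",
        data=audio,
        headers={"Authorization": f"Api-Key {api_key}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The `profanityFilter` parameter here plays the same role as the Profanity filter option in the Playground.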
