SpeechKit tutorials
Speech recognition
Streaming recognition
- Audio file streaming recognition using the API v3: The example uses the Russian language, 8,000 Hz LPCM streaming audio from file, single audio channel. Profanity filter enabled in recognition settings.
- Microphone speech streaming recognition using the API v3: The example uses the Russian language, 8,000 Hz LPCM audio, single audio channel. Profanity filter enabled.
- Streaming speech recognition with auto language detection: The example uses 8,000 Hz LPCM audio, single audio channel.
- Example use of streaming recognition with API v2: The example uses the Russian language, 8,000 Hz LPCM audio. Profanity and intermediate result filters enabled.
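The streaming examples above all work with 8,000 Hz single-channel LPCM audio that is sent in pieces rather than as one file. As a minimal sketch of that preparation step only, the following splits raw LPCM data into fixed-duration chunks. It does not call the SpeechKit API; the function name `lpcm_chunks`, the 200 ms chunk length, and the 16-bit sample width are assumptions made for illustration and are not taken from the tutorials.

```python
import io
from typing import BinaryIO, Iterator

SAMPLE_RATE_HZ = 8000   # sample rate used by the streaming examples
CHANNELS = 1            # single audio channel
BYTES_PER_SAMPLE = 2    # assumption: 16-bit samples


def lpcm_chunks(stream: BinaryIO, chunk_ms: int = 200) -> Iterator[bytes]:
    """Yield consecutive chunks of raw LPCM audio, each up to chunk_ms long."""
    chunk_size = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * chunk_ms // 1000
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield data


if __name__ == "__main__":
    # One second of silence stands in for real audio read from a file.
    one_second = io.BytesIO(bytes(SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE))
    print([len(chunk) for chunk in lpcm_chunks(one_second)])
```

Each yielded chunk could then be passed to whichever streaming client the chosen tutorial uses; the last chunk may be shorter than the others.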
Synchronous recognition
Example use of the synchronous recognition API: The example uses the Russian language and other parameters with their default values.
Asynchronous recognition
- Asynchronously recognizing audio files in LPCM format: The example uses the Russian language, the `general:rc` language model, 8,000 Hz LPCM audio, single audio channel.
- Asynchronously recognizing audio files in OggOpus format: The example uses the Russian language and other parameters with their default values.
- Regular asynchronous recognition of audio files from Yandex Object Storage: The example uses the Russian language and the `general` model. Speech is recognized from audio files of any supported format.
Synthesis
- Speech synthesis in the API v3: The example uses 22,050 Hz LPCM audio, WAV container, and LUFS loudness normalization.
- Pattern-based speech synthesis using the API v3: The example uses pattern-based synthesis for SpeechKit Brand Voice Self Service and SpeechKit Brand Voice Premium voices.
- Pattern-based speech synthesis in SpeechKit Brand Voice Call Center: The example uses pattern-based synthesis for SpeechKit Brand Voice Call Center voices.
- Speech synthesis in WAV format using the API v1: The example uses the Russian language, 48,000 Hz LPCM audio, WAV container, and the `filipp` voice.
- Speech synthesis in OggOpus format using the API v1: The example uses the Russian language and the `filipp` voice.
- Speech synthesis from SSML text using API v1: The example uses the Russian language and the `jane` voice.
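
Several synthesis examples above pair LPCM audio with a WAV container. As a rough sketch of what that pairing means, the following wraps raw LPCM bytes in a WAV header using Python's standard `wave` module. It does not call the SpeechKit API, and the linked tutorials may use a different tool for this step; the function name `lpcm_to_wav` and the 16-bit mono defaults are assumptions made for illustration.

```python
import io
import wave


def lpcm_to_wav(
    pcm: bytes,
    sample_rate_hz: int = 48000,  # rate used by the API v1 WAV example
    channels: int = 1,            # assumption: mono audio
    sample_width: int = 2,        # assumption: 16-bit samples
) -> bytes:
    """Return the given raw LPCM bytes wrapped in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


if __name__ == "__main__":
    # One second of silence stands in for synthesized audio.
    wav_bytes = lpcm_to_wav(bytes(48000 * 2))
    with open("speech.wav", "wb") as out_file:
        out_file.write(wav_bytes)
```

For audio at the 22,050 Hz rate mentioned in the API v3 example, the same function could be called with `sample_rate_hz=22050`.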