Yandex project
© 2025 Yandex.Cloud LLC
SpeechKit pricing policy

Written by Yandex Cloud
Updated on May 5, 2025
  • What goes into the cost of using SpeechKit
    • Using speech synthesis
    • Using speech recognition
  • Prices for the Russia region
    • Speech synthesis
    • SpeechKit Brand Voice
    • Speech recognition
  • Examples of cost calculation
    • Speech synthesis using API v1
    • Speech synthesis using API v3
    • Streaming speech recognition
    • Synchronous speech recognition
    • Asynchronous speech recognition
    • Asynchronous speech recognition in deferred mode

Tip

To calculate the cost of speech synthesis and recognition, use the calculator on the Yandex Cloud website or see the pricing data in this section.

Prices for service products are also available in the Price list.

What goes into the cost of using SpeechKit

Using speech synthesis

The cost of using SpeechKit for speech synthesis depends on the version of the API you use.

API v1

For the API v1, the cost is calculated based on the total number of characters sent to generate speech from text in a calendar month (Reporting period).

API v3

The cost of using the API v3 depends on the number of synthesis requests sent. The cost is calculated for a calendar month (Reporting period).

By default, a speech synthesis request is limited to 250 characters and 24 seconds. To synthesize longer phrases, you can use unsafe_mode. In this case, you are charged per 250 characters, e.g.:

  • A request shorter than 250 characters is billed as one billing unit.
  • A request from 250 to 500 characters long is billed as two billing units.
  • A request from 500 to 750 characters long is billed as three billing units.
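The per-250-character rule above can be sketched in a few lines. This is an illustration only, not SpeechKit's billing code: the function name `v3_billing_units` is hypothetical, and the text does not spell out how a request of exactly 250 or 500 characters is counted, so the ceiling-based rounding used here is an assumption.

```python
import math

UNIT_CHARS = 250  # characters per billing unit, as stated in the text above


def v3_billing_units(char_count: int) -> int:
    """Estimate billing units for one API v3 synthesis request.

    Assumption: units = ceil(char_count / 250), with a minimum of one unit.
    Behavior at exact multiples of 250 is not specified in the source text.
    """
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    return max(1, math.ceil(char_count / UNIT_CHARS))


# Request lengths taken from the cost calculation example in this article.
print([v3_billing_units(n) for n in (150, 300, 600)])  # [1, 2, 3]
```

The minimum of one unit also covers an empty request, which this article says is billed as a single billing unit.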

Empty request

The character count of a request includes spaces and special characters. The cost of an empty request depends on the API version:

  • An empty request to the API v1 is billed as a single character.
  • An empty request to the API v3 is billed as a single billing unit.

Internal server errors

You are not charged for a request that fails due to an internal server error.

Using speech recognition

The cost of using SpeechKit for speech recognition depends on the recognition type and duration of a recognized audio fragment. The cost is calculated for a calendar month (Reporting period).

Streaming speech recognition

The cost of using SpeechKit streaming recognition is calculated based on the pricing rules for synchronous recognition.

Synchronous recognition

These rules apply to synchronous recognition and streaming mode recognition when using the API v2 and API v3.

The billing unit is a 15-second segment of a single-channel audio file. Shorter segments are rounded up (1 second becomes 15 seconds).

Warning

In streaming mode, billing begins as soon as you send a message with recognition settings. If you do not send any audio after this message, it will be treated as one consumed billing unit.

Examples:

  • One audio fragment that is 37 seconds long is billed as 45 seconds.

    Explanation: The audio is divided into two 15-second segments and one 7-second segment. The length of the last segment is rounded up to 15 seconds. Thus, we have three segments, 15 seconds each.

  • Two audio fragments that are 5 and 8 seconds long are billed as 30 seconds.

    Explanation: The length of each audio fragment is rounded up to 15 seconds. Thus, we have two segments, 15 seconds each.
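The 15-second rounding in the examples above can be reproduced with a short sketch. The function names are hypothetical and the code only mirrors the rule as described here (each fragment is rounded up separately, and an empty request counts as one unit); it is not an official calculator.

```python
import math

SEGMENT_SECONDS = 15  # billing unit for synchronous and streaming recognition


def sync_billing_units(duration_seconds: float) -> int:
    """Billing units for one single-channel fragment, rounded up per fragment."""
    # A minimum of one unit reflects the rule that an empty request is billed as one unit.
    return max(1, math.ceil(duration_seconds / SEGMENT_SECONDS))


def billed_seconds(durations: list[float]) -> int:
    """Total billed seconds when every fragment is rounded up on its own."""
    return sum(sync_billing_units(d) for d in durations) * SEGMENT_SECONDS


print(billed_seconds([37]))    # 45
print(billed_seconds([5, 8]))  # 30
```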

Asynchronous recognition

These rules apply when using asynchronous recognition.

The billing unit is a one-second segment of two-channel audio. Shorter segments are rounded up. The number of channels is rounded up to an even number.

The minimum billable amount is 15 seconds for every pair of channels. Shorter audio fragments are billed as 15 seconds.

Examples of rounding the length of audio:

Length         Number of channels   Seconds charged
1 second       1                    15
1 second       2                    15
1 second       3                    30
15.5 seconds   2                    16
15.5 seconds   4                    32
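The table can be reproduced with a minimal sketch of the asynchronous rounding rules: channels are grouped into pairs (rounded up), the duration is rounded up to whole seconds, and each pair is billed for at least 15 seconds. The function name `async_billed_seconds` is hypothetical.

```python
import math

MIN_SECONDS = 15  # minimum billable duration per pair of channels


def async_billed_seconds(duration_seconds: float, channels: int) -> int:
    """Seconds charged for one fragment in asynchronous recognition (sketch)."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    channel_pairs = math.ceil(channels / 2)  # channel count rounded up to an even number
    seconds = max(MIN_SECONDS, math.ceil(duration_seconds))  # whole seconds, at least 15
    return channel_pairs * seconds


rows = [(1, 1), (1, 2), (1, 3), (15.5, 2), (15.5, 4)]
print([async_billed_seconds(d, c) for d, c in rows])  # [15, 15, 30, 16, 32]
```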

Empty request

The cost of an empty request to any type of speech recognition is equal to that of a single billing unit.

Internal server errors

You are not charged for a request that fails due to an internal server error.

Prices for the Russia region

Note

Prices for Yandex Cloud resources vary based on the region. For more information about the available regions, see Regions.

The currency you can use to pay for the resources depends on the legal entity you have an agreement with. For more information on creating an account, see Registering an account in Yandex Cloud.

Speech synthesis

Prices in RUB

Service                                                   Price per unit, including VAT
Speech synthesis using API v1, for 1 million characters   ₽1320.00
Speech synthesis using API v3, per request                ₽0.16

Prices in KZT

Service                                                   Price per unit, including VAT
Speech synthesis using API v1, for 1 million characters   ₸6600.00
Speech synthesis using API v3, per request                ₸0.80

SpeechKit Brand Voice

Prices in RUB

Service                                                        Price per unit, including VAT
SpeechKit Brand Voice Self Service model hosting, per month    ₽240,000
SpeechKit Brand Voice Premium model hosting, per month         Contact us
Request to SpeechKit Brand Voice Call Center model             ₽0.16
Request to SpeechKit Brand Voice Self Service model            ₽0.16
Request to SpeechKit Brand Voice Premium model                 ₽0.16

Prices in KZT

Service                                                        Price per unit, including VAT
SpeechKit Brand Voice Self Service model hosting, per month    ₸1,200,000
SpeechKit Brand Voice Premium model hosting, per month         Contact us
Request to SpeechKit Brand Voice Call Center model             ₸0.80
Request to SpeechKit Brand Voice Self Service model            ₸0.80
Request to SpeechKit Brand Voice Premium model                 ₸0.80

Speech recognition

Prices in RUB

Service                                                 Price for 15 seconds of audio, including VAT
Streaming recognition                                   ₽0.16
Synchronous file recognition                            ₽0.16
Asynchronous file recognition*                          ₽0.15
Asynchronous file recognition, deferred mode model*     ₽0.0375

Prices in KZT

Service                                                 Price for 15 seconds of audio, including VAT
Streaming recognition                                   ₸0.80
Synchronous file recognition                            ₸0.80
Asynchronous file recognition*                          ₸0.75
Asynchronous file recognition, deferred mode model*     ₸0.1875

* Per-second billing starts from the 16th second.

Examples of cost calculation

Speech synthesis using API v1

This example calculates the cost of speech synthesis using the API v1 with the following parameters:

  • Number of characters sent per month: 2,023.

Calculating cost in RUB

2,023 × (₽1320.00 / 1,000,000) = ₽2.67

Total: ₽2.67

Where:

  • ₽1320.00: Cost per one million characters.
  • ₽1320.00 / 1,000,000: Cost per one character.

Calculating cost in KZT

2,023 × (₸6600.00 / 1,000,000) = ₸13.35

Total: ₸13.35

Where:

  • ₸6600.00: Cost per one million characters.
  • ₸6600.00 / 1,000,000: Cost per one character.
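The same arithmetic can be checked with a small sketch. The function name `v1_cost` is hypothetical, and rounding the total half-up to two decimal places is an assumption made to match the figures above; the article does not state the rounding rule used on invoices.

```python
from decimal import Decimal, ROUND_HALF_UP


def v1_cost(char_count: int, price_per_million: Decimal) -> Decimal:
    """Estimated API v1 synthesis cost for a monthly character total (sketch)."""
    cost = Decimal(char_count) * price_per_million / Decimal(1_000_000)
    # Assumption: the total is rounded to two decimal places.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(v1_cost(2023, Decimal("1320.00")))  # 2.67  (RUB)
print(v1_cost(2023, Decimal("6600.00")))  # 13.35 (KZT)
```

`Decimal` is used instead of `float` so that the per-character price is not distorted by binary floating-point representation.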

Speech synthesis using API v3

This example calculates the cost of speech synthesis using the API v3 with the following parameters:

  • Number of requests sent: 3.
  • Number of characters in requests: 150, 300, 600.

Calculating cost in RUB

(1 + 2 + 3) × ₽0.16 = ₽0.96

Total: ₽0.96

Where:

  • 1 is the number of billing units charged for the first request of 150 characters.
  • 2 is the number of billing units charged for the second request of 300 characters made using unsafe_mode.
  • 3 is the number of billing units charged for the third request of 600 characters made using unsafe_mode.
  • ₽0.16: Cost per billing unit.

Calculating cost in KZT

(1 + 2 + 3) × ₸0.80 = ₸4.80

Total: ₸4.80

Where:

  • 1 is the number of billing units charged for the first request of 150 characters.
  • 2 is the number of billing units charged for the second request of 300 characters made using unsafe_mode.
  • 3 is the number of billing units charged for the third request of 600 characters made using unsafe_mode.
  • ₸0.80: Cost per billing unit.
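As a rough cross-check of this example, the sketch below sums billing units over the three requests and multiplies by the per-request price. The function name `v3_cost` is hypothetical, and counting units as ceil(characters / 250) with a minimum of one is an assumption; the requests here avoid exact multiples of 250, where the text is not explicit.

```python
import math
from decimal import Decimal


def v3_cost(char_counts: list[int], price_per_unit: Decimal) -> Decimal:
    """Estimated API v3 synthesis cost for a list of request lengths (sketch)."""
    # Assumption: one billing unit per started block of 250 characters.
    units = sum(max(1, math.ceil(n / 250)) for n in char_counts)
    return units * price_per_unit


requests = [150, 300, 600]
print(v3_cost(requests, Decimal("0.16")))  # 0.96 (RUB)
print(v3_cost(requests, Decimal("0.80")))  # 4.80 (KZT)
```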

Streaming speech recognition

This example calculates the cost of streaming speech recognition with the following parameters:

  • Number of audio fragments: 2.
  • Duration of audio fragments: 5 seconds, 37 seconds.

Calculating cost in RUB

(1 + 3) × ₽0.16 = ₽0.64

Total: ₽0.64

Where:

  • 1 is the number of billing units charged for the first 5-second audio fragment rounded up to 15 seconds.
  • 3 is the number of billing units charged for the second 37-second audio fragment rounded up to 45 seconds.
  • ₽0.16: Cost per billing unit.

Calculating cost in KZT

(1 + 3) × ₸0.80 = ₸3.20

Total: ₸3.20

Where:

  • 1 is the number of billing units charged for the first 5-second audio fragment rounded up to 15 seconds with recognition settings message considered.
  • 3 is the number of billing units charged for the second 37-second audio fragment rounded up to 45 seconds with recognition settings message considered.
  • ₸0.80: Cost per billing unit.

Synchronous speech recognition

This example calculates the cost of synchronous speech recognition with the following parameters:

  • Number of audio fragments: 2.
  • Duration of audio fragments: 5 seconds, 37 seconds.

Calculating cost in RUB

(1 + 3) × ₽0.16 = ₽0.64

Total: ₽0.64

Where:

  • 1 is the number of billing units charged for the first 5-second audio fragment rounded up to 15 seconds.
  • 3 is the number of billing units charged for the second 37-second audio fragment rounded up to 45 seconds.
  • ₽0.16: Cost per billing unit.

Calculating cost in KZT

(1 + 3) × ₸0.80 = ₸3.20

Total: ₸3.20

Where:

  • 1 is the number of billing units charged for the first 5-second audio fragment rounded up to 15 seconds.
  • 3 is the number of billing units charged for the second 37-second audio fragment rounded up to 45 seconds.
  • ₸0.80: Cost per billing unit.
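Because streaming recognition follows the synchronous pricing rules, one sketch covers both this example and the streaming one. The function name `recognition_cost` is hypothetical; the code assumes single-channel fragments and rounds each fragment up to whole 15-second units before pricing.

```python
import math
from decimal import Decimal


def recognition_cost(durations: list[float], price_per_15_seconds: Decimal) -> Decimal:
    """Estimated synchronous or streaming recognition cost (sketch)."""
    # Each fragment is rounded up to whole 15-second billing units, at least one.
    units = sum(max(1, math.ceil(d / 15)) for d in durations)
    return units * price_per_15_seconds


fragments = [5, 37]
print(recognition_cost(fragments, Decimal("0.16")))  # 0.64 (RUB)
print(recognition_cost(fragments, Decimal("0.80")))  # 3.20 (KZT)
```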

Asynchronous speech recognition

This example calculates the cost of asynchronous speech recognition with the following parameters:

  • Number of audio fragments: 4.
  • Duration of audio fragments: 5 seconds, 5 seconds, 15.5 seconds, 15.5 seconds.
  • Number of channels in audio fragments: 1, 3, 2, 4.

Calculating cost in RUB

(15 + 30 + 16 + 32) × ₽0.01 = ₽0.93

Total: ₽0.93

Where:

  • 15 is the number of billing units charged for the first single-channel 5-second audio fragment rounded up to 2 channels and 15 seconds.
  • 30 is the number of billing units charged for the second 3-channel 5-second audio fragment rounded up to 4 channels and 15 seconds.
  • 16 is the number of billing units charged for the third 2-channel 15.5-second audio fragment rounded up to 16 seconds.
  • 32 is the number of billing units charged for the fourth 4-channel 15.5-second audio fragment rounded up to 16 seconds.
  • ₽0.01: Cost per billing unit (₽0.15 per 15 seconds / 15).

Calculating cost in KZT

(15 + 30 + 16 + 32) × ₸0.05 = ₸4.65

Total: ₸4.65

Where:

  • 15 is the number of billing units charged for the first single-channel 5-second audio fragment rounded up to 2 channels and 15 seconds.
  • 30 is the number of billing units charged for the second 3-channel 5-second audio fragment rounded up to 4 channels and 15 seconds.
  • 16 is the number of billing units charged for the third 2-channel 15.5-second audio fragment rounded up to 16 seconds.
  • 32 is the number of billing units charged for the fourth 4-channel 15.5-second audio fragment rounded up to 16 seconds.
  • ₸0.05: Cost per billing unit (₸0.75 per 15 seconds / 15).
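The calculation above can be sketched end to end: derive the per-second price from the listed price for 15 seconds, compute the billed seconds for each fragment, and multiply. The function names are hypothetical, and rounding the total to two decimal places is an assumption.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def async_billed_seconds(duration_seconds: float, channels: int) -> int:
    """Seconds charged: channel pairs × whole seconds, at least 15 per pair."""
    return math.ceil(channels / 2) * max(15, math.ceil(duration_seconds))


def async_cost(fragments: list[tuple[float, int]], price_per_15_seconds: Decimal) -> Decimal:
    """Estimated asynchronous recognition cost for (duration, channels) pairs."""
    price_per_second = price_per_15_seconds / Decimal(15)
    seconds = sum(async_billed_seconds(d, c) for d, c in fragments)
    # Assumption: the total is rounded to two decimal places.
    return (seconds * price_per_second).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


fragments = [(5, 1), (5, 3), (15.5, 2), (15.5, 4)]
print(async_cost(fragments, Decimal("0.15")))  # 0.93 (RUB)
print(async_cost(fragments, Decimal("0.75")))  # 4.65 (KZT)
```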

Asynchronous speech recognition in deferred mode

This example calculates the cost of asynchronous speech recognition in deferred mode with the following parameters:

  • Number of audio fragments: 3.
  • Duration of audio fragments: 2 seconds, 14 seconds, 19.5 seconds.
  • Number of channels in audio fragments: 2, 3, 4.

Calculating cost in RUB

(15 + 30 + 40) × ₽0.0025 = ₽0.21

Total: ₽0.21

Where:

  • 15 is the number of billing units charged for the first 2-channel 2-second audio fragment rounded up to 15 seconds.
  • 30 is the number of billing units charged for the second 3-channel 14-second audio fragment rounded up to 4 channels and 15 seconds.
  • 40 is the number of billing units charged for the third 4-channel 19.5-second audio fragment rounded up to 20 seconds.
  • ₽0.0025: Cost per billing unit (₽0.0375 per 15 seconds / 15).

Calculating cost in KZT

(15 + 30 + 40) × ₸0.0125 = ₸1.06

Total: ₸1.06

Where:

  • 15 is the number of billing units charged for the first 2-channel 2-second audio fragment rounded up to 15 seconds.
  • 30 is the number of billing units charged for the second 3-channel 14-second audio fragment rounded up to 4 channels and 15 seconds.
  • 40 is the number of billing units charged for the third 4-channel 19.5-second audio fragment rounded up to 20 seconds.
  • ₸0.0125: Cost per billing unit (₸0.1875 per 15 seconds / 15).
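In this example the exact product is not a whole number of kopecks or tiyn, so the sketch below prints both the unrounded and the rounded total. The function name `deferred_cost` is hypothetical, and the half-up rounding to two decimal places is an assumption chosen to match the totals shown above.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def deferred_cost(fragments: list[tuple[float, int]], price_per_15_seconds: Decimal):
    """Return (exact, rounded) cost for (duration, channels) fragments (sketch)."""
    # Billed seconds: channel pairs × whole seconds, at least 15 per pair.
    seconds = sum(math.ceil(c / 2) * max(15, math.ceil(d)) for d, c in fragments)
    exact = seconds * (price_per_15_seconds / Decimal(15))
    # Assumption: totals are rounded to two decimal places.
    return exact, exact.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


fragments = [(2, 2), (14, 3), (19.5, 4)]
print(deferred_cost(fragments, Decimal("0.0375")))  # exact 0.2125, rounded 0.21 (RUB)
print(deferred_cost(fragments, Decimal("0.1875")))  # exact 1.0625, rounded 1.06 (KZT)
```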
