Yandex AI Studio pricing policy
To estimate your service costs, see the pricing in this section.
The prices for service products are also available in the price list.
Note
The currency of service rates (prices) depends on the company you have a contract with:
- Prices in US dollars are applicable to customers of Iron Hive doo Beograd (Serbia) or Direct Cursus Technology L.L.C. (Dubai).
- Prices in Russian roubles are applicable to customers of Yandex.Cloud LLC.
All prices below do not include VAT.
Prices for the Russia region
Note
Yandex Cloud resources are priced differently in different regions. For more information about the available regions, see Regions.
Your payment currency is determined by your contracting legal entity. For more information on creating an account, see Registering an account in Yandex Cloud.
Model Gallery
The cost of using Model Gallery models depends on:
- Model's operating mode.
- Number of input and output tokens. The token count of the same text may vary from one model to the next.
Yandex Cloud Billing represents the use of Model Gallery models in billing units. The total number of billing units is rounded up to an integer.
| Number | Price, without VAT |
|---|---|
| 1,000 units | $0.001667 |
Using common instance models
| Model | Price per 1,000 input tokens, synchronous mode, without VAT | Price per 1,000 output tokens, synchronous mode, without VAT | Price per 1,000 input tokens, asynchronous mode, without VAT | Price per 1,000 output tokens, asynchronous mode, without VAT |
|---|---|---|---|---|
| Alice AI LLM | $0.004168 | $0.016670 | $0.002084 | $0.008335 |
| YandexGPT Pro 5.1 | $0.003334 ¹ | $0.003334 ¹ | $0.001667 ¹ | $0.001667 ¹ |
| YandexGPT Pro 5 | $0.010002 | $0.010002 | $0.005001 | $0.005001 |
| YandexGPT Lite | $0.001667 | $0.001667 | $0.000834 | $0.000834 |
| Qwen3 235B | $0.004168 ¹ | $0.004168 ¹ | — | — |
| gpt-oss-120b | $0.002501 | $0.002501 | — | — |
| gpt-oss-20b | $0.000834 | $0.000834 | — | — |
| Gemma3 27B | $0.003334 ¹ | $0.003334 ¹ | — | — |

¹ The price is based on the current 50% discount.
Cost calculation for a model in synchronous mode
Request parameters:
- Number of prompt tokens: 225
- Number of response tokens: 525
- Model: YandexGPT Lite
- Model operating mode: Synchronous
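The cost is calculated as follows:
- Number of prompt and response tokens: 225 + 525 = 750.
- Price per 1,000 tokens for the YandexGPT Lite model, synchronous mode: $0.001667.
Total: ($0.001667 / 1,000 tokens) × 750 tokens = $0.001250.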
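The synchronous-mode calculation above can be sketched in Python. This is a minimal illustration, not an official calculator: the function name `request_cost` and the six-decimal rounding are assumptions made for the example, and the prices are copied from the table of common instance models.

```python
from decimal import Decimal, ROUND_HALF_UP

def request_cost(prompt_tokens: int, response_tokens: int,
                 input_price_per_1k: Decimal,
                 output_price_per_1k: Decimal) -> Decimal:
    """Estimate one request's cost from per-1,000-token prices (USD, without VAT)."""
    raw = (prompt_tokens * input_price_per_1k
           + response_tokens * output_price_per_1k) / 1000
    # Assumption: round to six decimal places, the precision used in the price tables.
    return raw.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)

# YandexGPT Lite, synchronous mode: input and output tokens share one rate.
lite = Decimal("0.001667")
print(request_cost(225, 525, lite, lite))  # 0.001250
```

Passing the input and output prices separately lets the same function cover models whose input and output rates differ, such as Alice AI LLM.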
Cost calculation for a model in asynchronous mode
Request parameters:
- Number of prompt tokens: 115
- Number of response tokens: 1,500
- Model: YandexGPT Pro
- Model operating mode: Asynchronous
The cost is calculated as follows:
- Number of prompt and response tokens: 115 + 1,500 = 1,615.
- Price per 1,000 tokens for the YandexGPT Pro model, asynchronous mode: $0.005001.
- Number of units per token for the YandexGPT Pro model, asynchronous mode: 3.
- Total number of units in usage details: 1,615 × 3 = 4,845.
Total: ($0.005001 / 1,000 tokens) × 1,615 tokens = $0.008077.
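The relationship between tokens, billing units, and cost in this example can be checked with a short Python sketch. The helper names are hypothetical, and the rounding of the final amount to six decimal places is an assumption; the unit price and the units-per-token ratio come from the text above.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_HALF_UP

UNIT_PRICE_PER_1K = Decimal("0.001667")  # price of 1,000 billing units, without VAT

def billing_units(tokens: int, units_per_token: Decimal) -> int:
    """Convert tokens to billing units, rounding the total up to an integer."""
    return int((tokens * units_per_token).to_integral_value(rounding=ROUND_CEILING))

def units_cost(units: int) -> Decimal:
    """Price a number of billing units (assumed six-decimal rounding)."""
    raw = UNIT_PRICE_PER_1K * units / 1000
    return raw.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)

# YandexGPT Pro, asynchronous mode: 3 units per token.
units = billing_units(115 + 1500, Decimal("3"))
print(units)              # 4845
print(units_cost(units))  # 0.008077
```

Pricing the 4,845 units gives the same total as pricing the 1,615 tokens directly, because 3 units × $0.001667 equals the $0.005001 per-1,000-token rate.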
Using models in batch mode
In batch mode, each run is billed for a minimum of 200,000 tokens.
| Model | Price per 1,000 tokens, batch processing mode, without VAT |
|---|---|
| Qwen2.5 7B Instruct | $0.000834 |
| Qwen2.5 72B Instruct | $0.005001 |
| QwQ 32B Instruct | $0.003334 |
| Llama-3.3-70B-Instruct | $0.005001 |
| Llama-3.1-70B-Instruct | $0.005001 |
| DeepSeek-R1-Distill-Llama-70B | $0.005001 |
| Qwen2.5 32B Instruct | $0.003334 |
| DeepSeek-R1-Distill-Qwen-32B | $0.003334 |
| phi-4 | $0.001667 |
| Qwen2 VL 7B | $0.000834 |
| Qwen2.5 VL 7B | $0.000834 |
| DeepSeek 2 VL | $0.003334 |
| DeepSeek 2 VL Tiny | $0.000834 |
| Gemma3 1B it | $0.000834 |
| Gemma3 4B it | $0.000834 |
| Gemma3 12B it | $0.001667 |
| Gemma3 27B it | $0.003334 |
| Qwen 2.5 VL 32B Instruct | $0.003334 |
| Qwen3-0.6B | $0.000834 |
| Qwen3-1.7B | $0.000834 |
| Qwen3-4B | $0.000834 |
| Qwen3-8B | $0.000834 |
| Qwen3-14B | $0.001667 |
| Qwen3-32B | $0.003334 |
| Qwen3-30B-A3B | $0.003334 |
| Qwen3-235B-A22B | $0.050010 |
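The 200,000-token minimum for a batch run can be illustrated with a short Python sketch. It assumes that a run smaller than the minimum is billed as if it had processed exactly 200,000 tokens; that reading, like the function name `batch_run_cost`, is an assumption made for the example.

```python
from decimal import Decimal

MIN_BATCH_TOKENS = 200_000  # minimum billable volume per batch run

def batch_run_cost(tokens: int, price_per_1k: Decimal) -> Decimal:
    """Estimate the cost of one batch run (USD, without VAT)."""
    # Assumption: runs below the minimum are billed for the minimum volume.
    billable_tokens = max(tokens, MIN_BATCH_TOKENS)
    return price_per_1k * billable_tokens / 1000

qwen_7b = Decimal("0.000834")  # Qwen2.5 7B Instruct, per 1,000 tokens
print(batch_run_cost(50_000, qwen_7b))   # small run, billed at the minimum
print(batch_run_cost(300_000, qwen_7b))  # large run, billed for actual tokens
```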
Operation of dedicated instances
The cost of operating a dedicated instance depends on the model and the selected configuration. Dedicated instances are billed per second, rounded up to a whole billing unit. You are not charged for hardware maintenance or model deployment time.
The prices below are shown per hour of use.
The price of one billing unit for a dedicated instance is $0.0083333, without VAT.
| Model | Price per 1 hour, S configuration, without VAT | Price per 1 hour, M configuration, without VAT | Price per 1 hour, L configuration, without VAT |
|---|---|---|---|
| Qwen 2.5 VL 32B Instruct | $6.70 | $13.40 | $20.10 |
| Qwen 2.5 72B Instruct | $6.70 | $13.40 | $20.10 |
| Gemma 3 4B it | $3.35 | $6.70 | $10.05 |
| Gemma 3 12B it | $3.35 | $6.70 | $10.05 |
| T-pro-it-2.0-FP8 | $6.20 | $12.40 | $18.60 |
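Because the table lists hourly prices while billing is per second, a rough estimate for a shorter session can be obtained by prorating the hourly price. The sketch below is only an approximation under stated assumptions: partial seconds are rounded up, and the additional rounding up to whole billing units is not modeled. The function name is hypothetical.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

SECONDS_PER_HOUR = 3600

def dedicated_instance_cost(hourly_price: Decimal, seconds_used: float) -> Decimal:
    """Prorate an hourly price to the seconds used (USD, without VAT)."""
    # Assumption: a partial second is billed as a full second.
    billed_seconds = math.ceil(seconds_used)
    raw = hourly_price * billed_seconds / SECONDS_PER_HOUR
    # Rounding up to billing units is not modeled here.
    return raw.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)

# Qwen 2.5 72B Instruct, M configuration, 30 minutes of use.
print(dedicated_instance_cost(Decimal("13.40"), 1800))  # 6.700000
```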
Fine-tuning
At the Preview stage, you can fine-tune models free of charge. A fine-tuned YandexGPT Lite model will cost the same as the basic YandexGPT Lite model.
Text tokenization
Using the tokenizer (TokenizerService calls and Tokenizer methods) is free of charge.
Text vectorization
The cost of text vectorization (getting text embeddings) depends on the size of the text submitted for vectorization. Yandex Cloud Billing measures the creation of embeddings in vectorization units.
| Number | Price, without VAT |
|---|---|
| 1,000 units | $0.000083 |
| Model parameters | Number of units per token | Price per 1,000 tokens, without VAT |
|---|---|---|
| Embeddings | 1 | $0.000083 |
Cost calculation for text vectorization
The cost of vectorizing a text of 2,000 tokens will be:
- $0.000083: Cost of processing 1,000 tokens.
- $0.000083 / 1,000: Cost of processing one token.
2,000 × ($0.000083 / 1,000) = $0.000166
Total: $0.000166.
Text classification
The cost of text classification depends on the classification model you use and the number of tokens you provide.
- When classifying with YandexGPT Lite, a billing unit is a request of up to 1,000 tokens.
- When classifying with YandexGPT Pro and fine-tuned classifiers, a billing unit is a request of up to 250 tokens.
A request smaller than one billing unit is billed as one full unit. Larger texts are billed as multiple requests, with the number of requests rounded up.
For example, classifying a text of 770 tokens with YandexGPT Lite will be billed as a single request, i.e., as one billing unit.
The same 770-token text classified with YandexGPT Pro or a fine-tuned classifier will be billed as four requests.
| Service | Price, without VAT |
|---|---|
| 1 request (1,000 tokens) to classifier based on YandexGPT Lite | $0.001250 |
| 1 request (250 tokens) to classifier based on YandexGPT Pro | $0.001250 |
| 1 request (250 tokens) to tuned classifier | $0.001250 |
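The rounding rule for classification requests can be sketched in Python as follows. The dictionary keys and function names are illustrative; the request sizes and the per-request price are taken from the text and table above.

```python
from decimal import Decimal

# Tokens covered by one billed request, per classifier type.
TOKENS_PER_REQUEST = {
    "YandexGPT Lite": 1000,
    "YandexGPT Pro": 250,
    "tuned classifier": 250,
}
PRICE_PER_REQUEST = Decimal("0.001250")  # same price for all three, without VAT

def billed_requests(tokens: int, classifier: str) -> int:
    """Number of billed requests: token count divided by request size, rounded up."""
    size = TOKENS_PER_REQUEST[classifier]
    return max(1, -(-tokens // size))  # ceiling division, at least one request

def classification_cost(tokens: int, classifier: str) -> Decimal:
    return billed_requests(tokens, classifier) * PRICE_PER_REQUEST

print(billed_requests(770, "YandexGPT Lite"))     # 1
print(billed_requests(770, "YandexGPT Pro"))      # 4
print(classification_cost(770, "YandexGPT Pro"))  # 0.005000
```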
Image generation
You are charged for each generation request in YandexART. Requests are not idempotent; therefore, two requests with the same settings and generation prompt are two separate requests.
| Service | Price, without VAT |
|---|---|
| 1 request for YandexART image generation | $0.018333 |
Agent Atelier
Using assistants and text agents
The AI Assistant API, the Responses API, and the storage of files and search indexes are free of charge. Tokens are billed according to the Model Gallery model pricing.
Using voice agents
The cost of using voice agents consists of your fees for speech recognition (input audio), speech synthesis (output audio), and text generation using the speech-realtime-250923 model.
| Service | Price per billing unit, without VAT |
|---|---|
| Incoming audio, per 1 second | $0.000217 |
| Outgoing audio, per 1 second | $0.000167 |
| Text generation, per 1,000 tokens | $0.006668 ¹ |
Cost calculation for a voice agent
Cost of using a voice agent for a 60-second session, where:
- Input audio: 60 seconds
- Output audio: 20 seconds
- Number of generated tokens: 2,000
$0.006668 × 2 + $0.000217 × 60 + $0.000167 × 20 = $0.013336 + $0.013020 + $0.003340
Total: $0.029696.
Where:
- $0.006668: Cost of processing 1,000 tokens.
- $0.006668 × 2: Cost of processing 2,000 tokens.
- $0.000217: Cost of processing 1 second of incoming audio.
- $0.000217 × 60: Cost of processing 60 seconds of incoming audio.
- $0.000167: Cost of processing 1 second of outgoing audio.
- $0.000167 × 20: Cost of processing 20 seconds of outgoing audio.
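The voice agent calculation combines three separately priced components. A minimal Python sketch, using the per-unit prices from the table above, is shown below; the function name and the returned dictionary layout are assumptions made for the example.

```python
from decimal import Decimal

# Per-unit prices from the voice agent table (USD).
INCOMING_AUDIO_PER_SECOND = Decimal("0.000217")
OUTGOING_AUDIO_PER_SECOND = Decimal("0.000167")
TEXT_PER_1K_TOKENS = Decimal("0.006668")

def voice_session_cost(input_seconds: int, output_seconds: int,
                       generated_tokens: int) -> dict:
    """Break a voice agent session down into its three cost components."""
    parts = {
        "text": TEXT_PER_1K_TOKENS * generated_tokens / 1000,
        "incoming_audio": INCOMING_AUDIO_PER_SECOND * input_seconds,
        "outgoing_audio": OUTGOING_AUDIO_PER_SECOND * output_seconds,
    }
    parts["total"] = sum(parts.values(), Decimal("0"))
    return parts

# 60 seconds of input audio, 20 seconds of output audio, 2,000 generated tokens.
for name, amount in voice_session_cost(60, 20, 2000).items():
    print(f"{name}: ${amount}")
```

Returning the components alongside the total makes it easier to see which part of a session dominates the cost.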
Using tools in agents
The AI Assistant API tools are free of charge.
The File Search tool in text and voice agents is free of charge.
The Web Search tool in text and voice agents is free of charge until November 17, 2025.
| Service | Price per 1,000 requests, without VAT |
|---|---|
| Web Search Tool | $7.5 |
MCP Hub
Note
This feature is at the Preview stage.
At the Preview stage, MCP servers are free of charge. However, you may still be charged for tools created in MCP servers, for example, for Yandex Cloud Functions function invocations.
When using external APIs, such as Kontur.Focus or amoCRM, you pay the partner directly.
Internal server errors
You are not charged for a request that fails due to an internal server error.