© 2025 Direct Cursus Technology L.L.C.

In this article:

  • What goes into the cost of using Yandex AI Studio
    • Text generation
    • Dedicated instances
    • Text classification
    • Text vectorization
    • Using assistants and text agents
    • Using voice agents
    • Image generation
    • MCP Hub
    • Internal server errors
  • Prices for the Russia region
    • Text generation
    • Text classification
    • Text vectorization
    • Using voice agents
    • Image generation
  • Examples of the YandexGPT Lite and YandexGPT Pro usage cost calculation
    • Calculating the text generation cost
    • Calculating the text vectorization cost
  • Voice agent cost calculation example

Yandex AI Studio pricing policy

Written by
Yandex Cloud
Updated at November 7, 2025

To estimate your service costs, see the pricing in this section.

The prices for service products are also available in the price list.

Note

The currency of the service prices depends on the company you have a contract with:

  • Prices in US dollars apply to customers of Iron Hive doo Beograd (Serbia) or Direct Cursus Technology L.L.C. (Dubai).
  • Prices in Russian roubles apply to customers of Yandex.Cloud LLC.

All prices below do not include VAT.

What goes into the cost of using Yandex AI Studio

In Yandex Cloud Billing, AI Studio usage is detailed in billing units. The billing unit value is different for generation, vectorization, and dedicated instances.

Text generation

Text generation cost is based on the total number of prompt and response tokens and depends on the parameters of your request to the generative model. Namely, the cost depends on the following:

  • The model that receives the request.
  • The model's operating mode.

The number of prompt and response tokens for the same text may vary depending on the model.

With models in batch mode, the minimum cost per run is 200,000 tokens.

The total number of billing units is based on the overall number of prompt and response tokens and is rounded up to an integer.
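As a rough illustration of the batch-mode minimum described above, the sketch below bills a run for at least 200,000 tokens. It assumes that "minimum cost per run" means the billed token count is never lower than 200,000; the function name and its arguments are hypothetical and are not part of any Yandex Cloud API. The sample price is the Qwen2.5 7B Instruct batch price listed on this page.

```python
from decimal import Decimal

# Minimum number of tokens billed per batch run (from this page).
BATCH_MIN_TOKENS = 200_000


def batch_run_cost(prompt_tokens: int, response_tokens: int,
                   price_per_1000_tokens: Decimal) -> Decimal:
    """Estimate the cost of one batch run (hypothetical helper).

    Assumption: a run with fewer than 200,000 tokens is billed
    as if it had used exactly 200,000 tokens.
    """
    total_tokens = prompt_tokens + response_tokens
    billed_tokens = max(total_tokens, BATCH_MIN_TOKENS)
    return price_per_1000_tokens * billed_tokens / 1000


# A small run of 50,000 tokens is billed at the 200,000-token minimum.
small_run = batch_run_cost(30_000, 20_000, Decimal("0.000834"))
# A run of 300,000 tokens is billed for its actual size.
large_run = batch_run_cost(150_000, 150_000, Decimal("0.000834"))
```

Under these assumptions, the small run costs the same as a 200,000-token run, while the large run is billed for all 300,000 tokens.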

Tokenization

Using the tokenizer (TokenizerService calls and Tokenizer methods) is free of charge.

Fine-tuned models

At the Preview stage, you can fine-tune models free of charge. Using a fine-tuned model is charged according to the base model's pricing policy. For example, the use of a fine-tuned YandexGPT Lite model is charged according to the YandexGPT Lite pricing policy.

Dedicated instances

The cost of operating a dedicated instance depends on the model and the selected configuration. Dedicated instances are charged per second of operation, rounded up to the billing unit. You are not charged for periods of hardware maintenance or model deployment.

Prices are shown for 1 hour of use. Billing occurs per second.
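To show how an hourly price translates into per-second billing, here is a minimal sketch. It simply prorates the hourly price by the number of seconds and does not model the rounding up to the billing unit mentioned above, since this page does not describe that rounding in detail. The function name is hypothetical; the sample price is the gpt-oss-20b S configuration price from the table further down this page.

```python
from decimal import Decimal

SECONDS_PER_HOUR = 3600


def dedicated_instance_cost(hourly_price: Decimal, seconds: int) -> Decimal:
    """Prorate an hourly price over the number of seconds of operation.

    Assumption: periods of hardware maintenance and model deployment
    are already excluded from `seconds`, because they are not charged.
    """
    return hourly_price * seconds / SECONDS_PER_HOUR


# 90 minutes (5,400 seconds) at $3.35 per hour.
cost = dedicated_instance_cost(Decimal("3.35"), 5_400)
```

For 90 minutes at $3.35 per hour, the prorated estimate is $5.025.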

Text classification

The cost of text classification depends on the classification model you use and the number of tokens you provide.

  • When classifying with YandexGPT Lite, a billing unit is a request of up to 1,000 tokens.
  • When classifying with YandexGPT Pro and fine-tuned classifiers, a billing unit is a request of up to 250 tokens.

A request smaller than one billing unit is rounded up to one unit. Larger texts are billed as multiple requests, with the number of requests rounded up to the next integer.

For example, classifying a text of 770 tokens with YandexGPT Lite will be billed as a single request, i.e., as one billing unit.
The same 770-token text classified with YandexGPT Pro or a fine-tuned classifier will be billed as four requests.
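The rounding in this example can be sketched as follows. The classifier labels used as dictionary keys are hypothetical names chosen for the illustration, not API identifiers; the unit sizes and the $0.001250 price per request come from this page.

```python
from decimal import Decimal

# Tokens covered by one billing unit (one request), per this page.
# The keys are illustrative labels, not real model identifiers.
TOKENS_PER_UNIT = {
    "yandexgpt-lite": 1000,
    "yandexgpt-pro": 250,
    "fine-tuned": 250,
}

# Price of one request (one billing unit), without VAT.
PRICE_PER_UNIT = Decimal("0.001250")


def classification_units(tokens: int, classifier: str) -> int:
    """Return the number of billed requests, rounding up."""
    unit_size = TOKENS_PER_UNIT[classifier]
    # Ceiling division; any request is billed as at least one unit.
    return max(1, -(-tokens // unit_size))


def classification_cost(tokens: int, classifier: str) -> Decimal:
    """Return the estimated cost of classifying one text."""
    return classification_units(tokens, classifier) * PRICE_PER_UNIT


lite_units = classification_units(770, "yandexgpt-lite")  # one request
pro_units = classification_units(770, "yandexgpt-pro")    # four requests
```

With these unit sizes, the 770-token text costs one unit with YandexGPT Lite and four units with YandexGPT Pro or a fine-tuned classifier.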

Text vectorization

The cost of text vectorization (getting text embeddings) depends on the size of the text submitted for vectorization.

Using assistants and text agents

The AI Assistant API, the Responses API, and file storage are free of charge; however, you are charged for model usage according to the text generation rules.

Note

Prices for using the Web Search tool with text and voice agents will come into effect on November 20, 2025.

| Service | Price per 1,000 requests, without VAT |
|---|---|
| Web Search Tool | $7.5 |

Using voice agents

The cost of using voice agents consists of fees for speech recognition (input audio), speech synthesis (output audio), and text generation using the speech-realtime-250923 model.

Image generation

You are charged for each generation request in YandexART. Requests are not idempotent; therefore, two requests with the same settings and generation prompt are two separate requests.

MCP Hub

Note

This feature is at the Preview stage.

At the Preview stage, MCP servers are free of charge. However, you may still be charged for tools created in MCP servers, for example, for Yandex Cloud Functions function invocations.

When using external APIs, such as Kontur.Focus or amoCRM, you pay the partner directly.

Internal server errors

You are not charged for a request that fails due to an internal server error.

Prices for the Russia region

Note

Yandex Cloud resources are priced differently in different regions. For more information about the available regions, see Regions.

Your payment currency is determined by your contracting legal entity. For more information on creating an account, see Registering an account in Yandex Cloud.

Text generation

| Number | Price, without VAT |
|---|---|
| 1,000 units | $0.001667 |

Cost of using models in synchronous and asynchronous mode

| Model | Price per 1,000 tokens, synchronous mode, without VAT | Price per 1,000 tokens, asynchronous mode, without VAT |
|---|---|---|
| YandexGPT Lite | $0.001667 | $0.000834 |
| YandexGPT Pro | $0.010002 | $0.005001 |
| YandexGPT Pro 5.1 | $0.003334 ¹ | $0.001667 ¹ |
| Qwen3 235B | $0.004168 ¹ | — |
| gpt-oss-120b | $0.002501 | — |
| gpt-oss-20b | $0.000834 | — |
| Gemma3 27B | $0.003334 ¹ | — |

¹ The price is based on the current 50% discount.

Cost of using models in batch mode

With models in batch mode, the minimum cost per run is 200,000 tokens.

| Model | Price per 1,000 tokens, batch processing mode, without VAT |
|---|---|
| Qwen2.5 7B Instruct | $0.000834 |
| Qwen2.5 72B Instruct | $0.005001 |
| QwQ 32B Instruct | $0.003334 |
| Llama-3.3-70B-Instruct | $0.005001 |
| Llama-3.1-70B-Instruct | $0.005001 |
| DeepSeek-R1-Distill-Llama-70B | $0.005001 |
| Qwen2.5 32B Instruct | $0.003334 |
| DeepSeek-R1-Distill-Qwen-32B | $0.003334 |
| phi-4 | $0.001667 |
| Qwen2 VL 7B | $0.000834 |
| Qwen2.5 VL 7B | $0.000834 |
| DeepSeek 2 VL | $0.003334 |
| DeepSeek 2 VL Tiny | $0.000834 |
| Gemma3 1B it | $0.000834 |
| Gemma3 4B it | $0.000834 |
| Gemma3 12B it | $0.001667 |
| Gemma3 27B it | $0.003334 |
| Qwen 2.5 VL 32B Instruct | $0.003334 |
| Qwen3-0.6B | $0.000834 |
| Qwen3-1.7B | $0.000834 |
| Qwen3-4B | $0.000834 |
| Qwen3-8B | $0.000834 |
| Qwen3-14B | $0.001667 |
| Qwen3-32B | $0.003334 |
| Qwen3-30B-A3B | $0.003334 |
| Qwen3-235B-A22B | $0.050010 |

Dedicated instances

Prices are shown for 1 hour of use. Billing occurs per second.

The price per 1 unit for a dedicated instance is $0.0083333 without VAT.

| Model | Price per 1 hour, S configuration, without VAT | Price per 1 hour, M configuration, without VAT | Price per 1 hour, L configuration, without VAT |
|---|---|---|---|
| Qwen 2.5 VL 32B Instruct | $6.70 | $13.40 | $20.10 |
| Qwen 2.5 72B Instruct | $6.70 | $13.40 | $20.10 |
| Gemma 3 4B it | $3.35 | $6.70 | $10.05 |
| Gemma 3 12B it | $3.35 | $6.70 | $10.05 |
| gpt-oss-20b | $3.35 | $6.70 | $10.05 |
| gpt-oss-120b | $6.70 | $13.40 | $20.10 |
| T-pro-it-2.0-FP8 | $6.20 | $12.40 | $18.60 |

Text classification

| Service | Price, without VAT |
|---|---|
| 1 request (1,000 tokens) to classifier based on YandexGPT Lite | $0.001250 |
| 1 request (250 tokens) to classifier based on YandexGPT Pro | $0.001250 |
| 1 request (250 tokens) to tuned classifier | $0.001250 |

Text vectorization

| Number | Price, without VAT |
|---|---|
| 1,000 units | $0.000083 |

| Model parameters | Number of units per token | Price per 1,000 tokens, without VAT |
|---|---|---|
| Embeddings | 1 | $0.000083 |

Using voice agents

| Service | Price per billing unit, without VAT |
|---|---|
| Incoming audio, per 1 second | $0.000216 ¹ |
| Outgoing audio, per 1 second | $0.00166 ¹ |
| Text generation, per 1,000 tokens | $0.006668 ¹ |

¹ A 50% discount is valid until November 1, 2025. The price is shown without the discount applied.

Image generation

| Service | Price, without VAT |
|---|---|
| 1 request for YandexART image generation | $0.018333 |

Examples of the YandexGPT Lite and YandexGPT Pro usage cost calculation

Calculating the text generation cost

Example 1

Cost of using YandexGPT Lite for text generation with the following properties:

  • Number of prompt tokens: 225
  • Number of response tokens: 525
  • Model: YandexGPT Lite
  • Model working mode: Synchronous

Total: ($0.001667 / 1,000 units) × 750 units = $0.001250

Example 2

Cost of using YandexGPT Pro for text generation with the following properties:

  • Number of prompt tokens: 115
  • Number of response tokens: 1,500
  • Model: YandexGPT Pro
  • Model working mode: Asynchronous

The cost is calculated as follows:

  • Number of prompt and response tokens: 115 + 1,500 = 1,615.
  • Price per 1,000 tokens for the YandexGPT Pro model, asynchronous mode: $0.005001.
  • Number of units per token for the YandexGPT Pro model, asynchronous mode: 3.
  • Total number of units in usage details: 1,615 × 3 = 4,845.

Total: ($0.005001 / 1,000 tokens) × 1,615 tokens = $0.008077.

Example 3

Cost of using a YandexGPT Pro model fine-tuned in DataSphere for text generation with the following properties:

  • Number of prompt tokens: 1,020
  • Number of response tokens: 30
  • Model: YandexGPT Pro fine-tuned in DataSphere
  • Model working mode: Synchronous

The cost is calculated as follows:

  • Number of tokens in prompt and response: 1,020 + 30 = 1,050.
  • Price per 1,000 tokens for the model fine-tuned in DataSphere, in synchronous mode: $0.010002.
  • Number of units per token for a model fine-tuned in DataSphere, synchronous mode: 6.
  • Total number of units in usage details: 1,050 × 6 = 6,300.

Total: ($0.001667 / 1,000 units) × 6,300 units = $0.010502 or ($0.010002 / 1,000 tokens) × 1,050 tokens = $0.010502.
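The three examples above follow the same pattern: add up the tokens, multiply by the number of units per token, and apply the unit price. Here is a minimal sketch of that calculation. The dictionary keys are illustrative labels rather than API model identifiers, and the function name is hypothetical. The ratio of 1 unit per token for YandexGPT Lite in synchronous mode is inferred from Example 1; the ratios of 3 and 6 are stated in Examples 2 and 3.

```python
from decimal import Decimal, ROUND_HALF_UP

# Price per 1,000 billing units, without VAT (from this page).
UNIT_PRICE_PER_1000 = Decimal("0.001667")

# Units per token; keys are illustrative labels, not API identifiers.
UNITS_PER_TOKEN = {
    ("yandexgpt-lite", "sync"): 1,   # inferred from Example 1
    ("yandexgpt-pro", "async"): 3,   # stated in Example 2
    ("yandexgpt-pro", "sync"): 6,    # stated in Example 3
}


def generation_cost(prompt_tokens: int, response_tokens: int,
                    model: str, mode: str) -> tuple[int, Decimal]:
    """Return (billing units, estimated cost rounded to 6 decimals)."""
    tokens = prompt_tokens + response_tokens
    units = tokens * UNITS_PER_TOKEN[(model, mode)]
    cost = UNIT_PRICE_PER_1000 * units / 1000
    return units, cost.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)


example_1 = generation_cost(225, 525, "yandexgpt-lite", "sync")
example_2 = generation_cost(115, 1_500, "yandexgpt-pro", "async")
example_3 = generation_cost(1_020, 30, "yandexgpt-pro", "sync")
```

Under these assumptions, the sketch reproduces the totals of Examples 1 to 3: $0.001250, $0.008077, and $0.010502.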

Calculating the text vectorization cost

Cost of using Yandex AI Studio for text vectorization with the following property:

  • Number of tokens per request: 2,000

2,000 × ($0.000083 / 1,000) = $0.000166

Total: $0.000166.

Where:

  • $0.000083: Cost of processing 1,000 tokens.
  • $0.000083 / 1,000: Cost of processing one token.
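The same vectorization calculation can be written as a short sketch. The function name is hypothetical; the price per 1,000 tokens is the one used in the example above.

```python
from decimal import Decimal

# Price per 1,000 tokens for text vectorization, without VAT.
EMBEDDING_PRICE_PER_1000 = Decimal("0.000083")


def vectorization_cost(tokens: int) -> Decimal:
    """Estimate the cost of vectorizing a text of `tokens` tokens."""
    return EMBEDDING_PRICE_PER_1000 * tokens / 1000


cost = vectorization_cost(2_000)
```

For a 2,000-token request, this gives the same $0.000166 as the manual calculation.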

Voice agent cost calculation example

Cost of using the speech-realtime-250923 voice agent within a 60-second session:

  • Input audio: 60 seconds
  • Output audio: 20 seconds
  • Number of tokens: 2,000

$0.006668 × 2 + $0.000216 × 60 + $0.00166 × 20 = $0.013336 + $0.012960 + $0.033200

Total: $0.059496.

Where:

  • $0.006668: Cost of processing 1,000 tokens.
  • $0.006668 × 2: Cost of processing 2,000 tokens.
  • $0.000216: Cost of processing one second of incoming audio.
  • $0.000216 × 60: Cost of processing 60 seconds of incoming audio.
  • $0.00166: Cost of processing one second of outgoing audio.
  • $0.00166 × 20: Cost of processing 20 seconds of outgoing audio.
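The breakdown above can be sketched as a small function that sums the three components. The function and constant names are hypothetical; the prices are the ones used in this example, shown without the discount.

```python
from decimal import Decimal

# Prices without VAT and without the discount, as used in the example.
PRICE_INPUT_AUDIO_PER_SEC = Decimal("0.000216")
PRICE_OUTPUT_AUDIO_PER_SEC = Decimal("0.00166")
PRICE_PER_1000_TOKENS = Decimal("0.006668")


def voice_session_cost(input_seconds: int, output_seconds: int,
                       tokens: int) -> Decimal:
    """Sum the audio and text generation costs of one voice session."""
    audio_in = PRICE_INPUT_AUDIO_PER_SEC * input_seconds
    audio_out = PRICE_OUTPUT_AUDIO_PER_SEC * output_seconds
    text = PRICE_PER_1000_TOKENS * tokens / 1000
    return audio_in + audio_out + text


# 60 s of input audio, 20 s of output audio, 2,000 tokens.
total = voice_session_cost(60, 20, 2_000)
```

With 60 seconds of input audio, 20 seconds of output audio, and 2,000 tokens, the sum matches the $0.059496 total above.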
