Yandex project
© 2025 Yandex.Cloud LLC

In this article:

  • What goes into the cost of using Yandex Foundation Models
  • Billing unit
  • Text generation
  • Text classification
  • Text vectorization
  • Work of assistants
  • Image generation
  • Internal server errors
  • Prices for the Russia region
  • Text generation
  • Text classification
  • Text vectorization
  • Image generation
  • Examples of YandexGPT Lite and YandexGPT Pro usage cost calculation
  • Calculating text generation cost
  • Calculating text vectorization cost

Yandex Foundation Models pricing policy

Written by Yandex Cloud
Updated on May 5, 2025

To estimate your service costs, see the pricing in this section.

Prices for service products are also available in the Price list.

In the management console, new users without a billing account have access to models for testing:

  • YandexGPT Lite and YandexGPT Pro: 10 free requests per hour.
  • YandexART: 10 free requests per day.

What goes into the cost of using Yandex Foundation Models

Billing unit

Foundation Models usage is itemized in billing units. The cost of a billing unit differs between text generation and vectorization.

Text generation

The cost of text generation is based on the total number of prompt and response tokens and on two parameters of your request to the generative model:

  • The model that receives the request.
  • The model's operating mode.

The number of prompt and response tokens for the same text may vary depending on the model.

When using models in batch processing mode, there is a minimum launch cost of 200,000 tokens.

The total number of billing units is based on the overall number of prompt and response tokens and is rounded up to a whole number.

Tokenization

Using the tokenizer (TokenizerService calls and Tokenizer methods) is free of charge.

Fine-tuned models

At the Preview stage, you can fine-tune models free of charge. The use of fine-tuned models is charged according to the base model's pricing policy:

  • The use of models fine-tuned in Yandex DataSphere is charged according to the YandexGPT Pro policy.
  • The use of a fine-tuned YandexGPT Lite model is charged according to the YandexGPT Lite policy.
  • The use of a fine-tuned Llama 8B model is charged according to the Llama 8B policy.

Text classification

The cost of text classification depends on the classification model you use and the number of tokens you provide.

  • When classifying with YandexGPT Lite, a billing unit is a request of up to 1,000 tokens.
  • When classifying with YandexGPT Pro and fine-tuned classifiers, a billing unit is a request of up to 250 tokens.

A request smaller than one billing unit is billed as one full unit. A larger text is billed as multiple requests, with the count rounded up to the next integer.

For example, classifying a text of 770 tokens with YandexGPT Lite will be billed as a single request, i.e., as one billing unit.
The same 770-token text classified with YandexGPT Pro or a fine-tuned classifier will be billed as four requests.
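
The rounding rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an official calculator; the dictionary keys are hypothetical labels chosen for this example and are not model identifiers from the API.

```python
import math

# Tokens covered by one billing unit, as stated in the rules above.
# The keys are hypothetical labels used only in this sketch.
TOKENS_PER_UNIT = {
    "yandexgpt-lite": 1000,
    "yandexgpt-pro": 250,
    "fine-tuned": 250,
}


def classification_units(tokens: int, classifier: str) -> int:
    """Return the number of billing units for one classification request."""
    if tokens <= 0:
        raise ValueError("tokens must be a positive integer")
    # Any partial unit is rounded up to a whole unit.
    return math.ceil(tokens / TOKENS_PER_UNIT[classifier])


print(classification_units(770, "yandexgpt-lite"))  # 1
print(classification_units(770, "yandexgpt-pro"))   # 4
```

The two calls reproduce the 770-token example: one unit with YandexGPT Lite and four units with YandexGPT Pro or a fine-tuned classifier.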

Text vectorization

The cost of text vectorization (getting text embeddings) depends on the size of the text submitted for vectorization.

Work of assistants

At the Preview stage, you can use AI Assistant API and store files free of charge; however, you will be charged for models according to the text generation rules.

Image generation

You are charged for each generation request in YandexART. The requests are not idempotent; therefore, two requests with the same settings and generation prompt are billed as two separate requests.

Internal server errors

You are not charged for a request that fails due to an internal server error.

Prices for the Russia region

Note

Prices for Yandex Cloud resources vary based on the region. For more information about the available regions, see Regions.

The currency you can use to pay for the resources depends on the legal entity you have an agreement with. For more information on creating an account, see Registering an account in Yandex Cloud.

Text generation

Amount       Price in RUB, including VAT  Price in KZT, including VAT
1,000 units  ₽0.20                        ₸1.00

Cost of using models in synchronous and asynchronous mode

Cost per 1,000 tokens, including VAT:

Model                           Synchronous, RUB  Asynchronous, RUB  Synchronous, KZT  Asynchronous, KZT
YandexGPT Lite                  ₽0.20             ₽0.10              ₸1.00             ₸0.50
YandexGPT Pro                   ₽1.20             ₽0.60              ₸6.00             ₸3.00
Model fine-tuned in DataSphere  ₽1.20             ₽0.60              ₸6.00             ₸3.00
Llama 8B                        ₽0.20             ₽0.10              ₸1.00             ₸0.50
Llama 70B                       ₽1.20             ₽0.60              ₸6.00             ₸3.00

Cost of using models in batch processing mode

When using models in batch processing mode, there is a minimum launch cost of 200,000 tokens.

Cost per 1,000 tokens in batch processing mode, including VAT:

Model                          Price in RUB  Price in KZT
Qwen2.5 7B Instruct            ₽0.10         ₸0.50
Qwen2.5 72B Instruct           ₽0.60         ₸3.00
QwQ 32B Instruct               ₽0.40         ₸2.00
Llama-3.3-70B-Instruct         ₽0.60         ₸3.00
Llama-3.1-70B-Instruct         ₽0.60         ₸3.00
DeepSeek-R1-Distill-Llama-70B  ₽0.60         ₸3.00
Qwen2.5 32B Instruct           ₽0.40         ₸2.00
DeepSeek-R1-Distill-Qwen-32B   ₽0.40         ₸2.00
phi-4                          ₽0.20         ₸1.00
Qwen2 VL 7B                    ₽0.10         ₸0.50
Qwen2.5 VL 7B                  ₽0.10         ₸0.50
DeepSeek 2 VL                  ₽0.40         ₸2.00
DeepSeek 2 VL Tiny             ₽0.10         ₸0.50
Gemma3 1B it                   ₽0.10         ₸0.50
Gemma3 4B it                   ₽0.10         ₸0.50
Gemma3 12B it                  ₽0.20         ₸1.00
Gemma3 27B it                  ₽0.40         ₸2.00
Qwen 2.5 VL 32B Instruct       ₽0.40         ₸2.00
Qwen3-0.6B                     ₽0.10         ₸0.50
Qwen3-1.7B                     ₽0.10         ₸0.50
Qwen3-4B                       ₽0.10         ₸0.50
Qwen3-8B                       ₽0.10         ₸0.50
Qwen3-14B                      ₽0.20         ₸1.00
Qwen3-32B                      ₽0.40         ₸2.00
Qwen3-30B-A3B                  ₽0.40         ₸2.00
Qwen3-235B-A22B                ₽6.00         ₸30.00
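
The effect of the 200,000-token minimum on a batch run can be sketched as follows. This assumes the minimum means that each launch is billed for at least 200,000 tokens; the page does not spell out the exact mechanics, so treat this reading, and the function name `batch_cost`, as assumptions made for illustration.

```python
from decimal import Decimal

# Minimum number of tokens billed per batch launch (from the rule above).
MIN_BATCH_TOKENS = 200_000


def batch_cost(tokens: int, price_per_1000: Decimal) -> Decimal:
    """Estimate the cost of one batch launch.

    Assumption: a launch that processes fewer than 200,000 tokens
    is billed as if it had processed exactly 200,000.
    """
    billable_tokens = max(tokens, MIN_BATCH_TOKENS)
    return price_per_1000 * billable_tokens / 1000


# Qwen2.5 7B Instruct at ₽0.10 per 1,000 tokens, 50,000 tokens processed:
# billed for 200,000 tokens under the assumption above.
print(batch_cost(50_000, Decimal("0.10")))

# Qwen3-235B-A22B at ₽6.00 per 1,000 tokens, 300,000 tokens processed:
# above the minimum, so billed for the actual token count.
print(batch_cost(300_000, Decimal("6.00")))
```

Under this reading, the smallest possible charge for a launch is 200 times the model's price per 1,000 tokens, for example ₽20.00 for Qwen2.5 7B Instruct.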

Text classification

Service                                                         Cost in RUB, including VAT  Cost in KZT, including VAT
1 request (1,000 tokens) to classifier based on YandexGPT Lite  ₽0.15                       ₸0.75
1 request (250 tokens) to classifier based on YandexGPT Pro     ₽0.15                       ₸0.75
1 request (250 tokens) to tuned classifier                      ₽0.15                       ₸0.75

Text vectorization

Amount       Cost in RUB, including VAT  Cost in KZT, including VAT
1,000 units  ₽0.01                       ₸0.05

Cost of processing 1,000 tokens, including VAT:

Model parameters         Number of units per token  Cost in RUB  Cost in KZT
Getting text embeddings  1                          ₽0.01        ₸0.05

Image generation

Service                                   Cost in RUB, including VAT  Cost in KZT, including VAT
1 request for YandexART image generation  ₽2.20                       ₸11.00

Examples of YandexGPT Lite and YandexGPT Pro usage cost calculation

Calculating text generation cost

Example 1

Cost of using YandexGPT Lite for text generation with the following parameters:

  • Number of prompt tokens: 225
  • Number of response tokens: 525
  • Model: YandexGPT Lite
  • Model working mode: Synchronous

Calculating cost in RUB:

  • Number of prompt and response tokens: 225 + 525 = 750
  • Number of units per token for the YandexGPT Lite model, synchronous mode: 1
  • Total number of units in usage details: 750

Total: (₽0.20 / 1,000 units) × 750 units = ₽0.15.

Calculating cost in KZT:

  • Number of prompt and response tokens: 225 + 525 = 750
  • Number of units per token for the YandexGPT Lite model, synchronous mode: 1
  • Total number of units in usage details: 750

Total: (₸1.00 / 1,000 units) × 750 units = ₸0.75.

Example 2

Cost of using YandexGPT Pro for text generation with the following parameters:

  • Number of prompt tokens: 115
  • Number of response tokens: 1,500
  • Model: YandexGPT Pro
  • Model working mode: Asynchronous

Calculating cost in RUB:

  • Number of prompt and response tokens: 115 + 1,500 = 1,615
  • Price per 1,000 tokens for the YandexGPT Pro model, asynchronous mode: ₽0.60
  • Number of units per token for the YandexGPT Pro model, asynchronous mode: 3
  • Total number of units in usage details: 1,615 × 3 = 4,845

Total: (₽0.60 / 1,000 tokens) × 1,615 tokens = ₽0.969, rounded to ₽0.97.

Calculating cost in KZT:

  • Number of prompt and response tokens: 115 + 1,500 = 1,615
  • Price per 1,000 tokens for the YandexGPT Pro model, asynchronous mode: ₸3.00
  • Number of units per token for the YandexGPT Pro model, asynchronous mode: 3
  • Total number of units in usage details: 1,615 × 3 = 4,845

Total: (₸3.00 / 1,000 tokens) × 1,615 tokens = ₸4.845, rounded to ₸4.85.

Example 3

Cost of using a YandexGPT Pro model fine-tuned in DataSphere for text generation with the following parameters:

  • Number of prompt tokens: 1,020
  • Number of response tokens: 30
  • Model: YandexGPT Pro fine-tuned in DataSphere
  • Model working mode: Synchronous

Calculating cost in RUB:

  • Number of prompt and response tokens: 1,020 + 30 = 1,050
  • Price per 1,000 tokens for the model fine-tuned in DataSphere, synchronous mode: ₽1.20
  • Number of units per token for the model fine-tuned in DataSphere, synchronous mode: 6
  • Total number of units in usage details: 1,050 × 6 = 6,300

Total: (₽0.20 / 1,000 units) × 6,300 units = ₽1.26 or (₽1.20 / 1,000 tokens) × 1,050 tokens = ₽1.26.

Calculating cost in KZT:

  • Number of prompt and response tokens: 1,020 + 30 = 1,050
  • Price per 1,000 tokens for the model fine-tuned in DataSphere, synchronous mode: ₸6.00
  • Number of units per token for the model fine-tuned in DataSphere, synchronous mode: 6
  • Total number of units in usage details: 1,050 × 6 = 6,300

Total: (₸1.00 / 1,000 units) × 6,300 units = ₸6.30 or (₸6.00 / 1,000 tokens) × 1,050 tokens = ₸6.30.
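
The three examples above follow the same arithmetic, which can be sketched in Python. This is an illustrative estimate only: the `PRICES_RUB` table copies the RUB prices per 1,000 tokens listed on this page, the function name `generation_cost` is hypothetical, and rounding half-up to two decimal places is an assumption (the examples show rounded totals but do not name the rounding mode).

```python
from decimal import Decimal, ROUND_HALF_UP

# Price per 1,000 tokens in RUB, including VAT: (synchronous, asynchronous).
# Values copied from the price table on this page.
PRICES_RUB = {
    "YandexGPT Lite": (Decimal("0.20"), Decimal("0.10")),
    "YandexGPT Pro": (Decimal("1.20"), Decimal("0.60")),
    "Model fine-tuned in DataSphere": (Decimal("1.20"), Decimal("0.60")),
}


def generation_cost(model: str, prompt_tokens: int, response_tokens: int,
                    asynchronous: bool = False) -> Decimal:
    """Estimate the RUB cost of one text generation request."""
    price = PRICES_RUB[model][1 if asynchronous else 0]
    total_tokens = prompt_tokens + response_tokens
    cost = price * total_tokens / 1000
    # Assumption: totals are rounded half-up to two decimal places.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(generation_cost("YandexGPT Lite", 225, 525))                        # 0.15
print(generation_cost("YandexGPT Pro", 115, 1500, asynchronous=True))     # 0.97
print(generation_cost("Model fine-tuned in DataSphere", 1020, 30))        # 1.26
```

`Decimal` is used instead of `float` so that small amounts such as ₽0.969 round predictably.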

Calculating text vectorization cost

Cost of using Yandex Foundation Models for text vectorization with the following parameter:

  • Number of tokens in the request: 2,000

Calculating cost in RUB:

  • ₽0.01: Cost for processing 1,000 tokens
  • ₽0.01 / 1,000: Cost for processing one token

Total: 2,000 × (₽0.01 / 1,000) = ₽0.02.

Calculating cost in KZT:

  • ₸0.05: Cost for processing 1,000 tokens
  • ₸0.05 / 1,000: Cost for processing one token

Total: 2,000 × (₸0.05 / 1,000) = ₸0.10.
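
The same calculation as a short Python sketch, assuming one billing unit per token as stated in the vectorization price table. The function name `embedding_cost` is hypothetical and used only for illustration.

```python
from decimal import Decimal


def embedding_cost(tokens: int, price_per_1000: Decimal) -> Decimal:
    """Estimate the cost of vectorizing a text of the given token count."""
    # One billing unit per token, so the cost scales linearly with tokens.
    return price_per_1000 * tokens / 1000


print(embedding_cost(2000, Decimal("0.01")))  # cost in RUB
print(embedding_cost(2000, Decimal("0.05")))  # cost in KZT
```

For the 2,000-token request above, this gives ₽0.02 and ₸0.10.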
