Yandex Cloud
Search
Contact UsGet started
  • Blog
  • Pricing
  • Documentation
  • All Services
  • System Status
    • Featured
    • Infrastructure & Network
    • Data Platform
    • Containers
    • Developer tools
    • Serverless
    • Security
    • Monitoring & Resources
    • ML & AI
    • Business tools
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Customer Stories
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Education and Science
    • Yandex Cloud Partner program
  • Blog
  • Pricing
  • Documentation
© 2025 Direct Cursus Technology L.L.C.
Yandex Foundation Models
    • About Yandex Foundation Models
      • Overview
      • Models
      • Tokens
      • Function calling
      • Reasoning mode
    • Embeddings
    • Datasets
    • Fine-tuning
    • Quotas and limits
  • Yandex Cloud ML SDK
  • Compatibility with OpenAI
  • Access management
  • Pricing policy
  • Public materials
  • Release notes

In this article:

  • Generation models
  • Model lifecycle
  • Accessing models
  1. Concepts
  2. Text generation
  3. Models

Text generation models

Written by
Yandex Cloud
Updated at April 21, 2025
  • Generation models
  • Model lifecycle
  • Accessing models

Yandex Foundation Models provides access to large text models from different vendors. If an out-of-the-box model is not enough, you can fine-tune some models to respond to your requests more accurately.

Generation modelsGeneration models

All basic models are subject to the update rules described in Model lifecycle. When updating models, generations available in different branches (/latest, /rc, and /deprecated segments) may change.

URI

Generation

Operating modes

YandexGPT Lite

gpt://<folder_ID>/yandexgpt-lite/deprecated
gpt://<folder_ID>/yandexgpt-lite/latest
gpt://<folder_ID>/yandexgpt-lite/rc

4
4
5

Asynchronous, synchronous

YandexGPT Pro

gpt://<folder_ID>/yandexgpt/deprecated
gpt://<folder_ID>/yandexgpt/latest
gpt://<folder_ID>/yandexgpt/rc

4
4
5

Asynchronous, synchronous

YandexGPT Pro 32k

gpt://<folder_ID>/yandexgpt-32k/deprecated
gpt://<folder_ID>/yandexgpt-32k/latest
gpt://<folder_ID>/yandexgpt/rc

4
4
5

Synchronous1

Llama 8B2

gpt://<folder_ID>/llama-lite/deprecated
gpt://<folder_ID>/llama-lite/latest
gpt://<folder_ID>/llama-lite/rc

3.1
3.1
3.1

Asynchronous, synchronous

Llama 70B2

gpt://<folder_ID>/llama/deprecated
gpt://<folder_ID>/llama/latest
gpt://<folder_ID>/llama/rc

3.3
3.3
3.3

Asynchronous, synchronous

Fine-tuned models

gpt://<basic_model_URI>/<version>@<tuning_suffix>

Depends on the basic model

Asynchronous, synchronous

Model fine-tuned in Yandex DataSphere

ds://<folder_ID>/<fine-tuning_ID>

3

Asynchronous, synchronous

Modified models share usage quotas with their basic models.

1 YandexGPT Pro 32k features an expanded context and is designed specifically to handle large texts in synchronous mode. In asynchronous mode, the YandexGPT Pro model supports the same amount of context.

2 Llama was created by Meta. Meta is designated as an extremist organization and its activities are prohibited in Russia.

Model lifecycleModel lifecycle

Each model has certain lifecycle characteristics, such as the model name, branch, and release date. These characteristics allow you to precisely identify the model version. Below, you can see our rules for updating models. Refer to these rules to adjust your solutions to a new version as apporpriate.

For each model, there are three branches (in the order from the oldest to the newest one): Deprecated, Latest, and Release Candidate (RC). Each of the branches is subject to the SLA.

The RC branch is updated as the new model is ready and may change at any time. When a model in the RC branch is ready for general use, we announce the upcoming release both in the release notes and our Telegram community.

One month after the announcement, the RC version becomes the Latest one, and the Latest version is moved to the Deprecated branch. We continue the support of the Deprecated version for one more month, after which models in the Deprecated and Latest branches become identical.

Accessing modelsAccessing models

You can access text generation models of different versions in a number of ways.

SDK
API

When operating text generation models via Yandex Cloud ML SDK, use one of the following formats:

  • Model name, provided as a string. Only the Latest versions are available.

    model = (
      sdk.models.completions("yandexgpt")
    )
    
  • Model name and version, provided as strings in the model_name and model_version fields, respectively.

    model = (
      sdk.models.completions(model_name="yandexgpt-32k", model_version="rc")
    )
    

    The above example explicitly specifies the Release Candidate version of the YandexGPT Pro 32k model.

  • Model URI, provided as a string containing the full URI of the required model version. You can also use this method to access fine-tuned models.

    model = (
      sdk.models.completions("gpt://b1gt6g8ht345********/llama/deprecated")
    )
    

    The above example explicitly specifies the Deprecated version of the Llama 70B model.

To access a model via the REST API or gRPC API, specify the model's URI containing the folder ID in the modelUri field of the request body. The /latest, /rc, and /deprecated segments indicate the model version. /latest is used by default.

Examples:

  • Accessing the Latest version of the YandexGPT Lite model:

    {
      "modelUri": "gpt://b1gt6g8ht345********/yandexgpt-lite/latest"
      ...
    }
    

    To access the Latest versions, you do not need to specify the model version explicitly because Latest is used by default.

    For example, this URI will allow you to access the Latest version of the YandexGPT Lite model: gpt://<folder_ID>/yandexgpt-lite.

  • Accessing the RC version of the Llama 70B model:

    {
      "modelUri": "gpt://b1gt6g8ht345********/llama/rc"
      ...
    }
    

See alsoSee also

  • Sending a request in prompt mode
  • Sending an asynchronous request

Was the article helpful?

Previous
Overview
Next
Tokens
© 2025 Direct Cursus Technology L.L.C.