Text generation models
Date: 2025-09-23
Yandex AI Studio provides access to large text models from different vendors. If an out-of-the-box model is not enough, you can fine-tune some models to respond to your requests more accurately.
Models available in common instance
All basic models are subject to the update rules described in Model lifecycle. When models are updated, the generations available in different branches (the /latest, /rc, and /deprecated segments) may change. Modified models share usage quotas with their basic models.
| Model and URI | Generation | Context | Operating modes |
|---|---|---|---|
| YandexGPT Lite | Deprecated 5, Latest 5, RC 5 | 32,000 | Asynchronous, synchronous |
| YandexGPT Pro | Deprecated 5, Latest 5, RC 5.1 | 32,000 | Asynchronous, synchronous |
| Llama 8B¹ | Deprecated 3.1, Latest 3.1, RC 3.1 | 8,192 | Asynchronous, synchronous |
| Llama 70B¹ | Deprecated 3.3, Latest 3.3, RC 3.3 | 8,192 | Asynchronous, synchronous |
| Qwen3 235B | — | 256,000 | |
| gpt-oss-120b | — | 128,000 | |
| gpt-oss-20b | — | 128,000 | |
| Fine-tuned models | Depends on the basic model | Depends on the basic model | Asynchronous, synchronous |
| Gemma3 27B | — | 128,000 | |
| YandexART | — | — | Asynchronous |

¹ Llama was created by Meta. Meta is designated as an extremist organization and its activities are prohibited in Russia.
The Gemma 3 27B model is designed to process Base64-encoded images of any aspect ratio. An adaptive algorithm scales images up to 896 pixels on the largest side, ensuring that important visual details are preserved. Each image requires 256 tokens for processing.
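As a rough illustration of the image handling described above, the sketch below Base64-encodes raw image bytes and estimates the token cost of the images at 256 tokens each. The helper names `encode_image` and `image_token_cost` are hypothetical and not part of any Yandex SDK; how the encoded string is placed into a request is not covered here.

```python
import base64

TOKENS_PER_IMAGE = 256  # per-image cost stated above for Gemma 3 27B


def encode_image(image_bytes: bytes) -> str:
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(image_bytes).decode("ascii")


def image_token_cost(image_count: int) -> int:
    """Estimate tokens consumed by images alone (text tokens are not included)."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return image_count * TOKENS_PER_IMAGE


if __name__ == "__main__":
    # Placeholder bytes for demonstration; in practice, read a real image file.
    print(encode_image(b"abc"))
    print(image_token_cost(3))
```

Because the cost is fixed per image, the remaining context budget for text can be estimated by subtracting `image_token_cost(n)` from the model's context size.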
Model lifecycle
Each model has certain lifecycle characteristics, such as the model name, branch, and release date. These characteristics allow you to precisely identify the model version. Below, you can see our rules for updating models. Refer to these rules to adjust your solutions to a new version as appropriate.
For each model, there are three branches (in order from oldest to newest): Deprecated, Latest, and Release Candidate (RC). Each branch is subject to the SLA.
The RC branch is updated as soon as a new model is ready and may change at any time. When a model in the RC branch is ready for general use, we announce the upcoming release both in the release notes and in our Telegram community.
One month after the announcement, the RC version becomes the Latest one, and the Latest version is moved to the Deprecated branch. We continue to support the Deprecated version for one more month, after which the models in the Deprecated and Latest branches become identical.
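The timeline above can be sketched as simple date arithmetic. This is only an illustration: it assumes that "one month" means one calendar month (clamped to the last day of shorter months), which the text does not specify, and the function names `add_months` and `lifecycle_dates` are hypothetical.

```python
import calendar
from datetime import date


def add_months(day: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    index = day.month - 1 + months
    year = day.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def lifecycle_dates(announced: date) -> dict:
    """Project the two milestones that follow a release announcement."""
    # One month after the announcement: RC becomes Latest, Latest becomes Deprecated.
    promotion = add_months(announced, 1)
    # One more month later: Deprecated and Latest become identical.
    deprecated_matches_latest = add_months(promotion, 1)
    return {
        "rc_becomes_latest": promotion,
        "deprecated_matches_latest": deprecated_matches_latest,
    }


if __name__ == "__main__":
    for name, when in lifecycle_dates(date(2025, 9, 23)).items():
        print(name, when.isoformat())
```

A schedule like this can help plan when to retest a solution pinned to the Deprecated branch; the actual dates should always be taken from the release announcement.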
Models available in batch mode
Text generation models
| Model | URI | Context |
|---|---|---|
| Qwen2.5 7B Instruct | | 32,768 |
| Qwen2.5 72B Instruct | | 16,384 |
| QwQ 32B Instruct | | 32,768 |
| Llama-3.3-70B-Instruct² | | 8,192 |
| Llama-3.1-70B-Instruct² | | 8,192 |
| DeepSeek-R1-Distill-Llama-70B | | 8,192 |
| Qwen2.5 32B Instruct | | 32,768 |
| DeepSeek-R1-Distill-Qwen-32B | | 32,768 |
| phi-4 | | 16,384 |
| Gemma3 1B it | | 32,768 |
| Gemma3 4B it | | 131,072 |
| Gemma3 12B it | | 65,536 |
| Gemma3 27B it | | 32,768 |
| Qwen3-0.6B | | 32,768 |
| Qwen3-1.7B | | 32,768 |
| Qwen3-4B | | 32,768 |
| Qwen3-8B | | 32,768 |
| Qwen3-14B | | 32,768 |
| Qwen3-32B | | 32,768 |
| Qwen3-30B-A3B | | 32,768 |
| Qwen3-235B-A22B | | 32,768 |
Multimodal models
| Model | URI | Context |
|---|---|---|
| Qwen2 VL 7B (Model card, Apache 2.0 license) | gpt://<folder_ID>/qwen2-vl-7b-instruct/ | 4,096 |
| Qwen2.5 VL 7B (Model card, Apache 2.0 license) | gpt://<folder_ID>/qwen2.5-vl-7b-instruct/ | 4,096 |
| Qwen 2.5 VL 32B Instruct (Model card, Apache 2.0 license) | gpt://<folder_ID>/qwen2.5-vl-32b-instruct/ | 4,096 |
| DeepSeek 2 VL (Model card, DeepSeek license) | gpt://<folder_ID>/deepseek-vl2/ | 4,096 |
| DeepSeek 2 VL Tiny (Model card, DeepSeek license) | gpt://<folder_ID>/deepseek-vl2-tiny/ | 4,096 |
| Gemma3 4B it (Model card, Gemma Terms of Use) | gpt://<folder_ID>/gemma-3-4b-it/ | 4,096 |
| Gemma3 12B it (Model card, Gemma Terms of Use) | gpt://<folder_ID>/gemma-3-12b-it/ | 4,096 |
| Gemma3 27B it (Model card, Gemma Terms of Use) | gpt://<folder_ID>/gemma-3-27b-it/ | 4,096 |
Accessing models
You can access text generation models of different versions in a number of ways.
When operating text generation models via Yandex Cloud ML SDK, use one of the following formats:
- Model name, provided as a string. Only the Latest versions are available.

    # Text generation
    model = sdk.models.completions("yandexgpt")
    # Image generation
    model = sdk.models.image_generation("yandex-art")
- Model name and version, provided as strings in the model_name and model_version fields, respectively.

    # Text generation
    model = sdk.models.completions(model_name="yandexgpt-lite", model_version="rc")
    # Image generation
    model = sdk.models.image_generation(model_name="yandex-art", model_version="latest")

  The above example explicitly specifies the Release Candidate version of the YandexGPT Lite model and the Latest version of the YandexART model.
- Model URI, provided as a string containing the full URI of the required model version. You can also use this method to access fine-tuned models.

    # Text generation
    model = sdk.models.completions("gpt://b1gt6g8ht345********/llama/deprecated")
    # Image generation
    model = sdk.models.image_generation("art://b1gt6g8ht345********/yandex-art/latest")

  The above example explicitly specifies the Deprecated version of the Llama 70B model and the Latest version of the YandexART model.
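The URI examples above follow a visible pattern: a scheme, a folder ID, a model name, and a version segment. A minimal sketch of assembling such a URI is shown below. The function `build_model_uri` is a hypothetical helper, not part of Yandex Cloud ML SDK, and it only covers the URI shapes shown in this section; URIs of fine-tuned models may look different.

```python
VALID_VERSIONS = {"latest", "rc", "deprecated"}


def build_model_uri(
    folder_id: str,
    model_name: str,
    model_version: str = "latest",
    scheme: str = "gpt",
) -> str:
    """Assemble '<scheme>://<folder_id>/<model_name>/<model_version>'."""
    if model_version not in VALID_VERSIONS:
        raise ValueError(f"unknown model version: {model_version!r}")
    if not folder_id or not model_name:
        raise ValueError("folder_id and model_name must be non-empty")
    return f"{scheme}://{folder_id}/{model_name}/{model_version}"


if __name__ == "__main__":
    # The folder ID is the masked placeholder used in the examples above.
    print(build_model_uri("b1gt6g8ht345********", "llama", "deprecated"))
    print(build_model_uri("b1gt6g8ht345********", "yandex-art", scheme="art"))
```

Validating the version segment up front keeps a typo such as "lastest" from reaching the service as part of an otherwise well-formed URI.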
To access a model via the REST API or gRPC API, specify the model's URI containing the folder ID in the modelUri field of the request body. The /latest, /rc, and /deprecated segments indicate the model version. /latest is used by default.
Examples:
- Accessing the Latest versions of the YandexGPT Lite and YandexART models:

    {
      "modelUri": "gpt://b1gt6g8ht345********/yandexgpt-lite/latest"
      ...
      "modelUri": "art://b1gt6g8ht345********/yandex-art/latest"
    }

  To access the Latest versions, you do not need to specify the model version explicitly because Latest is used by default.
- Accessing the RC version of the Llama 70B model:

    {
      "modelUri": "gpt://b1gt6g8ht345********/llama/rc"
      ...
    }
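To make the default-version rule concrete, the sketch below splits a model URI into its parts and falls back to latest when the version segment is omitted. `ModelRef` and `parse_model_uri` are hypothetical names introduced for illustration; the parser assumes the `<scheme>://<folder_ID>/<model>[/<version>]` shape seen in the examples above and is not the service's own URI handling.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    scheme: str
    folder_id: str
    model: str
    version: str


def parse_model_uri(uri: str) -> ModelRef:
    """Split a model URI; a missing version segment is treated as 'latest'."""
    scheme, sep, rest = uri.partition("://")
    if not sep or not scheme:
        raise ValueError(f"not a model URI: {uri!r}")
    # Drop empty parts so a trailing slash does not count as a segment.
    parts = [part for part in rest.split("/") if part]
    if len(parts) == 2:
        folder_id, model = parts
        version = "latest"
    elif len(parts) == 3:
        folder_id, model, version = parts
    else:
        raise ValueError(f"unexpected number of segments in {uri!r}")
    return ModelRef(scheme, folder_id, model, version)


if __name__ == "__main__":
    print(parse_model_uri("gpt://b1gt6g8ht345********/llama/rc"))
    print(parse_model_uri("gpt://b1gt6g8ht345********/yandexgpt-lite"))
```

A parser like this can be useful for logging which branch a request actually targets, since a URI without a version segment silently resolves to the Latest branch.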