© 2025 Direct Cursus Technology L.L.C.

In this article:

  • Models available in common instance
  • Model lifecycle
  • Models available in batch mode
  • Text generation models
  • Multimodal models
  • Accessing models

Text generation models

Written by
Yandex Cloud
Updated at September 23, 2025

Yandex AI Studio provides access to large text models from different vendors. If an out-of-the-box model is not enough, you can fine-tune some models to respond to your requests more accurately.

Models available in common instance

All basic models are subject to the update rules described in Model lifecycle. When a model is updated, the generations available in its branches (the /latest, /rc, and /deprecated URI segments) may change. Modified models share usage quotas with their basic models.

| Model | URI | Generation | Context | Operating modes |
|---|---|---|---|---|
| YandexGPT Lite | gpt://<folder_ID>/yandexgpt-lite | Deprecated: 5; Latest: 5; RC: 5 | 32,000 | Asynchronous, synchronous |
| YandexGPT Pro | gpt://<folder_ID>/yandexgpt | Deprecated: 5; Latest: 5; RC: 5.1 | 32,000 | Asynchronous, synchronous |
| Llama 8B¹ | gpt://<folder_ID>/llama-lite | Deprecated: 3.1; Latest: 3.1; RC: 3.1 | 8,192 | Asynchronous, synchronous |
| Llama 70B¹ | gpt://<folder_ID>/llama | Deprecated: 3.3; Latest: 3.3; RC: 3.3 | 8,192 | Asynchronous, synchronous |
| Qwen3 235B | gpt://<folder_ID>/qwen3-235b-a22b-fp8/latest | - | 256,000 | OpenAI API |
| gpt-oss-120b | gpt://<folder_ID>/gpt-oss-120b/latest | - | 128,000 | OpenAI API |
| gpt-oss-20b | gpt://<folder_ID>/gpt-oss-20b/latest | - | 128,000 | OpenAI API |
| Fine-tuned models | gpt://<folder_ID>/<basic_model>/<version>@<suffix> | Depends on the basic model | Depends on the basic model | Asynchronous, synchronous |
| Gemma3 27B (Gemma Terms of Use) | gpt://<folder_ID>/gemma-3-27b-it/latest | - | 128,000 | OpenAI API |
| YandexART | art://<folder_ID>/yandex-art/latest | - | - | Asynchronous |

¹ Llama was created by Meta. Meta is designated as an extremist organization and its activities are prohibited in Russia.
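
The URI patterns in the table can be assembled programmatically. The following is a minimal sketch, not part of any Yandex Cloud SDK: the function name `build_model_uri` and its parameters are hypothetical, and the folder ID in the usage example is a placeholder.

```python
from typing import Optional


def build_model_uri(
    folder_id: str,
    model: str,
    version: Optional[str] = None,
    suffix: Optional[str] = None,
    scheme: str = "gpt",
) -> str:
    """Assemble a model URI following the patterns listed in the table.

    Hypothetical helper for illustration only.
    """
    # The fine-tuned pattern is <basic_model>/<version>@<suffix>,
    # so a suffix only makes sense together with an explicit version.
    if suffix is not None and version is None:
        raise ValueError("a fine-tuned model URI needs an explicit version")
    uri = f"{scheme}://{folder_id}/{model}"
    if version is not None:
        uri += f"/{version}"
    if suffix is not None:
        uri += f"@{suffix}"
    return uri


if __name__ == "__main__":
    print(build_model_uri("<folder_ID>", "yandexgpt-lite"))
    print(build_model_uri("<folder_ID>", "llama", version="rc"))
    print(build_model_uri("<folder_ID>", "yandex-art", version="latest", scheme="art"))
```

Keeping the scheme as a parameter lets the same helper cover both `gpt://` and `art://` URIs.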

The Gemma 3 27B model is designed to process Base64-encoded images of any aspect ratio. An adaptive algorithm scales images up to 896 pixels on the largest side, ensuring that important visual details are preserved. Each image requires 256 tokens for processing.
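
As a rough illustration of the figures above, the sketch below Base64-encodes image bytes and estimates the token cost of a set of images. The helper names are hypothetical. The code also assumes that image tokens count against the 128,000-token context listed for Gemma3 27B, which this article does not state explicitly.

```python
import base64

GEMMA3_TOKENS_PER_IMAGE = 256      # per-image cost stated above
GEMMA3_CONTEXT_TOKENS = 128_000    # context size from the model table


def encode_image(data: bytes) -> str:
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(data).decode("ascii")


def image_token_cost(image_count: int) -> int:
    """Tokens consumed by the given number of images."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return image_count * GEMMA3_TOKENS_PER_IMAGE


def text_token_budget(image_count: int) -> int:
    """Tokens left for text, ASSUMING images share the context window."""
    return GEMMA3_CONTEXT_TOKENS - image_token_cost(image_count)


if __name__ == "__main__":
    print(image_token_cost(4))
    print(text_token_budget(4))
```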

Model lifecycle

Each model has lifecycle characteristics, such as the model name, branch, and release date. Together, these characteristics precisely identify the model version. The rules for updating models are described below. Refer to them to adjust your solutions to a new version as appropriate.

Each model has three branches, listed here from oldest to newest: Deprecated, Latest, and Release Candidate (RC). Each branch is subject to the SLA.

The RC branch is updated as soon as a new model is ready and may change at any time. When a model in the RC branch is ready for general use, we announce the upcoming release both in the release notes and in our Telegram community.

One month after the announcement, the RC version becomes the Latest one, and the previous Latest version moves to the Deprecated branch. We continue to support the Deprecated version for one more month, after which the models in the Deprecated and Latest branches become identical.
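
The timeline above can be worked through with a small date calculation. This is only a sketch: it assumes that "one month" means one calendar month (clamped to the last day of shorter months), and the names `RolloutSchedule`, `add_one_month`, and `rollout_schedule` are hypothetical.

```python
import calendar
from datetime import date
from typing import NamedTuple


class RolloutSchedule(NamedTuple):
    announced: date                 # release announced for the RC model
    rc_becomes_latest: date         # RC -> Latest, old Latest -> Deprecated
    deprecated_support_ends: date   # Deprecated becomes identical to Latest


def add_one_month(day: date) -> date:
    """Shift a date forward by one calendar month, clamping the day."""
    year = day.year + day.month // 12
    month = day.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def rollout_schedule(announced: date) -> RolloutSchedule:
    """Derive the two follow-up dates from the announcement date."""
    promoted = add_one_month(announced)
    return RolloutSchedule(announced, promoted, add_one_month(promoted))


if __name__ == "__main__":
    # Hypothetical announcement date, chosen only for illustration.
    for name, value in rollout_schedule(date(2025, 9, 23))._asdict().items():
        print(f"{name}: {value.isoformat()}")
```

Under these assumptions, a solution pinned to the Deprecated branch has roughly two months from the announcement before it starts receiving the new model.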

Models available in batch mode

Text generation models

| Model | Card and license | URI | Context |
|---|---|---|---|
| Qwen2.5 7B Instruct | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen2.5-7b-instruct | 32,768 |
| Qwen2.5 72B Instruct | Model card, Qwen license | gpt://<folder_ID>/qwen2.5-72b-instruct | 16,384 |
| QwQ 32B Instruct | Model card, Apache 2.0 license | gpt://<folder_ID>/qwq-32b | 32,768 |
| Llama-3.3-70B-Instruct² | Model card, Llama 3.3 license | gpt://<folder_ID>/llama3.3-70b-instruct | 8,192 |
| Llama-3.1-70B-Instruct² | Model card, Llama 3.1 license | gpt://<folder_ID>/llama3.1-70b-instruct | 8,192 |
| DeepSeek-R1-Distill-Llama-70B | Model card, MIT license. Based on Llama-3.3-70B-Instruct (Llama-3.3-70B-Instruct Terms of Use) | gpt://<folder_ID>/deepseek-r1-distill-llama-70b | 8,192 |
| Qwen2.5 32B Instruct | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen2.5-32b-instruct | 32,768 |
| DeepSeek-R1-Distill-Qwen-32B | Model card, MIT license | gpt://<folder_ID>/deepseek-r1-distill-qwen-32b | 32,768 |
| phi-4 | Model card, MIT license | gpt://<folder_ID>/phi-4 | 16,384 |
| Gemma3 1B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-1b-it | 32,768 |
| Gemma3 4B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-4b-it | 131,072 |
| Gemma3 12B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-12b-it | 65,536 |
| Gemma3 27B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-27b-it | 32,768 |
| Qwen3-0.6B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-0.6b | 32,768 |
| Qwen3-1.7B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-1.7b | 32,768 |
| Qwen3-4B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-4b | 32,768 |
| Qwen3-8B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-8b | 32,768 |
| Qwen3-14B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-14b | 32,768 |
| Qwen3-32B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-32b | 32,768 |
| Qwen3-30B-A3B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-30b-a3b | 32,768 |
| Qwen3-235B-A22B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen3-235b-a22b | 32,768 |

² Llama was created by Meta. Meta is designated as an extremist organization and its activities are prohibited in Russia.
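
When choosing a batch model, the context size is often the first filter. The sketch below copies a few rows from the table above into a dictionary and lists the models whose context can hold a given number of tokens. The names `BATCH_CONTEXT_TOKENS` and `models_that_fit` are hypothetical, and the dictionary is a hand-copied subset that may go out of date.

```python
from typing import List

# Context sizes copied from a subset of the batch-mode table above.
BATCH_CONTEXT_TOKENS = {
    "qwen2.5-7b-instruct": 32_768,
    "qwen2.5-72b-instruct": 16_384,
    "llama3.3-70b-instruct": 8_192,
    "phi-4": 16_384,
    "gemma-3-4b-it": 131_072,
    "gemma-3-12b-it": 65_536,
}


def models_that_fit(required_tokens: int) -> List[str]:
    """Return model names (sorted) whose context holds required_tokens."""
    return sorted(
        name
        for name, context in BATCH_CONTEXT_TOKENS.items()
        if context >= required_tokens
    )


if __name__ == "__main__":
    print(models_that_fit(40_000))
```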

Multimodal models

| Model | Card and license | URI | Context |
|---|---|---|---|
| Qwen2 VL 7B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen2-vl-7b-instruct/ | 4,096 |
| Qwen2.5 VL 7B | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen2.5-vl-7b-instruct/ | 4,096 |
| Qwen 2.5 VL 32B Instruct | Model card, Apache 2.0 license | gpt://<folder_ID>/qwen2.5-vl-32b-instruct/ | 4,096 |
| DeepSeek 2 VL | Model card, DeepSeek license | gpt://<folder_ID>/deepseek-vl2/ | 4,096 |
| DeepSeek 2 VL Tiny | Model card, DeepSeek license | gpt://<folder_ID>/deepseek-vl2-tiny/ | 4,096 |
| Gemma3 4B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-4b-it/ | 4,096 |
| Gemma3 12B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-12b-it/ | 4,096 |
| Gemma3 27B it | Model card, Gemma Terms of Use | gpt://<folder_ID>/gemma-3-27b-it/ | 4,096 |

Accessing models

You can access text generation models of different versions in a number of ways.

SDK

When working with models through Yandex Cloud ML SDK, specify the model in one of the following formats:

  • Model name, provided as a string. Only the Latest versions are available.

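    from yandex_cloud_ml_sdk import YCloudML

    # Replace the placeholders with your folder ID and API key.
    sdk = YCloudML(folder_id="<folder_ID>", auth="<API_key>")

    # Text generation
    model = (
      sdk.models.completions("yandexgpt")
    )
    
    # Image generation
    model = (
      sdk.models.image_generation("yandex-art")
    )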
    
  • Model name and version, provided as strings in the model_name and model_version fields, respectively.

    # Text generation
    model = (
      sdk.models.completions(model_name="yandexgpt-lite", model_version="rc")
    )
    
    # Image generation
    model = (
      sdk.models.image_generation(model_name="yandex-art", model_version="latest")
    )
    

    The example above explicitly specifies the Release Candidate version of the YandexGPT Lite model and the Latest version of the YandexART model.

  • Model URI, provided as a string containing the full URI of the required model version. You can also use this method to access fine-tuned models.

    # Text generation
    model = (
      sdk.models.completions("gpt://b1gt6g8ht345********/llama/deprecated")
    )
    
    # Image generation
    model = (
      sdk.models.image_generation("art://b1gt6g8ht345********/yandex-art/latest")
    )
    

    The example above explicitly specifies the Deprecated version of the Llama 70B model and the Latest version of the YandexART model.

API

To access a model via the REST API or gRPC API, specify the model URI, which includes the folder ID, in the modelUri field of the request body. The /latest, /rc, and /deprecated segments indicate the model version. If you omit the segment, /latest is used.

Examples:

  • Accessing the Latest versions of the YandexGPT Lite and YandexART models:

    {
      "modelUri": "gpt://b1gt6g8ht345********/yandexgpt-lite/latest",
      ...
    }

    {
      "modelUri": "art://b1gt6g8ht345********/yandex-art/latest",
      ...
    }
    

    To access the Latest versions, you do not need to specify the model version explicitly because Latest is used by default.

  • Accessing the RC version of the Llama 70B model:

    {
      "modelUri": "gpt://b1gt6g8ht345********/llama/rc",
      ...
    }
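
The default-version rule can be made concrete with a small parser that splits a model URI into its parts and fills in `latest` when no version segment is present. This is a sketch under assumptions: `ModelRef` and `parse_model_uri` are hypothetical names, the folder ID is a placeholder, and the parser only handles the `scheme://<folder_ID>/<model>[/<version>[@<suffix>]]` shapes described in this article.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    scheme: str
    folder_id: str
    model: str
    version: str            # "latest" when the URI names no version
    suffix: Optional[str]   # set only for fine-tuned model URIs


def parse_model_uri(uri: str) -> ModelRef:
    """Split a model URI and apply the default version."""
    scheme, sep, rest = uri.partition("://")
    parts = rest.split("/")
    if not sep or len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"unrecognized model URI: {uri!r}")
    folder_id, model = parts[0], parts[1]
    version, suffix = "latest", None
    if len(parts) == 3:
        # A fine-tuned URI carries "<version>@<suffix>" in the last segment.
        version, at, tail = parts[2].partition("@")
        suffix = tail if at else None
    return ModelRef(scheme, folder_id, model, version, suffix)


if __name__ == "__main__":
    print(parse_model_uri("gpt://<folder_ID>/yandexgpt-lite"))
    print(parse_model_uri("gpt://<folder_ID>/llama/rc"))
```

A parser like this is handy for logging which branch a request actually targets, because an omitted segment and an explicit /latest resolve to the same version.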
    

See also

  • Sending a request in prompt mode
  • Sending an asynchronous request
  • Generating an image using YandexART
  • Running a model in batch mode
