Yandex Cloud
Search
Contact UsGet started
  • Pricing
  • Customer Stories
  • Documentation
  • Blog
  • All Services
  • System Status
    • Featured
    • Infrastructure & Network
    • Data Platform
    • Containers
    • Developer tools
    • Serverless
    • Security
    • Monitoring & Resources
    • ML Services
    • Business tools
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Start testing with double trial credits
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Center for Technologies and Society
    • Yandex Cloud Partner program
  • Pricing
  • Customer Stories
  • Documentation
  • Blog
© 2025 Direct Cursus Technology L.L.C.
Yandex AI Studio
    • About Yandex AI Studio
      • Overview
      • Common instance models
      • Dedicated instance models
      • Batch processing
      • Function calling
      • Reasoning mode
      • Formatting model responses
      • Embeddings
      • Datasets
      • Fine-tuning
      • Tokens
    • Yandex Workflows
    • Quotas and limits
    • Terms and definitions
  • Compatibility with OpenAI
  • Access management
  • Pricing policy
  • Audit Trails events
  • Public materials
  • Release notes

In this article:

  • Model lifecycle
  • Accessing models
  1. Concepts
  2. Model Gallery
  3. Common instance models

Common instance models

Written by
Yandex Cloud
Updated at October 13, 2025
  • Model lifecycle
  • Accessing models

Yandex AI Studio provides access to large generative models from different vendors. If out-of-the-box models are not enough, you can fine-tune some of them for more accurate responses. All roles required for working with the models are listed in Access management in Yandex AI Studio.

In a common instance, model resources are available to all Yandex Cloud users and shared between them, so model response time may increase under heavy workloads. We guarantee that no other user can access the context of your exchanges with the model: even with logging on, requests are stored anonymized and potentially sensitive information is masked. However, we recommend disabling data logging whenever you use our models to process sensitive information.

Common instance models are subject to the update rules described in Model lifecycle. When updating models, generations available in different branches (/latest, /rc, and /deprecated segments) may change. Modified models share usage quotas with their basic models.

Model and URI

Generation

Context

Operating modes

YandexGPT Lite
gpt://<folder_ID>/yandexgpt-lite

Deprecated 5
Latest 5
RC 5

32,000

Asynchronous, synchronous

YandexGPT Pro
gpt://<folder_ID>/yandexgpt

Deprecated 5
Latest 5
RC 5.1

32,000

Asynchronous, synchronous

Qwen3 235B
gpt://<folder_ID>/qwen3-235b-a22b-fp8/latest

—

256,000

OpenAI API

gpt-oss-120b
gpt://<folder_ID>/gpt-oss-120b/latest

—

128,000

OpenAI API

gpt-oss-20b
gpt://<folder_ID>/gpt-oss-20b/latest

—

128,000

OpenAI API

Fine-tuned text models
gpt://<folder_ID>/<basic_model>/<version>@<suffix>

Depends on the basic model

Depends on the basic model

Asynchronous, synchronous

Gemma 3 27B
gpt://<folder_ID>/gemma-3-27b-it/latest
Gemma Terms of Use

—

128 000

OpenAI API

YandexART
art://<folder_ID>/yandex-art/latest

—

—

Asynchronous

1 Llama was created by Meta. Meta is designated as an extremist organization and its activities are prohibited in Russia.

Gemma 3 27B processes Base64-encoded images. The model can handle images of any aspect ratio thanks to an adaptive algorithm that scales the longer side of the image to 896 pixels while preserving important visual details. Each image uses 256 context tokens.

Model lifecycleModel lifecycle

Each model has a number of lifecycle characteristics, such as model name, branch, and release date. These allow you to uniquely identify the model version. Refer to the model update rules provided below to adapt your solutions to the new version as needed.

There are three model branches (from old to new): Deprecated, Latest, Release Candidate (RC). Each of the branches is subject to the SLA.

The RC branch is updated as the new model is ready and may change at any time. When a model in the RC branch is ready for general use, we announce the upcoming release both in the release notes and our Telegram community.

One month after the announcement, the RC version becomes the Latest one, and the Latest version is moved to the Deprecated branch. We continue the support of the Deprecated version for one more month, after which models in the Deprecated and Latest branches become identical.

Accessing modelsAccessing models

You can access text generation models of different versions in a number of ways.

SDK
API

When operating text generation models via Yandex Cloud ML SDK, use one of the following formats:

  • Model name, provided as a string. Only the Latest versions are available.

    # Text generation
    model = (
      sdk.models.completions("yandexgpt")
    )
    
    # Image generation
    model = (
      sdk.models.image_generation("yandex-art")
    )
    
  • Model name and version, provided as strings in the model_name and model_version fields, respectively.

    # Text generation
    model = (
      sdk.models.completions(model_name="yandexgpt-lite", model_version="rc")
    )
    
    # Image generation
    model = (
      sdk.models.image_generation(model_name="yandex-art", model_version="latest")
    )
    

    This example explicitly specifies the YandexGPT Lite model of the Release Candidate version and the YandexART model of the Latest version.

  • Model URI, provided as a string containing the full URI of the required model version. You can also use this method to access fine-tuned models.

    # Text generation
    model = (
      sdk.models.completions("gpt://b1gt6g8ht345********/yandexgpt/deprecated")
    )
    
    # Image generation
    model = (
      sdk.models.image_generation("art://b1gt6g8ht345********/yandex-art/latest")
    )
    

    This example explicitly specifies the YandexGPT Pro model of the Deprecated version and the YandexART model of the Latest version.

To access YandexGPT models via the REST API or gRPC API, specify the model URI containing the folder ID in the modelUri field of the request body. The /latest, /rc, and /deprecated segments indicate the model version. /latest is used by default.

To access a YandexART model via the REST API or gRPC API, specify the model URI containing the folder ID in the modelUri field of the request body. The /latest segment indicates the model version and is optional.

  • Accessing the Latest versions:

    {
    
      "modelUri": "gpt://b1gt6g8ht345********/yandexgpt-lite/latest"
      ...
    
      "modelUri": "art://b1gt6g8ht345********/yandex-art/latest"
    }
    
  • Accessing the RC version (if available):

    {
    
      "modelUri": "gpt://b1gt6g8ht345********/yandexgpt-lite/rc"
      ...
    }
    

See alsoSee also

  • Sending a request in prompt mode
  • Sending an asynchronous request
  • Generating an image using YandexART
  • Running a model in batch mode

Was the article helpful?

Previous
Overview
Next
Dedicated instance models
© 2025 Direct Cursus Technology L.L.C.