
Overview of Yandex AI Studio AI models

Written by
Yandex Cloud
Updated on December 12, 2025
  • Native Yandex models
  • AI Studio operating modes

Yandex AI Studio provides powerful capabilities for the use of generative models in business scenarios:

  • Native and open-source common instance models billed based on consumed tokens.
  • LoRA-based model fine-tuning.
  • Out-of-the-box and tunable text classification models.
  • Large selection of text and multimodal open-source models to batch-process large volumes of data with a prepaid minimum amount of tokens.
  • Dedicated model instances, if you are looking to process large volumes of data with guaranteed response time.

There are two interfaces you can use to work with models: AI Playground in the management console and various APIs where you can create agents and access models directly.

Native Yandex models

Model Gallery offers Yandex's text and image generation models that you can use in your business.

The smallest and fastest one among the text models, YandexGPT Lite excels at tasks that prioritize response speed over complex reasoning or in-depth knowledge of sophisticated subject areas. For example, you can use YandexGPT Lite to categorize incoming user messages, format texts, or summarize your meetings.

YandexGPT Pro performs well in more complex tasks: searching knowledge bases and generating answers based on the retrieved content (RAG scenario), document analysis, reporting and analytics, data extraction, and auto-population of fields, forms, and CRM databases.

Alice AI LLM, Yandex's new flagship model, not only matches YandexGPT Pro in complex tasks but is also a much better dialog partner in chat scenarios, capable of extracting information from the entire context. Alice AI LLM is ideal for creating human-oriented AI assistants.

Yandex text models understand around 20 languages, including English and Japanese, but their primary focus is being an effective processing engine for texts in Russian. Thanks to Yandex's proprietary tokenizer, these models use tokens more efficiently than many others, which saves you money. For an example calculation of the cost of the same task across different models, see the pricing page.
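To make the cost impact of tokenization concrete, here is a minimal sketch. The per-1,000-token price and the token counts are invented for illustration and are not Yandex's actual rates; see the pricing page for real figures.

```python
# Hypothetical example: the same text tokenized by two models.
# Prices and token counts are invented for illustration only.

def request_cost(tokens: int, price_per_1k: float) -> float:
    """Cost of a single request given a per-1,000-token price."""
    return tokens / 1000 * price_per_1k

# A tokenizer tuned for Russian may need fewer tokens for the same text,
# so the same request costs less even at an identical per-token price.
cost_efficient = request_cost(tokens=420, price_per_1k=0.20)
cost_generic = request_cost(tokens=560, price_per_1k=0.20)

print(f"efficient tokenizer: {cost_efficient:.3f}")  # 0.084
print(f"generic tokenizer:   {cost_generic:.3f}")    # 0.112
```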

Apart from text models, Model Gallery also features the YandexART model, a generative neural network that creates images based on a text query. YandexART uses the cascaded diffusion method to iteratively refine images from noise. You can specify the format of the final image in the mime_type parameter. Currently, the supported value is image/jpeg. By default, YandexART generates an image of 1024 x 1024 pixels. This size may increase or decrease based on the specified aspect ratio, but by no more than 10%.
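As a sketch of what an image generation request could look like, the payload below combines the parameters described above (the mime_type value and an aspect ratio). The field names follow the REST image generation API as the author understands it, so treat them as assumptions and check the API reference; the folder ID is a placeholder.

```python
import json

# Sketch of a YandexART image generation request body.
# Field names are assumptions based on the REST image generation API;
# verify them against the API reference before use.
folder_id = "<folder_id>"  # placeholder

request_body = {
    "modelUri": f"art://{folder_id}/yandex-art/latest",
    "generationOptions": {
        "mimeType": "image/jpeg",  # currently the only supported value
        # A 2:1 ratio; the final size stays within 10% of 1024 x 1024.
        "aspectRatio": {"widthRatio": "2", "heightRatio": "1"},
    },
    "messages": [{"weight": "1", "text": "a watercolor lighthouse at dawn"}],
}

print(json.dumps(request_body, indent=2))
```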

Yandex text models are available through the OpenAI-compatible Completions API and Responses API, as well as a proprietary REST and gRPC text generation API.
YandexART provides a proprietary image generation API, also available as REST and gRPC.

In addition, all of these models are available through the ML SDK and AI Playground.
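For example, a request to the OpenAI-compatible Completions API can be assembled with the standard library alone. The endpoint URL and the gpt://<folder_id>/<model> URI scheme follow the OpenAI-compatibility documentation; the folder ID and API key are placeholders, and the request is built but deliberately not sent, since sending requires valid credentials.

```python
import json
import urllib.request

# Sketch of a chat completion request to the OpenAI-compatible endpoint.
# <api_key> and <folder_id> are placeholders.
req = urllib.request.Request(
    url="https://llm.api.cloud.yandex.net/v1/chat/completions",
    method="POST",
    headers={
        "Authorization": "Bearer <api_key>",
        "Content-Type": "application/json",
    },
    data=json.dumps({
        "model": "gpt://<folder_id>/yandexgpt-lite",
        "messages": [{"role": "user", "content": "Summarize this meeting."}],
    }).encode(),
)

# With real credentials, urllib.request.urlopen(req) would send it.
print(req.get_method(), req.full_url)
```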

AI Studio operating modes

In AI Studio, you can use models in three modes: synchronous, asynchronous, or batch mode. The modes have different response times and operating logic.

In synchronous mode, the model gets your request and returns the result immediately after processing. The response delay in synchronous mode is minimal but not instant: the model still needs some time, which depends on the model and system workload. With the stream option enabled, the model sends intermediate generation results during the process. You may opt for synchronous mode if you need to maintain a chatbot dialog. In synchronous mode, models are available in AI Playground, ML SDK, via text generation APIs, and OpenAI-compatible APIs.
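The streaming pattern can be sketched as follows. The generator here stands in for a live streaming response, so the accumulation logic runs without a connection; a real client would iterate over chunks from the API instead.

```python
# With the stream option, the answer arrives in increments.
# fake_stream() is a stand-in for a streaming API response.
def fake_stream():
    for chunk in ["Tokens ", "arrive ", "incrementally."]:
        yield chunk

answer = ""
for delta in fake_stream():
    answer += delta  # append each partial result as it arrives
    # a chatbot UI would re-render `answer` here after every chunk

print(answer)  # Tokens arrive incrementally.
```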

In asynchronous mode, the model responds to a request with an Operation object containing the ID of the operation it is performing. You can use this ID to check the status of the request and later retrieve the result by querying a special output endpoint (its value depends on the model). Intermediate generation results are not available in asynchronous mode. Generation usually takes longer in asynchronous mode (from a couple of minutes to several hours) than in synchronous mode, but it is cheaper. Use asynchronous mode if you do not need an urgent response. In asynchronous mode, some models are available in the ML SDK, via text generation APIs, and via image generation APIs.

Batch processing mode allows you to process a large data array in a single request to the model. Input data is provided as a dataset whose type depends on the model. For each request, AI Studio runs an individual instance of the model to process the dataset and then stops it. The result is saved as another dataset, which you can download in Parquet format or use immediately, e.g., to tune another model. It may take several hours to generate the result. You can process data in batch mode in the management console, using the ML SDK, or via the Batch API. For the list of models available in batch mode, see Batch processing.
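Preparing input for batch mode amounts to packing many requests into one dataset instead of sending them individually. The JSONL layout below is purely illustrative, since the actual dataset format depends on the model; see the Batch processing page for the real schemas.

```python
import json

# Illustrative only: pack multiple requests into a JSONL dataset.
# The actual dataset format depends on the model.
prompts = [
    "Classify: the delivery was late again.",
    "Classify: great support, thank you!",
]

lines = [
    json.dumps(
        {"request": {"messages": [{"role": "user", "text": p}]}},
        ensure_ascii=False,
    )
    for p in prompts
]
dataset_jsonl = "\n".join(lines)

print(len(dataset_jsonl.splitlines()))  # 2 requests in the dataset
```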

See also

  • Reasoning mode in generative models
  • Sending a request in prompt mode
  • Sending an asynchronous request

© 2025 Direct Cursus Technology L.L.C.