About Yandex Foundation Models

Written by

Updated at May 13, 2025

Foundation Models working modes
Prompt

Yandex Foundation Models comprises several large generative models plus an efficient toolset you can use to leverage their capabilities to advance your business. Foundation Models is a part of Yandex Cloud AI Studio.

With YandexGPT Lite and YandexGPT Pro, you can quickly generate text content, e.g., product descriptions, articles, news stories, newsletters, blog posts, and many other things. The quality of the neural network's response depends directly on the accuracy of the instructions you provide. With a more specific prompt, you are more likely to get the result you expect. For the full list of generative text models, see Text generation models.

Foundation Models also provides the API to work with embeddings, i.e., vector representations of text. It can be used to classify information, compare and match texts, or search through a knowledge base of your own. For more information on embeddings and the Embeddings API, see Text vectorization.

With YandexGPT classifiers, you can classify various texts. Special models are better at it than the YandexGPT Liteand YandexGPT Pro models, their API being tailored for classification tasks. For more information about the supported classification types, see Classifiers based on YandexGPT.

To create images in Foundation Models use the YandexART neural network that will help you create detailed and realistic images based on a text prompt.

In addition to models working with a single type of data, Foundation Models provides multimodal models.

For information on the Foundation Models restrictions, refer to Quotas and limits in Yandex Foundation Models.

Foundation Models working modes

In Foundation Models, you can use models in three modes: synchronous, asynchronous, or batch mode. The modes differ in response time and operation logic.

In synchronous mode, the model gets your request and returns the result immediately after processing. The response delay in synchronous mode is minimal but not instant: the model takes time to do the work. With the stream option enabled, the model sends intermediate generation results during the process. You may opt for synchronous mode if you need to maintain a chatbot dialog.

In asynchronous mode, the model responds to a request by sending an Operation object containing the ID of the operation it is performing. You can use the ID to learn the status of the request and later get the result of it by submitting a request to a special output endpoint (its value depends on the model). Intermediate generation results are not available in asynchronous mode. In asynchronous mode, generation usually takes longer (from a couple of minutes to several hours) than in synchronous mode but is cheaper. Use asynchronous mode if you do not need an urgent response.

Batch processing mode allows you to process a large data array in a single request to the model. Input data is provided as a dataset whose type depends on the model. For each request, Foundation Models runs an individual instance of the model to process the dataset and then stops it. The result is saved as another dataset, which you can download in Parquet format or use immediately, e.g., to tune another model. It may take several hours to generate the result.

Different models support different operating modes.

Prompt

Generative models are managed using prompts. A good prompt should contain the context of your request to the model (instruction) and the actual task the model should complete based on the provided context. The more specific your prompt, the more accurate will be the results returned by the model.

Apart from the prompt, other request parameters will impact the model's output too. Use Foundation Models Playground available from the management console to test your requests.

About Yandex Foundation Models

Foundation Models working modes

Prompt

Was the article helpful?