About Yandex Foundation Models
Yandex Foundation Models comprises several large generative neural networks plus an efficient toolset you can use to leverage their capabilities to advance your business. Foundation Models is a part of Yandex Cloud AI Studio.
YandexGPT API offers text content generation models. You can use it to generate product descriptions, articles, news stories, newsletters, blog posts, and many other things. The quality of the neural network's response depends directly on the accuracy of the instructions you provide. With a more specific prompt, you are more likely to get the result you expect. For the full list of generative text models, see Text generation models.
Foundation Models also provides the API to work with embeddings, i.e., vector representations of text. It can be used to classify information, compare and match texts, or search through a knowledge base of your own. For more information on embeddings and the Embeddings API, see Text vectorization.
With YandexGPT classifiers, you can classify various texts. Special models are better at it than the YandexGPT Liteand YandexGPT Pro models, their API being tailored for classification tasks. For more information about the supported classification types, see Classifiers based on YandexGPT.
To create images in Foundation Models use the YandexART neural network that will help you create detailed and realistic images based on a text prompt.
For information on the Foundation Models restrictions, refer to Quotas and limits in Yandex Foundation Models.
Foundation Models working modes
In Foundation Models, you can use models in either synchronous or asynchronous mode. The modes differ in response time and operation logic.
In synchronous mode, the model gets your request and returns the result immediately after processing. The response delay in synchronous mode is minimal but not instant: the model takes time to do the work. With the stream
option enabled, the model sends intermediate generation results during the process. You may opt for synchronous mode if you need to maintain a chatbot dialog.
In asynchronous mode, the model responds to a request by sending an Operation object containing the ID of the operation it is performing. You can use the ID to learn the status of the request and later get the result of it by submitting a request to a special output endpoint. Intermediate generation results are not available in asynchronous mode. In asynchronous mode, generation usually takes longer (from a couple of minutes to several hours) than in synchronous mode but is cheaper. Use asynchronous mode if you do not need an urgent response.
Different models support different operating modes.
Prompt
Generative models are managed using prompts. A good prompt should contain the context of your request to the model (instruction) and the actual task the model should complete based on the provided context. The more specific your prompt, the more accurate will be the results returned by the model.
Apart from the prompt, other request parameters will impact the model's output too. Use Foundation Models Playground available from the management console