About Yandex Foundation Models
Yandex Foundation Models comprises several large generative neural networks and allows you to leverage their capabilities for your business development.
YandexGPT is geared to address various needs related to creating text content. YandexGPT API can generate product descriptions, articles, news stories, newsletters, blog posts, and many other things. The quality of the neural network's response depends directly on the accuracy of the instructions you provide. With a more specific prompt, you are more likely to get the result you expect.
Foundation Models provides the API to work with embeddings, i.e., vector representations of text. It can be used to classify information, compare and match texts, or search through a knowledge base of your own. For more information on embeddings and the Embeddings API, see Text vectorization.
The YandexART neural network will help you create detailed and realistic images based on a text prompt. You can see prompt examples in the YandexART prompt library.
The service is dynamically evolving with constant enhancements and refinements to its functionality.
For information on the Foundation Models restrictions, refer to Quotas and limits in Yandex Foundation Models.
Foundation Models working modes
In Foundation Models, you can use models in either synchronous or asynchronous mode. The modes differ in response time and operation logic.
In synchronous mode, the model gets your request and returns the result immediately after processing. The response delay in synchronous mode is minimal but not instant: the model takes time to do the work. With the stream
option enabled, the model sends intermediate generation results during the process. You may opt for synchronous mode if you need to maintain a chatbot dialog.
In asynchronous mode, the model responds to a request by sending an Operation object containing the ID of the operation it is performing. You can use the ID to learn the status of the request and later get the result of it by submitting a request to a special output endpoint. Intermediate generation results are not available in asynchronous mode. In asynchronous mode, generation usually takes longer (from a couple of minutes to several hours) than in synchronous mode but is cheaper. Use asynchronous mode if you do not need an urgent response.
Different models support different operating modes.
Prompt
Generative models are managed using prompts. A good prompt should contain the context of your request to the model (instruction) and the actual task the model should complete based on the provided context. The more specific your prompt, the more accurate will be the results returned by the model.
Apart from the prompt, other request parameters will impact the model's output too. Use Foundation Models Playground available from the management console