Text generation models
Yandex Foundation Models provides access to large text models from different vendors. If out-of-the-box models are not enough, you can fine-tune some of them for more accurate responses to your requests.
Generation models
All basic models are subject to the update rules described in Model lifecycle. When updating models, generations available in different branches (/latest
, /rc
, and /deprecated
segments) may change.
Model |
URI |
Generation |
|
YandexGPT Lite |
|
344 |
Asynchronous, synchronous |
YandexGPT Pro |
|
344 |
Asynchronous, synchronous |
YandexGPT Pro 32k |
|
444 |
Synchronous2 |
Llama 8b1 |
|
3.13.13.1 |
Asynchronous, synchronous |
Llama 70b1 |
|
3.13.13.1 |
Asynchronous, synchronous |
Fine-tuned model |
|
Depends on the basic model |
Asynchronous, synchronous |
Model fine-tuned in Yandex DataSphere |
|
3 |
Asynchronous, synchronous |
Modified models share usage quotas with their basic models.
1 Llama was created by Meta. Meta is designated as an extremist organization and its activities are prohibited in Russia.
2 YandexGPT Pro 32k features an expanded context and is designed specifically to handle large texts in synchronous mode. In asynchronous mode, the YandexGPT Pro model supports the same amount of context.
Model lifecycle
Each model has certain lifecycle characteristics, such as the model name, branch, and release date. These characteristics allow you to precisely identify the model version. Below, you can see our rules for updating models. Refer to these rules to adjust your solutions to a new version as apporpriate.
For each model, there are three branches (in the order from the oldest to the newest one): Deprecated
, Latest
, and Release Candidate
(RC
). Each of the branches is subject to the SLA.
The RC
branch is updated as the new model is ready and may change at any time. When a model in the RC
branch is ready for general use, we announce the upcoming release both in the release notes and our Telegram community
One month after the announcement, the RC
version becomes the Latest
one, and the Latest
version is moved to the Deprecated
branch. We continue the support of the Deprecated
version for one more month, after which models in the Deprecated
and Latest
branches become identical.
Accessing models
You can access text generation models of different versions in a number of ways.
When operating text generation models via Yandex Cloud ML SDK, use one of the following formats:
-
Model name, provided as a string. Only the
Latest
versions are available.model = ( sdk.models.completions("yandexgpt") )
-
Model name and version, provided as strings in the
model_name
andmodel_version
fields, respectively.model = ( sdk.models.completions(model_name="yandexgpt-32k", model_version="rc") )
The above example explicitly specifies the
Release Candidate
version of theYandexGPT Pro 32k
model. -
Model URI, provided as a string containing the full URI of the required model version. You can also use this method to access fine-tuned models.
model = ( sdk.models.completions("gpt://b1gt6g8ht345********/llama/deprecated") )
The above example explicitly specifies the
Deprecated
version of theLlama 70b
model.
To access a model via the REST API or gRPC API, specify the model's URI containing the folder ID in the modelUri
field of the request body. The /latest
, /rc
, and /deprecated
segments indicate the model version. /latest
is used by default.
Examples:
-
Accessing the
Latest
version of theYandexGPT Lite
model:{ "modelUri": gpt://b1gt6g8ht345********/yandexgpt-lite/latest ... }
To access the
Latest
versions, you do not need to specify the model version explicitly becauseLatest
is used by default.For example, this URI will allow you to access the
Latest
version of theYandexGPT Lite
model:gpt://<folder_ID>/yandexgpt-lite
. -
Accessing the
RC
version of theLlama 70b
model:{ "modelUri": gpt://b1gt6g8ht345********/llama/rc ... }