Migration from API v1alpha to YandexGPT API v1 and Embeddings API v1
API v1alpha is deprecated and discontinued. To work with YandexGPT, use YandexGPT API v1 and Embeddings API v1. In the new API version, the maximum total number of tokens allowed per user request and model response is 8192.
If your product adopts methods of the deprecated API, migrate it to the new interface. See the detailed overview of the changes required for the REST API below. Similar changes apply to the gRPC API.
Model selection
In YandexGPT API v1 and Embeddings API v1, specify the model name in the modelUri
parameter (instead of model
in the deprecated API):
Model | API v1alpha | YandexGPT API v1 and Embeddings API v1 |
---|---|---|
YandexGPT Pro | "model": "yagpt-2.0:hq" |
"modelUri": "gpt://<folder_ID>/yandexgpt/latest" |
YandexGPT Lite | "model": "general" |
"modelUri": "gpt://<folder_ID>/yandexgpt-lite/latest" |
Generation
TextGeneration.instruct (prompt mode)
API endpoint:
API v1alpha | YandexGPT API v1 |
---|---|
https://llm.api.cloud.yandex.net/llm/v1alpha/instruct |
https://llm.api.cloud.yandex.net/foundationModels/v1/completion |
General request structure:
API v1alpha |
YandexGPT API v1 |
|
|
Request body fields:
API v1alpha |
YandexGPT API v1 |
Description |
model |
modelUri |
ID of the model to use for response generation. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere. |
instructionText |
|
In YandexGPT API v1, the
|
instructionUri |
modelUri |
YandexGPT API v1 does not use the |
requestText |
|
In YandexGPT API v1, the
|
partialResults |
stream |
It enables streaming of partially generated text. It may take either the |
generationOptions |
completionOptions |
It sets the request configuration parameters. |
maxTokens |
maxTokens |
The |
TextGeneration.chat (chat mode)
API endpoint:
API v1alpha | YandexGPT API v1 |
---|---|
https://llm.api.cloud.yandex.net/llm/v1alpha/chat |
https://llm.api.cloud.yandex.net/foundationModels/v1/completion |
General request structure:
API v1alpha |
YandexGPT API v1 |
|
|
Request body fields:
API v1alpha |
YandexGPT API v1 |
Description |
model |
modelUri |
ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere. |
instructionText |
|
In YandexGPT API v1, the
|
partialResults |
stream |
It enables streaming of partially generated text. It may take either the |
generationOptions |
completionOptions |
It sets the request configuration parameters. |
maxTokens |
maxTokens |
The |
role |
role |
The name of the |
API endpoint:
API v1alpha | YandexGPT API v1 |
---|---|
https://llm.api.cloud.yandex.net/llm/v1alpha/instructAsync |
https://llm.api.cloud.yandex.net/foundationModels/v1/completionAsync |
General request structure:
API v1alpha |
YandexGPT API v1 |
|
|
Request body fields:
API v1alpha |
YandexGPT API v1 |
Description |
model |
modelUri |
ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere. |
instructionText |
|
In YandexGPT API v1, the
|
instructionUri |
modelUri |
YandexGPT API v1 does not use the |
requestText |
|
In YandexGPT API v1, the
|
partialResults |
stream |
It enables streaming of partially generated text. It may take either the |
generationOptions |
completionOptions |
It sets the request configuration parameters. |
maxTokens |
maxTokens |
The |
Tokenization
You can use this method with generation methods only.
API endpoint:
API v1alpha | YandexGPT API v1 |
---|---|
https://llm.api.cloud.yandex.net/llm/v1alpha/tokenize |
https://llm.api.cloud.yandex.net/foundationModels/v1/tokenizeCompletion |
General request structure:
API v1alpha |
YandexGPT API v1 |
|
|
Request body fields:
API v1alpha |
YandexGPT API v1 |
Description |
model |
modelUri |
ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere. |
You can use this method with any methods other than generation.
API endpoint:
API v1alpha | YandexGPT API v1 |
---|---|
https://llm.api.cloud.yandex.net/llm/v1alpha/tokenize |
https://llm.api.cloud.yandex.net/foundationModels/v1/tokenize |
General request structure remains the same:
API v1alpha |
YandexGPT API v1 |
|
|
Request body fields:
API v1alpha |
YandexGPT API v1 |
Description |
model |
modelUri |
ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere. |
Vectorization
API endpoint:
API v1alpha | Embeddings API v1 |
---|---|
https://llm.api.cloud.yandex.net/llm/v1alpha/embedding |
https://llm.api.cloud.yandex.net/foundationModels/v1/textEmbedding |
General request structure:
API v1alpha |
Embeddings API v1 |
|
|
Request body fields:
API v1alpha |
Embeddings API v1 |
Description |
model |
— |
In Embeddings API v1, the |
|
|
Vectorization of short texts, such as search requests, queries, etc. |
|
|
Vectorization of large source texts, e.g., documentation articles. |