Migration from API v1alpha to YandexGPT API v1 and Embeddings API v1

Updated at June 6, 2024

Model selection
Generation
Tokenization
Vectorization

API v1alpha is deprecated and discontinued. To work with YandexGPT, use YandexGPT API v1 and Embeddings API v1. In the new API version, the maximum total number of tokens allowed per user request and model response is 8192.

If your product uses methods of the deprecated API, migrate it to the new interface. The changes required for the REST API are described in detail below. Similar changes apply to the gRPC API.

Model selection

In YandexGPT API v1 and Embeddings API v1, specify the model name in the modelUri parameter (instead of model in the deprecated API):

Model	API v1alpha	YandexGPT API v1 and Embeddings API v1
YandexGPT Pro	`"model": "yagpt-2.0:hq"`	`"modelUri": "gpt://<folder_ID>/yandexgpt/latest"`
YandexGPT Lite	`"model": "general"`	`"modelUri": "gpt://<folder_ID>/yandexgpt-lite/latest"`
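
The mapping in the table above can be sketched as a small lookup. This is only an illustration: the function names and the folder ID `b1gexample` are hypothetical, and only the two models listed in the table are covered.

```python
# v1alpha model names mapped to the v1 model names from the table above.
V1ALPHA_TO_V1_MODEL = {
    "yagpt-2.0:hq": "yandexgpt",      # YandexGPT Pro
    "general": "yandexgpt-lite",      # YandexGPT Lite
}


def build_model_uri(folder_id: str, model: str, version: str = "latest") -> str:
    """Build a modelUri of the form gpt://<folder_ID>/<model>/<version>."""
    return f"gpt://{folder_id}/{model}/{version}"


def migrate_model(old_model: str, folder_id: str) -> str:
    """Translate a v1alpha `model` value into a v1 `modelUri` value."""
    return build_model_uri(folder_id, V1ALPHA_TO_V1_MODEL[old_model])


# "b1gexample" is a placeholder folder ID, not a real one.
print(migrate_model("general", "b1gexample"))
```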

Generation

Synchronous mode

TextGeneration.instruct (prompt mode)

API endpoint:

API v1alpha	YandexGPT API v1
`https://llm.api.cloud.yandex.net/llm/v1alpha/instruct`	`https://llm.api.cloud.yandex.net/foundationModels/v1/completion`

General request structure:

API v1alpha:

{
  "model": "string",
  "generationOptions": {
    "partialResults": true,
    "temperature": "number",
    "maxTokens": "integer"
  },

  // only one of the fields: `instructionText` or `instructionUri`
  "instructionText": "string",
  "instructionUri": "string",

  "requestText": "string"
}

YandexGPT API v1:

{
  "modelUri": "string",
  "completionOptions": {
    "stream": true,
    "temperature": "number",
    "maxTokens": "integer"
  },
  "messages": [
    {
      "role": "string",
      "text": "string"
    }
  ]
}

Request body fields:

API v1alpha	YandexGPT API v1	Description
model	modelUri	ID of the model to use for response generation. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere.
instructionText	`"messages": [ { "role": "system", "text": "string" } ]`	In YandexGPT API v1, the `messages` section is the list of messages setting the request context for the model. `role`: When the value is `system`, it allows you to set the request context and define the model's behavior. `text`: Text that defines the request context.
instructionUri	modelUri	YandexGPT API v1 does not use the `instructionUri` parameter; to specify the URI, use `modelUri`.
requestText	`"messages": [ { "role": "user", "text": "string" } ]`	In YandexGPT API v1, the `messages` section is the list of messages setting the request context for the model. `role`: When the value is `user`, it allows sending user messages to the model. `text`: Text message of the request.
partialResults	stream	Enables streaming of partially generated text. Accepts `true` or `false`.
generationOptions	completionOptions	Sets the request configuration parameters.
maxTokens	maxTokens	The `maxTokens` parameter name has remained the same, but its meaning has changed. In API v1alpha, `maxTokens` limited the total number of tokens in the request and the response combined. In YandexGPT API v1, `maxTokens` limits only the number of tokens in the response.
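
As a rough illustration of the field mapping above, the following sketch converts a v1alpha prompt-mode request body into a YandexGPT API v1 body. The function name is hypothetical, and the new `modelUri` is passed in explicitly. Note that `maxTokens` is copied unchanged even though its meaning differs between the two versions.

```python
def instruct_to_completion(old: dict, model_uri: str) -> dict:
    """Convert a v1alpha instruct request body into a v1 completion body."""
    # generationOptions becomes completionOptions; partialResults becomes stream.
    options = dict(old.get("generationOptions", {}))
    if "partialResults" in options:
        options["stream"] = options.pop("partialResults")

    # instructionText becomes a system message, requestText a user message.
    messages = []
    if old.get("instructionText"):
        messages.append({"role": "system", "text": old["instructionText"]})
    if old.get("requestText"):
        messages.append({"role": "user", "text": old["requestText"]})

    # instructionUri has no counterpart in v1 and is dropped; the URI is set in modelUri.
    return {
        "modelUri": model_uri,
        "completionOptions": options,
        "messages": messages,
    }
```

The returned dictionary can then be serialized to JSON and sent to the new endpoint.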

TextGeneration.chat (chat mode)

API endpoint:

API v1alpha	YandexGPT API v1
`https://llm.api.cloud.yandex.net/llm/v1alpha/chat`	`https://llm.api.cloud.yandex.net/foundationModels/v1/completion`

General request structure:

API v1alpha:

{
  "model": "string",
  "generationOptions": {
    "partialResults": true,
    "temperature": "number",
    "maxTokens": "integer"
  },
  "messages": [
    {
      "role": "string",
      "text": "string"
    }
  ],
  "instructionText": "string"
}

YandexGPT API v1:

{
  "modelUri": "string",
  "completionOptions": {
    "stream": true,
    "temperature": "number",
    "maxTokens": "integer"
  },
  "messages": [
    {
      "role": "string",
      "text": "string"
    }
  ]
}

Request body fields:

API v1alpha	YandexGPT API v1	Description
model	modelUri	ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere.
instructionText	`"messages": [ { "role": "system", "text": "string" } ]`	In YandexGPT API v1, the `messages` section is the list of messages setting the request context for the model. `role`: When the value is `system`, it allows you to set the request context and define the model's behavior. `text`: Text that defines the request context.
partialResults	stream	Enables streaming of partially generated text. Accepts `true` or `false`.
generationOptions	completionOptions	Sets the request configuration parameters.
maxTokens	maxTokens	The `maxTokens` parameter name has remained the same, but its meaning has changed. In API v1alpha, `maxTokens` limited the total number of tokens in the request and the response combined. In YandexGPT API v1, `maxTokens` limits only the number of tokens in the response.
role	role	The name of the `role` parameter has remained the same, but the list of possible values has changed. In API v1alpha, the possible values of the parameter were `Assistant` and `User`. In YandexGPT API v1, the possible values of the parameter are `assistant`, `user`, and `system`.
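
For chat mode, the main change is in the `messages` list. A minimal sketch of that conversion might look as follows. The function name is hypothetical, and lowercasing the role is an assumption based on the listed values (`Assistant` and `User` in v1alpha, `assistant` and `user` in v1).

```python
from typing import Optional


def convert_chat_messages(old_messages: list, instruction_text: Optional[str] = None) -> list:
    """Convert v1alpha chat messages into the v1 `messages` list."""
    new_messages = []

    # The top-level instructionText field becomes a leading system message.
    if instruction_text:
        new_messages.append({"role": "system", "text": instruction_text})

    # Assumption: v1alpha roles map to v1 roles by lowercasing them.
    for message in old_messages:
        new_messages.append(
            {"role": message["role"].lower(), "text": message["text"]}
        )
    return new_messages
```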

Asynchronous mode

API endpoint:

API v1alpha	YandexGPT API v1
`https://llm.api.cloud.yandex.net/llm/v1alpha/instructAsync`	`https://llm.api.cloud.yandex.net/foundationModels/v1/completionAsync`

General request structure:

API v1alpha:

{
  "model": "string",
  "generationOptions": {
    "partialResults": true,
    "temperature": "number",
    "maxTokens": "integer"
  },

  // only one of the fields: `instructionText` or `instructionUri`
  "instructionText": "string",
  "instructionUri": "string",

  "requestText": "string"
}

YandexGPT API v1:

{
  "modelUri": "string",
  "completionOptions": {
    "stream": true,
    "temperature": "number",
    "maxTokens": "integer"
  },
  "messages": [
    {
      "role": "string",
      "text": "string"
    }
  ]
}

Request body fields:

API v1alpha	YandexGPT API v1	Description
model	modelUri	ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere.
instructionText	`"messages": [ { "role": "system", "text": "string" } ]`	In YandexGPT API v1, the `messages` section is the list of messages setting the request context for the model. `role`: When the value is `system`, it allows you to set the request context and define the model's behavior. `text`: Text that defines the request context.
instructionUri	modelUri	YandexGPT API v1 does not use the `instructionUri` parameter; to specify the URI, use `modelUri`.
requestText	`"messages": [ { "role": "user", "text": "string" } ]`	In YandexGPT API v1, the `messages` section is the list of messages setting the request context for the model. `role`: When the value is `user`, it allows sending user messages to the model. `text`: Text message of the request.
partialResults	stream	Enables streaming of partially generated text. Accepts `true` or `false`.
generationOptions	completionOptions	Sets the request configuration parameters.
maxTokens	maxTokens	The `maxTokens` parameter name has remained the same, but its meaning has changed. In API v1alpha, `maxTokens` limited the total number of tokens in the request and the response combined. In YandexGPT API v1, `maxTokens` limits only the number of tokens in the response.
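
Because `maxTokens` now limits only the response, in all generation methods above, a value carried over unchanged from v1alpha permits a longer response than before. One possible way to keep the old combined budget is sketched below. The function name is hypothetical, and the subtraction is only an assumption about how one might preserve the old behavior, not a documented rule. The prompt token count has to be obtained separately, for example with a tokenization method.

```python
# Maximum total number of tokens per request and response in the new API.
MAX_TOTAL_TOKENS = 8192


def response_max_tokens(old_max_tokens: int, prompt_tokens: int) -> int:
    """Derive a v1 maxTokens (response only) from a v1alpha combined budget."""
    # The combined budget cannot exceed the overall limit of the new API.
    total_budget = min(old_max_tokens, MAX_TOTAL_TOKENS)

    # Whatever the prompt does not consume is left for the response.
    remaining = total_budget - prompt_tokens
    if remaining <= 0:
        raise ValueError("The prompt alone uses up the whole token budget")
    return remaining
```

For example, an old budget of 2000 tokens with a 500-token prompt leaves 1500 tokens for the response.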

Tokenization

Tokenizer.tokenizeCompletion

You can use this method with generation methods only.

API endpoint:

API v1alpha	YandexGPT API v1
`https://llm.api.cloud.yandex.net/llm/v1alpha/tokenize`	`https://llm.api.cloud.yandex.net/foundationModels/v1/tokenizeCompletion`

General request structure:

API v1alpha:

{
  "model": "string",
  "text": "string"
}

YandexGPT API v1:

{
  "modelUri": "string",
  "completionOptions": {
    "stream": true,
    "temperature": "number",
    "maxTokens": "integer"
  },
  "messages": [
    {
      "role": "string",
      "text": "string"
    }
  ]
}

Request body fields:

API v1alpha	YandexGPT API v1	Description
model	modelUri	ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere.

Tokenizer.tokenize

You can use this method with any methods other than generation.

API endpoint:

API v1alpha	YandexGPT API v1
`https://llm.api.cloud.yandex.net/llm/v1alpha/tokenize`	`https://llm.api.cloud.yandex.net/foundationModels/v1/tokenize`

The general request structure remains the same, except that `model` is replaced with `modelUri`:

API v1alpha	YandexGPT API v1
`{ "model": "string", "text": "string" }`	`{ "modelUri": "string", "text": "string" }`

Request body fields:

API v1alpha	YandexGPT API v1	Description
model	modelUri	ID of the model to be used to generate the response. The parameter contains the ID of a Yandex Cloud folder or the ID of a model fine-tuned in DataSphere.
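
For the plain tokenization method, the body conversion is a one-field rename, which could be sketched like this. The function name and the constant name are hypothetical; the URL is the new endpoint listed above.

```python
# New endpoint for tokenization outside of generation requests.
TOKENIZE_URL = "https://llm.api.cloud.yandex.net/foundationModels/v1/tokenize"


def tokenize_to_v1(old: dict, model_uri: str) -> dict:
    """Convert a v1alpha tokenize body: `model` is replaced by `modelUri`."""
    return {"modelUri": model_uri, "text": old["text"]}
```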

Vectorization

API endpoint:

API v1alpha	Embeddings API v1
`https://llm.api.cloud.yandex.net/llm/v1alpha/embedding`	`https://llm.api.cloud.yandex.net/foundationModels/v1/textEmbedding`

General request structure:

API v1alpha	Embeddings API v1
`{ "embeddingType": "string", "model": "string", "text": "string" }`	`{ "modelUri": "string", "text": "string" }`

Request body fields:

API v1alpha	Embeddings API v1	Description
model	—	In Embeddings API v1, the `modelUri` parameter defines the model for text vectorization.
`"embeddingType" = "EMBEDDING_TYPE_QUERY"`	`"modelUri" = "emb://<folder_ID>/text-search-query/latest"`	Vectorization of short texts, such as search requests, queries, etc.
`"embeddingType" = "EMBEDDING_TYPE_DOCUMENT"`	`"modelUri" = "emb://<folder_ID>/text-search-doc/latest"`	Vectorization of large source texts, e.g., documentation articles.
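
The table above amounts to replacing `embeddingType` with a model name inside `modelUri`. A minimal sketch of that conversion, with a hypothetical function name and the placeholder folder ID `b1gexample`:

```python
# v1alpha embeddingType values mapped to v1 model names from the table above.
EMBEDDING_TYPE_TO_MODEL = {
    "EMBEDDING_TYPE_QUERY": "text-search-query",
    "EMBEDDING_TYPE_DOCUMENT": "text-search-doc",
}


def embedding_to_v1(old: dict, folder_id: str) -> dict:
    """Convert a v1alpha embedding request body into an Embeddings API v1 body."""
    model = EMBEDDING_TYPE_TO_MODEL[old["embeddingType"]]
    return {
        "modelUri": f"emb://{folder_id}/{model}/latest",
        "text": old["text"],
    }


# "b1gexample" is a placeholder folder ID, not a real one.
old_body = {"embeddingType": "EMBEDDING_TYPE_QUERY", "model": "general", "text": "search phrase"}
print(embedding_to_v1(old_body, "b1gexample"))
```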
