REST: Create chat completion
- HTTP request
- Body parameters
- ResponseFormatText
- ResponseFormatJsonSchema
- JsonSchema
- ResponseFormatJsonObject
- PredictionContent
- ChatCompletionRequestMessageContentPartText
- ChatCompletionStreamOptions0
- ChatCompletionAllowedToolsChoice
- ChatCompletionAllowedTools
- ChatCompletionNamedToolChoice
- Function
- ChatCompletionNamedToolChoiceCustom
- Custom
- ChatCompletionFunctionCallOption
- ChatCompletionFunctions
- Response
- ChoicesItem
- ChatCompletionResponseMessage
- AnnotationsItem
- UrlCitation
- Audio0
- Logprobs0
- ChatCompletionTokenLogprob
- TopLogprobsItem
- CompletionUsage
- CompletionTokensDetails
- PromptTokensDetails
- ChoicesItem
- ChatCompletionStreamResponseDelta
- ChatCompletionMessageToolCallChunk
- Function
- Logprobs
- ChatCompletionTokenLogprob
- TopLogprobsItem
- CompletionUsage
- CompletionTokensDetails
- PromptTokensDetails
Starting a new project? We recommend trying Responses.
Parameter support can differ depending on the model used to generate the
response, particularly for newer reasoning models. Parameters that are only
supported for reasoning models are noted below.
HTTP request
POST https://ai.api.cloud.yandex.net/v1/chat/completions
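A minimal request against this endpoint can be sketched as follows. The URL comes from this reference; the model name, the Bearer-token auth scheme, and the API_KEY environment variable are assumptions — substitute whatever your deployment actually uses.

```python
import json
import os
import urllib.request

URL = "https://ai.api.cloud.yandex.net/v1/chat/completions"

def build_payload(user_text: str) -> dict:
    """Builds the JSON body for a basic, non-streaming completion."""
    return {
        "model": "example-model",  # hypothetical model ID
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "max_completion_tokens": 256,
    }

def send(payload: dict, api_key: str) -> dict:
    """POSTs the payload and returns the decoded JSON response."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Only hit the network when a key is actually configured.
key = os.environ.get("API_KEY")
if key:
    print(send(build_payload("Say hello."), key)["choices"][0]["message"]["content"])
```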
Body parameters
Request schema: application/json
{
"<allOf>": [
"unknown",
{
"messages": [
{
"<anyOf>": [
{
"content": "unknown",
"role": "string",
"name": "string"
},
{
"content": "unknown",
"role": "string",
"name": "string"
},
{
"content": "unknown",
"role": "string",
"name": "string"
},
{
"content": "unknown",
"refusal": "unknown",
"role": "string",
"name": "string",
"audio": "unknown",
"tool_calls": [
{
"<anyOf>": [
{
"id": "string",
"type": "string",
"function": {
"name": "string",
"arguments": "string"
}
},
{
"id": "string",
"type": "string",
"custom": {
"name": "string",
"input": "string"
}
}
]
}
],
"function_call": "unknown"
},
{
"role": "string",
"content": "unknown",
"tool_call_id": "string"
},
{
"role": "string",
"content": "unknown",
"name": "string"
}
]
}
],
"model": "unknown",
"modalities": "unknown",
"verbosity": "unknown",
"reasoning_effort": "unknown",
"max_completion_tokens": "integer",
"frequency_penalty": "number",
"presence_penalty": "number",
"web_search_options": {},
"top_logprobs": "integer",
"response_format": "unknown",
"audio": {},
"store": "boolean",
"stream": "boolean",
"stop": "unknown",
"logit_bias": {
"string": "integer"
},
"logprobs": "boolean",
"max_tokens": "integer",
"n": "integer",
"prediction": "unknown",
"seed": "integer",
"stream_options": "unknown",
"tools": [
{
"<anyOf>": [
{
"type": "string",
"function": {
"description": "string",
"name": "string",
"parameters": "object",
"strict": "unknown"
}
},
{
"type": "string",
"custom": {
"name": "string",
"description": "string",
"format": "unknown"
}
}
]
}
],
"tool_choice": "unknown",
"parallel_tool_calls": "boolean",
"function_call": "unknown",
"functions": [
{
"description": "string",
"name": "string",
"parameters": "object"
}
]
}
]
}
|
Field |
Description |
|
messages[] |
unknown Required field. |
|
model |
Any of string | enum |
|
modalities |
Any of enum | null |
|
verbosity |
Any of enum | null |
|
reasoning_effort |
Any of enum | null |
|
max_completion_tokens |
integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. |
|
frequency_penalty |
number NOT SUPPORTED BY ALL MODELS. Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. |
|
presence_penalty |
number NOT SUPPORTED BY ALL MODELS. Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
|
web_search_options |
[CURRENTLY NOT SUPPORTED] |
|
top_logprobs |
integer NOT SUPPORTED BY ALL MODELS. An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. |
|
response_format |
Any of ResponseFormatText | ResponseFormatJsonSchema | ResponseFormatJsonObject |
|
audio |
[CURRENTLY NOT SUPPORTED]. |
|
store |
boolean [CURRENTLY NOT SUPPORTED] |
|
stream |
boolean If set to true, the model response data will be streamed to the client as it is generated, using server-sent events. |
|
stop |
unknown [CURRENTLY NOT SUPPORTED] |
|
logit_bias |
object (map<string, integer>) |
|
logprobs |
boolean NOT SUPPORTED BY ALL MODELS. Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. |
|
max_tokens |
integer The maximum number of tokens that can be generated in the chat completion. This value is now deprecated in favor of max_completion_tokens. |
|
n |
integer NOT SUPPORTED BY ALL MODELS. How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. |
|
prediction |
Any of PredictionContent |
|
seed |
integer CURRENTLY NOT SUPPORTED. |
|
stream_options |
Any of ChatCompletionStreamOptions0 | null |
|
tools[] |
unknown |
|
tool_choice |
Any of enum | ChatCompletionAllowedToolsChoice | ChatCompletionNamedToolChoice | ChatCompletionNamedToolChoiceCustom |
|
parallel_tool_calls |
boolean NOT SUPPORTED BY ALL MODELS. Whether to enable parallel function calling during tool use. |
|
function_call |
Any of enum | ChatCompletionFunctionCallOption |
|
functions[] |
Required field. |
ResponseFormatText
Default response format. Used to generate text responses.
|
Field |
Description |
|
type |
enum Required field. The type of response format being defined. Always text.
|
ResponseFormatJsonSchema
JSON Schema response format. Used to generate structured JSON responses.
|
Field |
Description |
|
type |
enum Required field. The type of response format being defined. Always json_schema.
|
|
json_schema |
Required field. Structured Outputs configuration options, including a JSON Schema. |
JsonSchema
Structured Outputs configuration options, including a JSON Schema.
|
Field |
Description |
|
description |
string A description of what the response format is for, used by the model to determine how to respond in the format. |
|
name |
string Required field. The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. |
|
schema |
object The schema for the response format, described as a JSON Schema object. |
|
strict |
Any of boolean | null |
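Putting the JsonSchema fields above together, a response_format value for Structured Outputs might look like this sketch; the city_info schema is purely illustrative.

```python
# A hypothetical response_format for Structured Outputs, following the
# JsonSchema fields documented above.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",  # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
        "description": "Basic facts about a city.",
        "strict": True,
        "schema": {  # a plain JSON Schema object
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["name", "population"],
            "additionalProperties": False,
        },
    },
}
```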
ResponseFormatJsonObject
JSON object response format. An older method of generating JSON responses.
Using json_schema is recommended for models that support it. Note that the
model will not generate JSON without a system or user message instructing it
to do so.
|
Field |
Description |
|
type |
enum Required field. The type of response format being defined. Always json_object.
|
PredictionContent
Static predicted output content, such as the content of a text file that is
being regenerated.
|
Field |
Description |
|
type |
enum Required field. The type of the predicted content you want to provide. This type is currently always content.
|
|
content |
Any of string | ChatCompletionRequestMessageContentPartText |
ChatCompletionRequestMessageContentPartText
Text input
|
Field |
Description |
|
type |
enum Required field. The type of the content part.
|
|
text |
string Required field. The text content. |
ChatCompletionStreamOptions0
Options for streaming response. Only set this when you set stream: true.
|
Field |
Description |
|
include_usage |
boolean [CURRENTLY NOT SUPPORTED] |
|
include_obfuscation |
boolean [CURRENTLY NOT SUPPORTED] |
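When stream is true, the response arrives as a sequence of chunks (the streamed-chunk schema later in this reference). This sketch assumes the usual server-sent-events framing of OpenAI-compatible APIs — "data: {json}" lines ending with "data: [DONE]" — which this document does not itself confirm.

```python
import json

def iter_stream_content(lines):
    """Yields content fragments from SSE lines of a streamed completion.

    Assumes OpenAI-style framing: each event is a line 'data: {json chunk}',
    terminated by 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]

# Two hand-written chunks, not real model output:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}, "index": 0}]}',
    'data: {"choices": [{"delta": {"content": "lo"}, "index": 0}]}',
    "data: [DONE]",
]
```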
ChatCompletionAllowedToolsChoice
Constrains the tools available to the model to a pre-defined set.
|
Field |
Description |
|
type |
enum Required field. Allowed tool configuration type. Always allowed_tools.
|
|
allowed_tools |
Required field. Constrains the tools available to the model to a pre-defined set. |
ChatCompletionAllowedTools
Constrains the tools available to the model to a pre-defined set.
|
Field |
Description |
|
mode |
enum Required field. Constrains the tools available to the model to a pre-defined set. auto allows the model to pick from the allowed tools or generate a message; required requires the model to call one of the allowed tools.
|
|
tools[] |
object Required field. A tool definition that the model should be allowed to call. |
ChatCompletionNamedToolChoice
Specifies a tool the model should use. Use to force the model to call a specific function.
|
Field |
Description |
|
type |
enum Required field. For function calling, the type is always function.
|
|
function |
Required field. |
Function
|
Field |
Description |
|
name |
string Required field. The name of the function to call. |
ChatCompletionNamedToolChoiceCustom
Specifies a tool the model should use. Use to force the model to call a specific custom tool.
|
Field |
Description |
|
type |
enum Required field. For custom tool calling, the type is always custom.
|
|
custom |
Required field. |
Custom
|
Field |
Description |
|
name |
string Required field. The name of the custom tool to call. |
ChatCompletionFunctionCallOption
Specifying a particular function via {"name": "my_function"} forces the model to call that function.
|
Field |
Description |
|
name |
string Required field. The name of the function to call. |
ChatCompletionFunctions
|
Field |
Description |
|
description |
string A description of what the function does, used by the model to choose when and how to call the function. |
|
name |
string Required field. The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. |
|
parameters |
object The parameters the function accepts, described as a JSON Schema object. |
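The tools[] and tool_choice request fields above can be combined as in this sketch; the get_weather function is hypothetical.

```python
# One entry for the tools[] array (the function variant from the request
# schema above). Everything about get_weather is illustrative.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Returns the current weather for a city.",
        "parameters": {  # a JSON Schema object
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name."},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

# Forcing the model to call that specific function (ChatCompletionNamedToolChoice):
tool_choice = {"type": "function", "function": {"name": "get_weather"}}
```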
Response
HTTP Code: 200
OK
{
"id": "string",
"choices": [
{
"finish_reason": "string",
"index": "integer",
"message": {
"content": "unknown",
"refusal": "unknown",
"tool_calls": [
{
"<anyOf>": [
{
"id": "string",
"type": "string",
"function": {
"name": "string",
"arguments": "string"
}
},
{
"id": "string",
"type": "string",
"custom": {
"name": "string",
"input": "string"
}
}
]
}
],
"annotations": [
{
"type": "string",
"url_citation": {
"end_index": "integer",
"start_index": "integer",
"url": "string",
"title": "string"
}
}
],
"role": "string",
"function_call": "unknown",
"audio": "unknown"
},
"logprobs": "unknown"
}
],
"created": "integer",
"model": "string",
"service_tier": "unknown",
"system_fingerprint": "string",
"object": "string",
"usage": {
"completion_tokens": "integer",
"prompt_tokens": "integer",
"total_tokens": "integer",
"completion_tokens_details": {
"accepted_prediction_tokens": "integer",
"audio_tokens": "integer",
"reasoning_tokens": "integer",
"rejected_prediction_tokens": "integer"
},
"prompt_tokens_details": {
"audio_tokens": "integer",
"cached_tokens": "integer"
}
}
}
Represents a chat completion response returned by the model, based on the provided input.
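A 200 body shaped like the JSON above can be consumed as follows; the literal is a trimmed, hand-written example, not real model output.

```python
import json

# A hand-written, trimmed example of a 200 response body.
raw = """{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "example-model",
  "choices": [{
    "index": 0,
    "finish_reason": "stop",
    "message": {"role": "assistant", "content": "Hello!", "refusal": null}
  }],
  "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12}
}"""

resp = json.loads(raw)
answer = resp["choices"][0]["message"]["content"]   # the generated text
spent = resp["usage"]["total_tokens"]               # prompt + completion tokens
```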
|
Field |
Description |
|
id |
string Required field. A unique identifier for the chat completion. |
|
choices[] |
Required field. |
|
created |
integer Required field. The Unix timestamp (in seconds) of when the chat completion was created. |
|
model |
string Required field. The model used for the chat completion. |
|
service_tier |
unknown [CURRENTLY NOT SUPPORTED] |
|
system_fingerprint |
string This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism. |
|
object |
enum Required field. The object type, which is always chat.completion.
|
|
usage |
Required field. Usage statistics for the completion request. |
ChoicesItem
|
Field |
Description |
|
finish_reason |
enum Required field. The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens was reached, content_filter if content was omitted due to a content filter, tool_calls if the model called a tool, or function_call (deprecated) if the model called a function.
|
|
index |
integer Required field. The index of the choice in the list of choices. |
|
message |
Required field. A chat completion message generated by the model. |
|
logprobs |
Any of Logprobs0 | null |
ChatCompletionResponseMessage
A chat completion message generated by the model.
|
Field |
Description |
|
content |
Any of string | null |
|
refusal |
Any of string | null |
|
tool_calls[] |
unknown |
|
annotations[] |
Required field. A URL citation when using web search. |
|
role |
enum Required field. The role of the author of this message.
|
|
function_call |
unknown DEPRECATED. This field is deprecated and will be removed in a future version; use tool_calls instead. The name and arguments of a function that should be called, as generated by the model.
|
|
audio |
Any of Audio0 | null |
AnnotationsItem
A URL citation when using web search.
|
Field |
Description |
|
type |
enum Required field. The type of the URL citation. Always url_citation.
|
|
url_citation |
Required field. A URL citation when using web search. |
UrlCitation
A URL citation when using web search.
|
Field |
Description |
|
end_index |
integer Required field. The index of the last character of the URL citation in the message. |
|
start_index |
integer Required field. The index of the first character of the URL citation in the message. |
|
url |
string Required field. The URL of the web resource. |
|
title |
string Required field. The title of the web resource. |
Audio0
If the audio output modality is requested, this object contains data about the audio response from the model.
|
Field |
Description |
|
id |
string Required field. Unique identifier for this audio response. |
|
expires_at |
integer Required field. The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations. |
|
data |
string Required field. Base64 encoded audio bytes generated by the model, in the format specified in the request. |
|
transcript |
string Required field. Transcript of the audio generated by the model. |
Logprobs0
Log probability information for the choice.
|
Field |
Description |
|
content |
Any of ChatCompletionTokenLogprob | null |
|
refusal |
Any of ChatCompletionTokenLogprob | null |
ChatCompletionTokenLogprob
|
Field |
Description |
|
token |
string Required field. The token. |
|
logprob |
number Required field. The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely. |
|
bytes |
Any of integer | null |
|
top_logprobs[] |
Required field. |
TopLogprobsItem
|
Field |
Description |
|
token |
string Required field. The token. |
|
logprob |
number Required field. The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely. |
|
bytes |
Any of integer | null |
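The logprob values above are natural-log probabilities, so exponentiating recovers the probability itself; the -9999.0 sentinel maps to effectively zero.

```python
import math

def to_probability(logprob: float) -> float:
    """Converts a natural-log probability back into a probability in [0, 1]."""
    return math.exp(logprob)
```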
CompletionUsage
Usage statistics for the completion request.
|
Field |
Description |
|
completion_tokens |
integer Required field. Number of tokens in the generated completion. |
|
prompt_tokens |
integer Required field. Number of tokens in the prompt. |
|
total_tokens |
integer Required field. Total number of tokens used in the request (prompt + completion). |
|
completion_tokens_details |
Breakdown of tokens used in a completion. |
|
prompt_tokens_details |
Breakdown of tokens used in the prompt. |
CompletionTokensDetails
Breakdown of tokens used in a completion.
|
Field |
Description |
|
accepted_prediction_tokens |
integer When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. |
|
audio_tokens |
integer Audio input tokens generated by the model. |
|
reasoning_tokens |
integer Tokens generated by the model for reasoning. |
|
rejected_prediction_tokens |
integer When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. These tokens are still counted in the total completion tokens for billing. |
PromptTokensDetails
Breakdown of tokens used in the prompt.
|
Field |
Description |
|
audio_tokens |
integer Audio input tokens present in the prompt. |
|
cached_tokens |
integer Cached tokens present in the prompt. |
Represents a streamed chunk of a chat completion response returned
by the model, based on the provided input.
|
Field |
Description |
|
id |
string Required field. A unique identifier for the chat completion. Each chunk has the same ID. |
|
choices[] |
Required field. |
|
created |
integer Required field. The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp. |
|
model |
string Required field. The model used to generate the completion. |
|
service_tier |
unknown [CURRENTLY NOT SUPPORTED] |
|
system_fingerprint |
string This fingerprint represents the backend configuration that the model runs with. |
|
object |
enum Required field. The object type, which is always chat.completion.chunk.
|
|
usage |
Required field. Usage statistics for the completion request. |
ChoicesItem
|
Field |
Description |
|
delta |
ChatCompletionStreamResponseDelta Required field. A chat completion delta generated by streamed model responses. |
|
logprobs |
Required field. Log probability information for the choice. |
|
finish_reason |
enum Required field. The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens was reached, content_filter if content was omitted due to a content filter, tool_calls if the model called a tool, or null if generation is not yet complete.
|
|
index |
integer Required field. The index of the choice in the list of choices. |
ChatCompletionStreamResponseDelta
A chat completion delta generated by streamed model responses.
|
Field |
Description |
|
content |
Any of string | null |
|
function_call |
unknown DEPRECATED. This field is deprecated and will be removed in a future version; use tool_calls instead. The name and arguments of a function that should be called, as generated by the model.
|
|
tool_calls[] |
ChatCompletionMessageToolCallChunk Required field. |
|
role |
enum The role of the author of this message.
|
|
refusal |
Any of string | null |
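Streamed tool calls arrive in the delta as partial ChatCompletionMessageToolCallChunk objects keyed by index, with the arguments string split across chunks. A sketch of stitching them back together, assuming the chunk shape documented below:

```python
def merge_tool_call_chunks(chunks):
    """Merges partial tool-call chunks (dicts with 'index', optional 'id' and
    'function' fields) into complete calls, concatenating argument fragments."""
    calls = {}
    for chunk in chunks:
        entry = calls.setdefault(
            chunk["index"], {"id": None, "name": None, "arguments": ""}
        )
        if chunk.get("id"):
            entry["id"] = chunk["id"]
        fn = chunk.get("function", {})
        if fn.get("name"):
            entry["name"] = fn["name"]
        entry["arguments"] += fn.get("arguments", "")
    return [calls[i] for i in sorted(calls)]
```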
ChatCompletionMessageToolCallChunk
|
Field |
Description |
|
index |
integer Required field. |
|
id |
string The ID of the tool call. |
|
type |
enum The type of the tool. Currently, only function is supported.
|
|
function |
Function
|
Field |
Description |
|
name |
string The name of the function to call. |
|
arguments |
string The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. |
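Since the arguments string may be invalid JSON or contain parameters your schema never defined, it pays to parse defensively before dispatching; a sketch:

```python
import json

def parse_arguments(raw, allowed):
    """Parses a tool-call arguments string, returning None on invalid JSON
    and dropping any keys outside the allowed set."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(args, dict):
        return None
    # Drop hallucinated parameters instead of passing them through.
    return {k: v for k, v in args.items() if k in allowed}
```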
Logprobs
Log probability information for the choice.
|
Field |
Description |
|
content[] |
Required field. |
|
refusal[] |
Required field. |
ChatCompletionTokenLogprob
|
Field |
Description |
|
token |
string Required field. The token. |
|
logprob |
number Required field. The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely. |
|
bytes |
Any of integer | null |
|
top_logprobs[] |
Required field. |
TopLogprobsItem
|
Field |
Description |
|
token |
string Required field. The token. |
|
logprob |
number Required field. The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely. |
|
bytes |
Any of integer | null |
CompletionUsage
Usage statistics for the completion request.
|
Field |
Description |
|
completion_tokens |
integer Required field. Number of tokens in the generated completion. |
|
prompt_tokens |
integer Required field. Number of tokens in the prompt. |
|
total_tokens |
integer Required field. Total number of tokens used in the request (prompt + completion). |
|
completion_tokens_details |
Breakdown of tokens used in a completion. |
|
prompt_tokens_details |
Breakdown of tokens used in the prompt. |
CompletionTokensDetails
Breakdown of tokens used in a completion.
|
Field |
Description |
|
accepted_prediction_tokens |
integer When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. |
|
audio_tokens |
integer Audio input tokens generated by the model. |
|
reasoning_tokens |
integer Tokens generated by the model for reasoning. |
|
rejected_prediction_tokens |
integer When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. These tokens are still counted in the total completion tokens for billing. |
PromptTokensDetails
Breakdown of tokens used in the prompt.
|
Field |
Description |
|
audio_tokens |
integer Audio input tokens present in the prompt. |
|
cached_tokens |
integer Cached tokens present in the prompt. |