Domain

Written by

Updated at September 25, 2025

class yandex_cloud_ml_sdk._models.completions.function.AsyncCompletions
class yandex_cloud_ml_sdk._models.completions.model.AsyncGPTModel

class yandex_cloud_ml_sdk._models.completions.function.AsyncCompletions

A class for handling completions models.

It defines the core functionality for calling a model to generate completions based on the provided model name and version.

__call__(model_name, *, model_version='latest')

Create a model object to call for generating completions.

This method constructs the URI for the model based on the provided name and version. If the name contains ://, it is treated as a full URI. Otherwise, it looks up the model name in the well-known names dictionary. But after this, in any case, we construct a URI in the form gpt://<folder_id>//.

Parameters

model_name (str) – the name or URI of the model to call.
model_version (str) – the version of the model to use. Defaults to ‘latest’.

Return type

ModelTypeT

Model

class yandex_cloud_ml_sdk._models.completions.model.AsyncGPTModel

A class for GPT models providing various functionalities including tuning, and batch processing.

async run(messages, *, timeout=180)

Executes the model with the provided messages.

Parameters

messages (TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict | Iterable[TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict]) – the input messages to process. Could be a string, a dictionary, or a result object. Read more about other possible message types in the corresponding documentation.
timeout – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

GPTModelResult[AsyncToolCall]

async run_stream(messages, *, timeout=180)

Executes the model with the provided messages and yields partial results as they become available.

Parameters

messages (TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict | Iterable[TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict]) – the input messages to process.
timeout – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

AsyncIterator[GPTModelResult[AsyncToolCall]]

async run_deferred(messages, *, timeout=60)

Initiates a deferred execution of the model with the provided messages.

Parameters

messages (TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict | Iterable[TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict]) – the input messages to process.
timeout – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

AsyncOperation[GPTModelResult[AsyncToolCall]]

async attach_deferred(operation_id, timeout=60)

Attaches to an ongoing deferred operation using its operation id.

Parameters

operation_id (str) – the id of the deferred operation to attach to.
timeout (float) – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

AsyncOperation[GPTModelResult[AsyncToolCall]]

async tokenize(messages, *, timeout=60)

Tokenizes the provided messages into a tuple of tokens.

Parameters

messages (TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict | Iterable[TextMessage | TextMessageDict | TextMessageProtocol | str | FunctionResultMessageDict]) – the input messages to tokenize.
timeout – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

tuple[Token, …]

async tune_deferred(train_datasets, *, validation_datasets=Undefined, name=Undefined, description=Undefined, labels=Undefined, seed=Undefined, lr=Undefined, n_samples=Undefined, additional_arguments=Undefined, tuning_type=Undefined, scheduler=Undefined, optimizer=Undefined, timeout=60)

Initiate a deferred tuning process for the model.

Parameters

train_datasets (str | BaseDataset | tuple[str | BaseDataset, float] | Iterable[str | BaseDataset | tuple[str | BaseDataset, float]] | dict[str | BaseDataset, float]) – the dataset objects and/or dataset ids used for training of the model.
validation_datasets (str | BaseDataset | tuple[str | BaseDataset, float] | Iterable[str | BaseDataset | tuple[str | BaseDataset, float]] | dict[str | BaseDataset, float] | Undefined) – the dataset objects and/or dataset ids used for validation of the model.
name (str | Undefined) – the name of the tuning task.
description (str | Undefined) – the description of the tuning task.
labels (dict[str, str] | Undefined) – labels for the tuning task.
seed (int | Undefined) – a random seed for reproducibility.
lr (float | Undefined) – a learning rate for tuning.
n_samples (int | Undefined) – a number of samples for tuning.
additional_arguments (str | Undefined) – additional arguments for tuning.
tuning_type (BaseTuningType | Undefined) – a type of tuning to be applied.
scheduler (BaseScheduler | Undefined) – a scheduler for tuning.
optimizer (BaseOptimizer | Undefined) – an optimizer for tuning.
timeout (float) – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

AsyncTuningTask[AsyncGPTModel]

async tune(train_datasets, *, validation_datasets=Undefined, name=Undefined, description=Undefined, labels=Undefined, seed=Undefined, lr=Undefined, n_samples=Undefined, additional_arguments=Undefined, tuning_type=Undefined, scheduler=Undefined, optimizer=Undefined, timeout=60, poll_timeout=259200, poll_interval=60)

Tune the model with the specified training datasets and parameters.

Parameters

train_datasets (str | BaseDataset | tuple[str | BaseDataset, float] | Iterable[str | BaseDataset | tuple[str | BaseDataset, float]] | dict[str | BaseDataset, float]) – the dataset objects and/or dataset ids used for training of the model.
validation_datasets (str | BaseDataset | tuple[str | BaseDataset, float] | Iterable[str | BaseDataset | tuple[str | BaseDataset, float]] | dict[str | BaseDataset, float] | Undefined) – the dataset objects and/or dataset ids used for validation of the model.
name (str | Undefined) – the name of the tuning task.
description (str | Undefined) – the description of the tuning task.
labels (dict[str, str] | Undefined) – labels for the tuning task.
seed (int | Undefined) – a random seed for reproducibility.
lr (float | Undefined) – a learning rate for tuning.
n_samples (int | Undefined) – a number of samples for tuning.
additional_arguments (str | Undefined) – additional arguments for tuning.
tuning_type (BaseTuningType | Undefined) – a type of tuning to be applied.
scheduler (BaseScheduler | Undefined) – a scheduler for tuning.
optimizer (BaseOptimizer | Undefined) – an optimizer for tuning.
timeout (float) – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.
poll_timeout (int) – the maximum time to wait while polling for completion of the tuning task. Defaults to 259200 seconds (72 hours).
poll_interval (float) – the interval between polling attempts during the tuning process. Defaults to 60 seconds.

Return type

Self

async attach_tune_deferred(task_id, *, timeout=60)

Attach a deferred tuning task using its task id.

Parameters

task_id (str) – the id of the deferred tuning task to attach to.
timeout (float) – the timeout, or the maximum time to wait for the request to complete in seconds. Defaults to 60 seconds.

Return type

AsyncTuningTask[AsyncGPTModel]

property batch: BatchSubdomainTypeT

property config: ConfigTypeT

configure(*, temperature=Undefined, max_tokens=Undefined, reasoning_mode=Undefined, response_format=Undefined, tools=Undefined, parallel_tool_calls=Undefined, tool_choice=Undefined)

Configures the model with specified parameters.

Parameters

temperature (float | Undefined) – a sampling temperature to use - higher values mean more random results. Should be a double number between 0 (inclusive) and 1 (inclusive).
max_tokens (int | Undefined) – a maximum number of tokens to generate in the response.
reasoning_mode (int | str | ReasoningMode | Undefined) – the mode of reasoning to apply during generation, allowing the model to perform internal reasoning before responding. Read more about possible modes in the reasoning documentation.
response_format (Literal['json'] | ~yandex_cloud_ml_sdk._types.schemas.JsonSchemaResponseType | type | ~yandex_cloud_ml_sdk._types.misc.Undefined) – a format of the response returned by the model. Could be a JsonSchema, a JSON string, or a pydantic model. Read more about possible response formats in the structured output documentation_BaseGPTModel_URL.
tools (Sequence[FunctionTool] | FunctionTool | Undefined) – tools to use for completion. Can be a sequence or a single tool.
parallel_tool_calls (bool | Undefined) – whether to allow parallel calls to tools during completion. Defaults to true.
tool_choice (Literal['none', 'None', 'NONE', 'auto', 'Auto', 'AUTO', 'required', 'Required', 'REQUIRED'] | ~yandex_cloud_ml_sdk._types.tools.function.FunctionDictType | ~yandex_cloud_ml_sdk._tools.tool.FunctionTool | ~yandex_cloud_ml_sdk._types.misc.Undefined) – the strategy for choosing tools. There are several ways to configure tool_choice for query processing: - no tools to call (tool_choice='none'); - required to call any tool (tool_choice='required'); - call a specific tool (tool_choice={'type': 'function', 'function': {'name': 'another_calculator'}} or directly passing a tool object).

Returns

new model instance with provided configuration.

Return type

Self

langchain(model_type='chat', timeout=60)

Initializes a langchain model based on the specified model type.

Parameters

model_type (Literal['chat']) – the type of langchain model to initialize. Defaults to "chat".
timeout (int) – the timeout which sets the default for the langchain model object. Defaults to 60 seconds.

Return type

BaseYandexLanguageModel

property uri: str

Domain

class yandexcloudmlsdk.models.completions.function.AsyncCompletionsclass yandex_cloud_ml_sdk._models.completions.function.AsyncCompletions

Model

class yandexcloudmlsdk.models.completions.model.AsyncGPTModelclass yandex_cloud_ml_sdk._models.completions.model.AsyncGPTModel

Was the article helpful?

class yandex_cloud_ml_sdk._models.completions.function.AsyncCompletions

class yandex_cloud_ml_sdk._models.completions.model.AsyncGPTModel