Running a model in batch mode
Getting started
You can start working from the management console right away.
-
Create a service account and assign the
ai.editor
role to it. -
Get the service account API key and save it.
The following examples use API key authentication. Yandex Cloud ML SDK also supports IAM token and OAuth token authentication. For more information, see Authentication in Yandex Cloud ML SDK.
-
Use the pip
package manager to install the ML SDK library:pip install yandex-cloud-ml-sdk
Prepare data
- Prepare data to run the model. Depending on your task and model, it can be
TextTextToTextGenerationRequest
for text generation orImageTextToTextGenerationRequest
for vision language models. - Create a dataset in any convenient way. You can also create a dataset later when running the model.
Run the model
- In the management console
, select the folder for which your account has theai.playground.user
andai.datasets.editor
roles or higher. - In the list of services, select Foundation Models.
- In the left-hand panel, click
and select Batch jobs. - Click Run.
- Select a model to run.
- Add a dataset: select an existing one or upload a new file.
- Set the model temperature.
- Click Run.
-
Create a file named
batch-run.py
and add the following code to it:#!/usr/bin/env python3 from __future__ import annotations import pathlib from yandex_cloud_ml_sdk import YCloudML PATH = pathlib.Path(__file__) NAME = f'example-{PATH.parent.name}-{PATH.name}' def local_path(path: str) -> pathlib.Path: return pathlib.Path(__file__).parent / path def main() -> None: sdk = YCloudML( folder_id="<folder_ID>", auth="<API_key>", ) sdk.setup_default_logging() model = sdk.models.completions('<model_URI>') # The batch run will return an _Operations_ object # You can monitor its status or call the .wait method operation = model.batch.run_deferred("<dataset_ID>") resulting_dataset = operation.wait() # A dataset with results will return in Parquet format try: import pyarrow print('Resulting dataset lines:') for line in resulting_dataset.read(): print(line) except ImportError: print('skipping dataset read; install yandex-cloud-ml-sdk[datasets] to be able to read') if __name__ == '__main__': main()
Where:
<folder_ID>
: ID of the folder the service account was created in.<API_key>
: Service account API key you got earlier required for authentication in the API.
The following examples use API key authentication. Yandex Cloud ML SDK also supports IAM token and OAuth token authentication. For more information, see Authentication in Yandex Cloud ML SDK.
<model_URI>
: ID of the model to run. Text generation and vision language models are supported.<dataset_ID>
: ID of the dataset with requests to the model.
-
Run the created file:
python3 batch-run.py
Tip
The model runtime in batch mode depends on the dataset size and may take several days. You can track the current status in the management console