Using fine-tuned classifiers based on YandexGPT
To send a request to a classifier fine-tuned in DataSphere, use the classify method of the Text Classification API or Yandex Cloud ML SDK.
Getting started
To use the examples:
- Create a service account and assign the ai.languageModels.user role to it.
- Get the service account API key and save it.
  The following examples use API key authentication. Yandex Cloud ML SDK also supports IAM token and OAuth token authentication. For more information, see Authentication in Yandex Cloud ML SDK.
- Use the pip package manager to install the ML SDK library:
    pip install yandex-cloud-ml-sdk
- Get API authentication credentials as described in Authentication with the Yandex Foundation Models API.
- To use the examples, install cURL.
Send a request to the classifier
To send a request to the classifier:
- Create a file named classify.py and paste the following code into it:

    #!/usr/bin/env python3

    from __future__ import annotations

    from yandex_cloud_ml_sdk import YCloudML

    request_text = "Vieta's formulas"


    def main():
        sdk = YCloudML(
            folder_id="<folder_ID>",
            auth="<API_key>",
        )

        model = sdk.models.text_classifiers(
            "cls://<folder_ID>/<classifier_ID>"
        )

        # The result will contain predictions within the predefined classes;
        # the highest-confidence prediction will be "mathematics": 0.92
        result = model.run(request_text)

        for prediction in result:
            print(prediction)


    if __name__ == "__main__":
        main()
  Where:
  - request_text: Message text. The total number of tokens per request must not exceed 8,000.
    As input data for a request, Yandex Cloud ML SDK can accept a string, a dictionary, an object of the TextMessage class, or an array containing any combination of these data types. For more information, see Yandex Cloud ML SDK usage.
  - <folder_ID>: ID of the folder in which the service account was created.
  - <API_key>: Service account API key you got earlier; it is required for API authentication.
  - model: ID of the model that will be used to classify the message. This parameter contains the Yandex Cloud folder ID and the ID of the model fine-tuned in DataSphere.
  The class names the model chooses from are defined during model tuning, so they are not provided in the request.
- Run the created file:
    python3 classify.py
  In response, the service will return the classification results with confidence values, i.e., the probability of the request text belonging to each class.
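The classify.py example hardcodes the folder ID and API key. As a minimal sketch, the values could instead be read from environment variables and the model URI assembled from its parts. The variable names YC_FOLDER_ID, YC_API_KEY, and YC_CLASSIFIER_ID and both helper functions are hypothetical and not part of the ML SDK; only the cls://<folder_ID>/<classifier_ID> format comes from the example above.

```python
import os


def classifier_uri(folder_id: str, classifier_id: str) -> str:
    """Build a model URI in the cls://<folder_ID>/<classifier_ID> format."""
    if not folder_id or not classifier_id:
        raise ValueError("folder_id and classifier_id must be non-empty")
    return f"cls://{folder_id}/{classifier_id}"


def load_settings(env=None) -> dict:
    """Read connection settings from environment variables.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    names = ("YC_FOLDER_ID", "YC_API_KEY", "YC_CLASSIFIER_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "folder_id": env["YC_FOLDER_ID"],
        "auth": env["YC_API_KEY"],
        "model_uri": classifier_uri(env["YC_FOLDER_ID"], env["YC_CLASSIFIER_ID"]),
    }


# Possible use inside classify.py (same SDK calls as in the example above):
#   settings = load_settings()
#   sdk = YCloudML(folder_id=settings["folder_id"], auth=settings["auth"])
#   model = sdk.models.text_classifiers(settings["model_uri"])
```

Keeping the key out of the source file makes it harder to commit it to version control by accident.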
The example below is intended to run on macOS and Linux. To run it on Windows, see how to work with Bash in Microsoft Windows.
- Create a file with the request body, e.g., body.json:
    {
      "modelUri": "cls://<folder_ID>/<classifier_ID>",
      "text": "<prompt_text>"
    }
  Where:
  - modelUri: ID of the model that will be used to classify the message. This parameter contains the Yandex Cloud folder ID and the ID of the model fine-tuned in DataSphere.
  - text: Message text. The total number of tokens per request must not exceed 8,000.
  The class names the model chooses from are defined during model tuning, so they are not provided in the request.
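If the message text contains quotes or non-ASCII characters, writing body.json by hand is error-prone. The following is a minimal sketch that generates the file with Python's standard json module. The helper names build_body and write_body are hypothetical; the modelUri and text fields are the ones described above.

```python
import json
from pathlib import Path


def build_body(folder_id: str, classifier_id: str, text: str) -> dict:
    """Assemble the request body with the two fields described above."""
    return {
        "modelUri": f"cls://{folder_id}/{classifier_id}",
        "text": text,
    }


def write_body(path: str, body: dict) -> None:
    """Serialize the body as UTF-8 JSON; json.dumps handles the escaping."""
    Path(path).write_text(
        json.dumps(body, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

For example, write_body("body.json", build_body("<folder_ID>", "<classifier_ID>", "Vieta's formulas")) would produce a file that can be passed to cURL.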
- Send a request to the classifier by running the following command:
    export IAM_TOKEN=<IAM_token>
    curl \
      --request POST \
      --header "Authorization: Bearer ${IAM_TOKEN}" \
      --data "@<path_to_request_body_file>" \
      "https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification"
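The same request can be sketched in Python with only the standard library, which may be convenient where cURL is unavailable. The build_request and classify functions are hypothetical helpers, and the Content-Type header is an assumption that does not appear in the cURL command; the endpoint URL and the Bearer authorization scheme are taken from that command.

```python
import json
import urllib.request

# Endpoint copied from the cURL command above.
ENDPOINT = "https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification"


def build_request(iam_token: str, body: dict) -> urllib.request.Request:
    """Prepare a POST request carrying the JSON body and the IAM token."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {iam_token}",
            # Assumption: the cURL example does not set this header.
            "Content-Type": "application/json",
        },
    )


def classify(iam_token: str, body: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(build_request(iam_token, body), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```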
  Note
  The https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification endpoint only works with fine-tuned classifiers. For prompt-based classifiers, use https://llm.api.cloud.yandex.net/foundationModels/v1/fewShotTextClassification.

  In response, the service will return the classification results with confidence values, i.e., the probability of the request text belonging to each class:
    {
      "predictions": [
        {
          "label": "<class_1_name>",
          "confidence": 0.00010150671005249023
        },
        {
          "label": "<class_2_name>",
          "confidence": 0.000008225440979003906
        },
        ...
        {
          "label": "<class_n_name>",
          "confidence": 0.93212890625
        }
      ],
      "modelVersion": "<model_version>"
    }
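To act on the response, a client typically needs the label with the highest confidence. A minimal sketch, assuming the response has already been parsed into a dictionary with the predictions structure shown above; the top_prediction helper and the class names in the usage note are hypothetical.

```python
from __future__ import annotations


def top_prediction(response: dict) -> tuple[str, float]:
    """Return the (label, confidence) pair with the highest confidence."""
    predictions = response.get("predictions", [])
    if not predictions:
        raise ValueError("response contains no predictions")
    best = max(predictions, key=lambda p: p["confidence"])
    return best["label"], best["confidence"]
```

For a response whose predictions are "physics" with confidence 0.0001 and "mathematics" with confidence 0.93, the function returns ("mathematics", 0.93).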
In multi-class classification, the sum of the confidence values for all classes is always 1.
In multi-label classification, the confidence value for each class is calculated independently (the sum of the values is not equal to 1).
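The difference affects how results are consumed: in the multi-class case one label is picked, while in the multi-label case every label is judged on its own. The sketch below illustrates both checks on a list of predictions. The function names, the tolerance, and the 0.5 threshold are assumptions chosen for illustration, not values prescribed by the service.

```python
from __future__ import annotations

import math


def sums_to_one(predictions: list[dict], tol: float = 1e-3) -> bool:
    """Check whether confidences form a distribution (the multi-class case)."""
    total = sum(p["confidence"] for p in predictions)
    return math.isclose(total, 1.0, abs_tol=tol)


def labels_above(predictions: list[dict], threshold: float = 0.5) -> list[str]:
    """Select every label whose independent confidence reaches the threshold
    (one possible way to read multi-label results; the threshold is an assumption)."""
    return [p["label"] for p in predictions if p["confidence"] >= threshold]
```

A suitable threshold depends on the task and is usually chosen on validation data.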
See also
- Classifiers based on YandexGPT
- Examples of working with ML SDK on GitHub