Classifiers based on YandexGPT
The YandexGPT-based classifier functionality is at the Preview stage.
Yandex Foundation Models allows you to classify text requests provided in prompts. Classification in YandexGPT-based models is implemented in the Foundation Models Text Classification API.
There are three types of classification available in Foundation Models:
- Binary classification assigns a request to one of two classes, e.g., spam or non-spam.
- Multi-class classification puts a request into one (and only one) of more than two classes. For example, a computer CPU can belong to one generation only.
- Multi-label classification allows you to assign a request to multiple classes that are not mutually exclusive. For example, a single social media post may have multiple hashtags at the same time.
Classification models are only available in synchronous mode.
Foundation Models provides classifiers of two types: prompt-based classifiers that run on YandexGPT Lite and YandexGPT Pro, and trainable classifiers that run on YandexGPT Pro.
Prompt-based classifiers
Foundation Models prompt-based classifiers support binary and multi-class classification, require no model tuning, and are prompt-controlled. The fewShotClassify Text Classification API method enables using these two prompt-based classifiers: Zero-shot and Few-shot. You can provide between 2 and 20 classes to the fewShotClassify method.
Tip
Give meaningful names to label classes: this is essential for correct classification results. For example, use chemistry and physics rather than chm and phs for class names.
Zero-shot classifier
The Zero-shot classifier allows you to perform binary and multi-class classification by providing only the model ID, task description, request text, and an array of class names in the request body.
Request body format for the Zero-shot classifier:
{
  "modelUri": "string",
  "taskDescription": "string",
  "labels": [
    "string",
    "string",
    ...
    "string"
  ],
  "text": "string"
}
Where:
- modelUri: ID of the model that will be used to classify the message. This parameter contains the Yandex Cloud folder ID.
- taskDescription: Text description of the task for the classifier.
- labels: Array of classes. Give meaningful names to label classes: this is essential for correct classification results. For example, use chemistry and physics rather than chm and phs for class names.
- text: Message text.
Use the https://llm.api.cloud.yandex.net/foundationModels/v1/fewShotTextClassification endpoint for requests to Zero-shot classifiers.
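As an illustration of the Zero-shot request body described above, here is a minimal Python sketch. The build_zero_shot_body helper, the example task description, and the `<model_URI>` placeholder are assumptions made for this example and are not part of the API. The sketch only assembles the body and enforces the 2 to 20 class limit locally; sending the request and authenticating are not covered here.

```python
import json


def build_zero_shot_body(model_uri, task_description, labels, text):
    """Return a Zero-shot request body as a dict (hypothetical helper)."""
    labels = list(labels)
    # The fewShotClassify method accepts between 2 and 20 classes.
    if not 2 <= len(labels) <= 20:
        raise ValueError("labels must contain between 2 and 20 class names")
    return {
        "modelUri": model_uri,
        "taskDescription": task_description,
        "labels": labels,
        "text": text,
    }


body = build_zero_shot_body(
    model_uri="<model_URI>",  # placeholder: substitute your own model URI
    task_description="Determine the school subject the question belongs to",
    labels=["chemistry", "physics"],  # meaningful class names, as the tip advises
    text="What is the boiling point of ethanol?",
)
print(json.dumps(body, indent=2))
```

Checking the class count before sending is a local convenience only; the service remains the authority on what it accepts.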
Few-shot classifier
The Few-shot classifier enables binary and multi-class classification by providing the model with an array of sample requests for the classes specified in the labels field. You need to list these sample requests in the samples field of the request body to get more accurate results from the classifier.
Request body format for the Few-shot classifier:
{
  "modelUri": "string",
  "taskDescription": "string",
  "labels": [
    "string",
    "string",
    ...
    "string"
  ],
  "text": "string",
  "samples": [
    {
      "text": "string",
      "label": "string"
    },
    {
      "text": "string",
      "label": "string"
    },
    ...
    {
      "text": "string",
      "label": "string"
    }
  ]
}
Where:
- modelUri: ID of the model that will be used to classify the message. This parameter contains the Yandex Cloud folder ID.
- taskDescription: Text description of the task for the classifier.
- labels: Array of classes. Give meaningful names to label classes: this is essential for correct classification results. For example, use chemistry and physics rather than chm and phs for class names.
- text: Message text.
- samples: Array with examples of prompts for the classes specified in the labels field. Examples of prompts are provided as objects, each containing an example of a text query and the class to which that query should belong.
Use the https://llm.api.cloud.yandex.net/foundationModels/v1/fewShotTextClassification endpoint for requests to Few-shot classifiers.
Warning
You can provide multiple classification examples in a single request. The examples in a request must not exceed 6,000 tokens in total.
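The Few-shot request body can be assembled in the same way. The following Python sketch is an illustration only: the build_few_shot_body helper, the sample texts, and the `<model_URI>` placeholder are assumptions. It checks locally that every sample refers to a class listed in labels. It does not check the 6,000-token limit, because counting tokens requires the model's tokenizer.

```python
def build_few_shot_body(model_uri, task_description, labels, text, samples):
    """Return a Few-shot request body as a dict (hypothetical helper).

    samples is an iterable of (sample_text, sample_label) pairs.
    """
    labels = list(labels)
    # The fewShotClassify method accepts between 2 and 20 classes.
    if not 2 <= len(labels) <= 20:
        raise ValueError("labels must contain between 2 and 20 class names")
    sample_objects = []
    for sample_text, sample_label in samples:
        # Local sanity check: samples illustrate the classes given in labels.
        if sample_label not in labels:
            raise ValueError(f"sample label {sample_label!r} is not in labels")
        sample_objects.append({"text": sample_text, "label": sample_label})
    return {
        "modelUri": model_uri,
        "taskDescription": task_description,
        "labels": labels,
        "text": text,
        "samples": sample_objects,
    }


few_shot_body = build_few_shot_body(
    model_uri="<model_URI>",  # placeholder: substitute your own model URI
    task_description="Determine the school subject the question belongs to",
    labels=["chemistry", "physics"],
    text="Why does ice float on water?",
    samples=[
        ("What is the molar mass of sodium chloride?", "chemistry"),
        ("How is acceleration related to force?", "physics"),
    ],
)
```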
Trainable classifiers
If you are not satisfied with the output quality of the Zero-shot and Few-shot classifiers, tune your own classifier based on YandexGPT in Yandex DataSphere. Trainable classifiers can be trained for all supported classification types.
To run a request to the classifier of a model fine-tuned in DataSphere, use the classify Text Classification API method. If you do so, you only need to provide the model ID and the request text to the model. The names of the classes between which the model will be distributing requests must be specified during model tuning and are not provided in the request.
Request body format for the classifier of a model fine-tuned in DataSphere:
{
  "modelUri": "string",
  "text": "string"
}
Where:
- modelUri: ID of the model that will be used to classify the message. This parameter contains the Yandex Cloud folder ID and the ID of the model tuned in DataSphere.
- text: Message text. The total number of tokens per request must not exceed 8,000.
Use the https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification endpoint for requests to trainable classifiers.
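A request to a trainable classifier could be prepared along these lines. This Python sketch uses only the standard library and builds the HTTP request without sending it. The make_classify_request helper and the placeholders are hypothetical, and the Api-Key authorization header is an assumption that this document does not confirm; check the authentication documentation for the scheme that applies to your account.

```python
import json
import urllib.request

# Endpoint for trainable classifiers, as given in the text above.
CLASSIFY_URL = "https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification"


def make_classify_request(model_uri, text, api_key):
    """Build (but do not send) a POST request for the classify method."""
    # Class names are fixed during tuning, so only modelUri and text are sent.
    body = {"modelUri": model_uri, "text": text}
    return urllib.request.Request(
        CLASSIFY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: API-key authorization; not stated in this document.
            "Authorization": f"Api-Key {api_key}",
        },
        method="POST",
    )


request = make_classify_request("<model_URI>", "Message to classify", "<API_key>")

# To actually send it (requires valid credentials and network access):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```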
Response format
All Foundation Models classifier types return the result in the following format:
{
  "predictions": [
    {
      "label": "string",
      "confidence": "number"
    },
    {
      "label": "string",
      "confidence": "number"
    },
    ...
    {
      "label": "string",
      "confidence": "number"
    }
  ],
  "modelVersion": "string"
}
Where:
- label: Class name.
- confidence: Probability of assigning the request text to this class.
  In multi-class classification, the sum of the confidence values for all classes is always 1.
  In multi-label classification, the confidence value for each class is calculated independently, so the values do not have to add up to 1.
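The difference between the two confidence conventions affects how a response is read. The Python sketch below is one possible way to handle it: for multi-class output it takes the single most probable class, and for multi-label output it keeps every class at or above a cutoff. The helper names, the 0.5 default threshold, and the sample response are assumptions made for illustration and are not real service output.

```python
def top_label(predictions):
    """Multi-class: return the (label, confidence) pair with the highest confidence."""
    best = max(predictions, key=lambda p: float(p["confidence"]))
    return best["label"], float(best["confidence"])


def labels_above(predictions, threshold=0.5):
    """Multi-label: return every label whose confidence is at least threshold."""
    return [p["label"] for p in predictions if float(p["confidence"]) >= threshold]


# Made-up response for illustration only.
response = {
    "predictions": [
        {"label": "chemistry", "confidence": 0.75},
        {"label": "physics", "confidence": 0.25},
    ],
    "modelVersion": "example",
}

best_label, best_confidence = top_label(response["predictions"])
selected = labels_above(response["predictions"], threshold=0.5)
```

The threshold for multi-label output is a choice that depends on the application: a lower value returns more classes per request, a higher value returns fewer.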