Classifiers based on YandexGPT
The YandexGPT-based classifier functionality is at the Preview stage.
Yandex Foundation Models can classify the text requests provided in prompts. Classification in YandexGPT-based models is implemented in the Foundation Models Text Classification API.
There are three types of classification available in Foundation Models:
- Binary classification puts a request into one of two possible classes, such as spam or non-spam.
- Multi-class classification puts a request into one (and only one) of more than two classes. For example, a computer CPU can belong to one generation only.
- Multi-label classification allows putting a request into a number of different non-mutually exclusive classes at the same time. For example, multiple hashtags can belong to the same post on social media at the same time.
Classification models are only available in synchronous mode.
Foundation Models provides two types of YandexGPT classifiers: prompt-based and trainable.
Prompt-based classifiers
Foundation Models prompt-based classifiers support binary and multi-class classification, require no model tuning, and are prompt-controlled. The fewShotClassify Text Classification API method supports two prompt-based classifiers: Zero-shot and Few-shot. You can provide 2 to 20 classes to the fewShotClassify method.
Tip
Give meaningful names to label classes: this is essential for correct classification results. For example, use chemistry and physics rather than chm and phs for class names.
Zero-shot classifier
The Zero-shot classifier performs binary and multi-class classification when given only the model ID, task description, request text, and an array of class names in the request body.
Request body format for the Zero-shot classifier:
{
  "modelUri": "string",
  "taskDescription": "string",
  "labels": [
    "string",
    "string",
    ...
    "string"
  ],
  "text": "string"
}
Where:
- modelUri: ID of the model that will be used to classify the message. The parameter contains the Yandex Cloud folder ID.
- taskDescription: Text description of the task for the classifier.
- labels: Array of classes. Give meaningful names to label classes: this is essential for correct classification results. For example, use chemistry and physics rather than chm and phs for class names.
- text: Text content of the message.
Use the https://llm.api.cloud.yandex.net/foundationModels/v1/fewShotTextClassification endpoint for requests to Zero-shot classifiers.
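As a rough illustration of the request body described above, the following Python sketch assembles a Zero-shot request as a dictionary and checks the 2 to 20 class limit on the client side. The helper name build_zero_shot_body, the placeholder model URI, and the example task are hypothetical and not part of the API. Sending the request also needs authorization, which this page does not cover, so the sketch stops at building the body.

```python
import json

# Endpoint quoted from the documentation text.
FEW_SHOT_URL = "https://llm.api.cloud.yandex.net/foundationModels/v1/fewShotTextClassification"


def build_zero_shot_body(model_uri, task_description, labels, text):
    """Return the request body for a Zero-shot classification call (hypothetical helper)."""
    # The fewShotClassify method accepts 2 to 20 classes.
    if not 2 <= len(labels) <= 20:
        raise ValueError("labels must contain between 2 and 20 class names")
    return {
        "modelUri": model_uri,
        "taskDescription": task_description,
        "labels": list(labels),
        "text": text,
    }


body = build_zero_shot_body(
    model_uri="<model URI containing your folder ID>",  # placeholder, not a real URI
    task_description="Determine the school subject the question belongs to",
    labels=["chemistry", "physics"],  # meaningful names rather than chm and phs
    text="What is the boiling point of water at sea level?",
)
print(json.dumps(body, indent=2))
```

The serialized body can then be posted to the endpoint above with any HTTP client.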
Few-shot classifier
The Few-shot classifier enables binary and multi-class classification by providing the model with an array of sample requests for the classes specified in the labels field. Sample requests are provided in the samples field of the request body, improving the classifier result accuracy.
Request body format for the Few-shot classifier:
{
  "modelUri": "string",
  "taskDescription": "string",
  "labels": [
    "string",
    "string",
    ...
    "string"
  ],
  "text": "string",
  "samples": [
    {
      "text": "string",
      "label": "string"
    },
    {
      "text": "string",
      "label": "string"
    },
    ...
    {
      "text": "string",
      "label": "string"
    }
  ]
}
Where:
- modelUri: ID of the model that will be used to classify the message. The parameter contains the Yandex Cloud folder ID.
- taskDescription: Text description of the task for the classifier.
- labels: Array of classes. Give meaningful names to label classes: this is essential for correct classification results. For example, use chemistry and physics rather than chm and phs for class names.
- text: Text content of the message.
- samples: Array of sample requests for the classes specified in the labels field. Sample requests are provided as objects, each containing one sample request text and the class that request should belong to.
Use the https://llm.api.cloud.yandex.net/foundationModels/v1/fewShotTextClassification endpoint for requests to Few-shot classifiers.
Warning
You can provide multiple classification examples in a single request. Taken together, all examples in the request must not exceed 6,000 tokens.
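A Few-shot request body differs from the Zero-shot one only by the samples array. The Python sketch below shows one possible way to assemble it from (text, label) pairs. The helper build_few_shot_body and the example data are hypothetical. Rejecting samples whose label is missing from labels is a client-side precaution assumed here, not documented API behavior. The token limit is not checked because the tokenizer is not described on this page.

```python
def build_few_shot_body(model_uri, task_description, labels, text, samples):
    """Return the request body for a Few-shot classification call (hypothetical helper).

    samples is an iterable of (sample_text, sample_label) pairs.
    """
    # The fewShotClassify method accepts 2 to 20 classes.
    if not 2 <= len(labels) <= 20:
        raise ValueError("labels must contain between 2 and 20 class names")
    known = set(labels)
    sample_objects = []
    for sample_text, sample_label in samples:
        # Assumed client-side check: every sample should point at a listed class.
        if sample_label not in known:
            raise ValueError(f"sample label {sample_label!r} is not listed in labels")
        sample_objects.append({"text": sample_text, "label": sample_label})
    return {
        "modelUri": model_uri,
        "taskDescription": task_description,
        "labels": list(labels),
        "text": text,
        "samples": sample_objects,
    }


few_shot_body = build_few_shot_body(
    model_uri="<model URI containing your folder ID>",  # placeholder, not a real URI
    task_description="Determine the school subject the question belongs to",
    labels=["chemistry", "physics"],
    text="Why does a ball thrown upward slow down?",
    samples=[
        ("What is the formula of table salt?", "chemistry"),
        ("How fast does sound travel in air?", "physics"),
    ],
)
```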
Trainable classifiers
If you are not satisfied with the output quality of the Zero-shot and Few-shot classifiers, tune your own classifier based on YandexGPT in Yandex DataSphere. Trainable classifiers can be trained for any of the supported classification types.
To run a request to the classifier of a model fine-tuned in DataSphere, use the classify Text Classification API method. In this case, you only need to provide the model ID and the request text. The names of the classes into which the model sorts requests are specified during model tuning and are not provided in the request.
Request body format for the classifier of a model fine-tuned in DataSphere:
{
  "modelUri": "string",
  "text": "string"
}
Where:
- modelUri: ID of the model that will be used to classify the message. The parameter contains the Yandex Cloud folder ID and the ID of the model tuned in DataSphere.
- text: Message text. The total number of tokens per request must not exceed 8,000.
Use the https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification endpoint for requests to trainable classifiers.
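For orientation, here is a minimal Python sketch that prepares (but does not send) a POST request to the endpoint above using only the standard library. The helper build_classify_request and the placeholder values are hypothetical. The Authorization header with a bearer token is an assumption, since this page does not describe authentication. Check the API reference for the scheme that applies to your account.

```python
import json
import urllib.request

# Endpoint quoted from the documentation text.
CLASSIFY_URL = "https://llm.api.cloud.yandex.net:443/foundationModels/v1/textClassification"


def build_classify_request(model_uri, text, token):
    """Prepare a POST request for a tuned classifier (hypothetical helper)."""
    # Class names are fixed during tuning, so the body carries only two fields.
    body = {"modelUri": model_uri, "text": text}
    return urllib.request.Request(
        CLASSIFY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authorization; not stated on this page.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_classify_request(
    model_uri="<URI with folder ID and tuned model ID>",  # placeholder
    text="Message to classify",
    token="<token>",  # placeholder
)
# To send it, pass the object to urllib.request.urlopen(request)
# and decode the JSON reply.
```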
Response format
All Foundation Models classifier types return the result in the following format:
{
  "predictions": [
    {
      "label": "string",
      "confidence": "number"
    },
    {
      "label": "string",
      "confidence": "number"
    },
    ...
    {
      "label": "string",
      "confidence": "number"
    }
  ],
  "modelVersion": "string"
}
Where:
- label: Class name.
- confidence: Probability of the request text belonging to this class. In multi-class classification, the sum of the confidence values for all classes is always 1. In multi-label classification, the confidence value for each class is calculated independently, so the values need not sum to 1.
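The Python sketch below illustrates how a client might read this response: take the most confident class for binary or multi-class results, or keep every class above a threshold for multi-label results. The helper names, the 0.5 threshold, and the sample response values are invented for illustration. The sketch assumes confidence arrives as a JSON number and converts it with float() in case it arrives as a numeric string.

```python
import math


def top_prediction(response):
    """Return the prediction with the highest confidence (binary or multi-class)."""
    return max(response["predictions"], key=lambda p: float(p["confidence"]))


def labels_above(response, threshold):
    """Return every label whose confidence reaches the threshold (multi-label)."""
    return [
        p["label"]
        for p in response["predictions"]
        if float(p["confidence"]) >= threshold
    ]


def sums_to_one(response, tolerance=1e-6):
    """Check the multi-class property that all confidences add up to 1."""
    total = sum(float(p["confidence"]) for p in response["predictions"])
    return math.isclose(total, 1.0, abs_tol=tolerance)


# Made-up response in the documented shape; values are illustrative only.
sample_response = {
    "predictions": [
        {"label": "chemistry", "confidence": 0.25},
        {"label": "physics", "confidence": 0.75},
    ],
    "modelVersion": "example",
}

best = top_prediction(sample_response)
print(best["label"], labels_above(sample_response, 0.5), sums_to_one(sample_response))
```

For multi-label models, a suitable threshold depends on the task and is typically chosen on validation data rather than fixed in advance.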