Quotas and limits in Yandex Foundation Models
Updated on November 20, 2024
The YandexGPT API is subject to two kinds of restrictions:
- Quotas are organizational restrictions that can be changed by technical support on request.
- Limits are technical limitations due to Yandex Cloud architectural features. The limits cannot be changed.
If you need more resources, contact support to request a quota increase.
Quotas
Type of limit | Value |
---|---|
Text vectorization | |
Number of text vectorization requests per second | 10 |
Text generation | |
Number of concurrent generations, synchronous mode | 10 |
Number of concurrent generations, YandexGPT 32k model | 1 |
Number of requests per second, asynchronous mode (request) | 10 |
Number of requests per second, asynchronous mode (getting a response) | 50 |
Number of requests per hour, asynchronous mode (request) | 5,000 |
Number of tokenization requests per second | 50 |
Text classification | |
Number of text classification requests per second | 1 |
Image generation | |
Number of generation requests per minute | 500 |
Number of generation requests per day | 5,000 |
Number of result requests per second | 50 |
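
The per-second and per-hour quotas above can be respected on the client side by throttling requests before they are sent. The following is a minimal sketch of a sliding-window limiter, not an official client: the class `SlidingWindowLimiter` and the helper `submit_with_quota` are hypothetical names, and the `send` callable stands in for whatever code actually calls the API. The numeric values are copied from the quota table and should be adjusted if support changes your quotas.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls still inside the window

    def _evict(self, now):
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()

    def try_acquire(self):
        """Record a call and return True if the quota allows it right now."""
        now = self._clock()
        self._evict(now)
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while not self.try_acquire():
            # Wait until the oldest recorded call leaves the window.
            wait = self.period - (self._clock() - self._stamps[0])
            self._sleep(max(wait, 0.0))


# Assumption: values taken from the quota table for asynchronous requests.
ASYNC_SUBMIT_PER_SECOND = SlidingWindowLimiter(10, 1.0)
ASYNC_SUBMIT_PER_HOUR = SlidingWindowLimiter(5000, 3600.0)


def submit_with_quota(send):
    """Call `send()` only after both asynchronous request quotas allow it."""
    ASYNC_SUBMIT_PER_HOUR.acquire()
    ASYNC_SUBMIT_PER_SECOND.acquire()
    return send()
```

A limiter like this only reduces the chance of hitting a quota. The service still enforces quotas on its side, so requests that are rejected there need to be handled separately.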
Limits
Type of limit | Value |
---|---|
Period to store results of asynchronous requests on the server | 3 days |
Text vectorization | |
Number of input tokens | 2,000 |
Output vector size | 256 |
Text generation | |
Maximum number of tokens in response via API | 2,000 |
Maximum number of tokens per response in the management console | 1,000 |
Total number of tokens in request and response, 3rd generation models | 8,192 |
Total number of tokens in request and response, synchronous mode of 4th generation models | 8,192 |
Total number of tokens in request and response, asynchronous mode of 4th generation models | 32,000 |
Total number of tokens in request and response, YandexGPT 32k model | 32,000 |
Number of free requests per hour for users without a billing account. Available only in the management console | 10 |
Image generation | |
Maximum prompt length | 500 characters |
Number of free requests per minute for users without a billing account. Available only in the management console | 2 |
Number of free requests per day for users without a billing account. Available only in the management console | 10 |
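
Because the total-token limits cover the request and the response together, the room left for a response shrinks as the prompt grows. The sketch below illustrates that arithmetic using the values from the table. The mode labels (`gen3`, `gen4_sync`, `gen4_async`, `32k`) and the function names are hypothetical. The sketch assumes the prompt's token count has already been obtained elsewhere, that the 2,000-token API response cap applies in every mode, and that prompt length is counted as `len()` counts it in Python, which may not match how the service counts characters.

```python
# Values copied from the limits table; mode labels are illustrative only.
MAX_RESPONSE_TOKENS_API = 2_000
CONTEXT_LIMITS = {
    "gen3": 8_192,        # 3rd generation models
    "gen4_sync": 8_192,   # 4th generation models, synchronous mode
    "gen4_async": 32_000, # 4th generation models, asynchronous mode
    "32k": 32_000,        # YandexGPT 32k model
}
MAX_IMAGE_PROMPT_CHARS = 500


def response_budget(prompt_tokens, mode):
    """Return the largest response size (in tokens) that still fits the limits."""
    remaining = CONTEXT_LIMITS[mode] - prompt_tokens
    if remaining <= 0:
        raise ValueError(
            f"prompt of {prompt_tokens} tokens leaves no room for a response "
            f"within the {CONTEXT_LIMITS[mode]}-token limit"
        )
    # The response is capped both by the remaining context and by the API cap.
    return min(remaining, MAX_RESPONSE_TOKENS_API)


def fits_image_prompt(prompt):
    """Check an image generation prompt against the maximum prompt length."""
    return len(prompt) <= MAX_IMAGE_PROMPT_CHARS
```

For example, under these assumptions a 7,000-token prompt in synchronous mode leaves 8,192 - 7,000 = 1,192 tokens for the response, while a 1,000-token prompt is still capped at 2,000 response tokens.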