Yandex AI Studio quotas and limits
Updated on November 27, 2025
Yandex AI Studio is subject to the following quotas and limits:
- Quotas are organizational restrictions that can be changed by technical support on request.
- Limits are technical constraints imposed by the Yandex Cloud architecture. They cannot be changed.
If you need more resources, contact support to request a higher quota.
Quotas
| Type of limit | Value |
|---|---|
| Text vectorization | |
| Number of text vectorization requests per second | 10 |
| Text generation | |
| Number of concurrent generations in synchronous mode | 10 |
| Number of requests per second, asynchronous mode (request) | 10 |
| Number of requests per second, asynchronous mode (getting a response) | 50 |
| Number of requests per hour, asynchronous mode (request) | 5,000 |
| Number of tokenization requests per second | 50 |
| Dedicated instances | |
| Number of concurrent dedicated instances | 1 |
| Model operating mode: Batch | |
| Number of runs per hour | 10 |
| Number of runs per day | 100 |
| Text classification | |
| Number of text classification requests per second | 1 |
| Image generation | |
| Number of generation requests per minute | 500 |
| Number of generation requests per day | 5,000 |
| Number of result requests per second | 50 |
| Model tuning | |
| Number of fine-tuning runs per day | 10 |
| Number of fine-tuning runs per hour | 3 |
| Datasets | |
| Number of uploaded datasets | 100 |
| Maximum size of one dataset | 5 GB |
| Total size of datasets | 300 GB |
| MCP servers | |
| Number of MCP servers per cloud | 30 |
| Number of tools per server | 50 |
| Voice agents (speech-realtime-250923 model) | |
| Number of concurrent sessions with the model | 10 |
| Number of session creation queries per second | 10 |
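
The per-second quotas above are enforced on the service side, but a client can throttle itself to avoid hitting them. The following is a minimal sketch of a sliding-window limiter, assuming a quota of the form "N requests per second"; the class name `SlidingWindowLimiter` and its parameters are hypothetical and are not part of any Yandex Cloud SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window.

    Hypothetical helper for illustration; not part of any Yandex Cloud SDK.
    """

    def __init__(self, max_calls, period=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._stamps = deque()  # times of the most recent calls

    def wait_time(self):
        """Return how many seconds to wait before the next call (0.0 if none)."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._stamps.append(self._clock())


# One limiter per quota, using values from the Quotas table.
# The dictionary keys are illustrative labels, not API identifiers.
LIMITERS = {
    "vectorization": SlidingWindowLimiter(10),
    "tokenization": SlidingWindowLimiter(50),
    "classification": SlidingWindowLimiter(1),
}
```

Calling `LIMITERS["vectorization"].acquire()` before each request keeps a single client process at or below 10 requests per second. Quotas are shared by all clients that count against the same quota, so several processes running in parallel would need a shared limiter or a smaller per-process budget.
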
Limits
| Type of limit | Value |
|---|---|
| Period to store results of asynchronous requests on the server | 3 days |
| Text vectorization | |
| Number of input tokens | 2,000 |
| Output vector size | 256 |
| Text generation | |
| Maximum number of tokens per response in the management console | 1,000 |
| Text classification | |
| Number of classes in prompt-based classifiers | 20 |
| Number of classes in fine-tuned classifiers | 100 |
| Assistants | |
| Maximum number of assistants | 1,000 |
| Maximum number of threads | 10,000 |
| Maximum number of users | 10,000 |
| Maximum number of files to upload | 10,000 |
| Maximum file size | 128 MB |
| Number of files per upload | 100 |
| Maximum number of files per search index | 10,000 |
| Maximum number of messages per thread | 100,000 |
| Maximum number of search indexes | 1,000 |
| Maximum number of indexing operations to run | 10 |
| Image generation | |
| Maximum prompt length | 500 characters |
| MCP servers | |
| Number of active cloud connections per availability zone | 500 |
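
Because limits cannot be raised, it can help to validate input on the client before sending it. The sketch below checks an image generation prompt against the 500-character limit and splits a set of files into uploads of at most 100 files, rejecting any file over 128 MB. The function names are hypothetical, and two details are assumptions: that 1 MB means 1024 × 1024 bytes, and that the service counts characters the same way Python's `len()` does.

```python
from typing import Dict, List

# Values taken from the Limits table.
MAX_PROMPT_CHARS = 500            # Image generation: maximum prompt length
MAX_FILE_BYTES = 128 * 1024 ** 2  # Assistants: maximum file size (assumes 1 MB = 1024**2 bytes)
MAX_FILES_PER_UPLOAD = 100        # Assistants: number of files per upload


def prompt_fits(prompt: str) -> bool:
    """Return True if the prompt is within the image generation limit.

    Assumption: the service counts characters like len() does.
    """
    return len(prompt) <= MAX_PROMPT_CHARS


def plan_uploads(file_sizes: Dict[str, int]) -> List[List[str]]:
    """Split file names into batches that respect the per-upload limit.

    `file_sizes` maps a file name to its size in bytes.
    Raises ValueError if any single file exceeds the size limit.
    """
    too_big = [name for name, size in file_sizes.items()
               if size > MAX_FILE_BYTES]
    if too_big:
        raise ValueError(f"files larger than {MAX_FILE_BYTES} bytes: {too_big}")
    names = list(file_sizes)
    return [names[i:i + MAX_FILES_PER_UPLOAD]
            for i in range(0, len(names), MAX_FILES_PER_UPLOAD)]
```

For example, `plan_uploads` called with 250 small files returns three batches of 100, 100, and 50 file names, each of which can be sent as a separate upload.
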