AI Assistant API
The AI Assistant API feature is at the Preview stage.
AI Assistant API is an AI Studio tool for creating AI assistants. It can be used to create personalized assistants and to implement generative response scenarios with access to information from external sources (known as retrieval augmented generation, or RAG).
You can create your AI assistant using the Yandex Cloud ML SDK or through API requests in a programming language.
To use AI Assistant API in Yandex AI Studio, you need the ai.assistants.editor and ai.languageModels.user roles or higher for the folder.
Assistant components
AI Assistant API offers a number of abstractions for building a custom chatbot or AI assistant.
Assistant determines which model to use and what parameters and instructions to apply. This enables you to configure the model just once and use those settings in the future without needing to provide them every time.
Threads are used to maintain the historical context of user communication. Each user chat is a separate thread. By running your assistant for a specific thread, you call the model and provide it with all the context stored in the thread. Listen to the current run to get intermediate generation results; the final response, once generated, becomes part of the thread.
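The relationship between an assistant, its threads, and a run can be sketched in plain Python. This is a conceptual model only, not the Yandex Cloud ML SDK: the Thread and Assistant classes, their methods, and the echo_model placeholder are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, str]  # {"role": ..., "text": ...}


@dataclass
class Thread:
    """Hypothetical stand-in for a thread: stores one chat's message history."""
    messages: List[Message] = field(default_factory=list)

    def write(self, text: str, role: str = "user") -> None:
        self.messages.append({"role": role, "text": text})


@dataclass
class Assistant:
    """Hypothetical stand-in for an assistant: model settings configured once."""
    model: Callable[[str, List[Message]], str]
    instruction: str

    def run(self, thread: Thread) -> str:
        # The model receives the instruction plus the whole thread context.
        answer = self.model(self.instruction, list(thread.messages))
        # The final response becomes part of the thread.
        thread.write(answer, role="assistant")
        return answer


def echo_model(instruction: str, context: List[Message]) -> str:
    """Placeholder model: reports how much context it was given."""
    return f"[{instruction}] saw {len(context)} message(s)"


# One assistant, configured once, serves several independent threads.
assistant = Assistant(model=echo_model, instruction="Be brief")
alice, bob = Thread(), Thread()  # one thread per user chat

alice.write("Hello")
alice.write("What is RAG?")
bob.write("Hi")

alice_answer = assistant.run(alice)
bob_answer = assistant.run(bob)
```

The settings live in the assistant, while each thread carries only its own history, so the same assistant can be run against any number of threads without restating the model parameters.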
Tip
By default, each time the model starts running, it reprocesses the entire content of the thread. If a thread holds a large context and you start the assistant after each user message, runs can become expensive. To optimize costs, consider limiting the size of the context provided to the model: set the customPromptTruncationOptions parameter when starting your assistant.
For detailed costs of running an assistant, see Assistant pricing policy.
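To see why limiting the context matters, consider a rough simulation in which the assistant runs after every user message. It is a sketch under stated assumptions: the "keep the last N messages" rule and the functions keep_last_messages and messages_processed are hypothetical illustrations, not the actual behavior of customPromptTruncationOptions, and message counts stand in for real billing units.

```python
from typing import List, Optional


def keep_last_messages(messages: List[str], num_messages: int) -> List[str]:
    """Return only the most recent `num_messages` messages of a thread."""
    if num_messages <= 0:
        return []
    return messages[-num_messages:]


def messages_processed(turns: int, limit: Optional[int] = None) -> int:
    """Count how many messages the model reads over `turns` user messages,
    assuming one run after each user message (illustrative cost proxy)."""
    thread: List[str] = []
    processed = 0
    for turn in range(1, turns + 1):
        thread.append(f"user message {turn}")
        # Without a limit, the whole thread is reprocessed on every run.
        context = thread if limit is None else keep_last_messages(thread, limit)
        processed += len(context)
        thread.append(f"assistant reply {turn}")
    return processed


full_cost = messages_processed(10)              # whole thread each time
limited_cost = messages_processed(10, limit=4)  # at most 4 messages per run
```

In this model, the unlimited total grows quadratically with the number of turns, while the limited total grows linearly once the thread is longer than the limit.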
A simple assistant can help automate routine operations on various data. For example, if you have tailored an effective prompt and settings for a model, you can create an assistant and run it for different threads. However, your assistants can do much more when equipped with supplementary tools.
With AI Assistant API, your assistant can access RAG tools to retrieve information from external sources or function calling tools to invoke additional handlers and third-party APIs.
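The function calling side can be pictured as a dispatcher on the application side: the model names a function and its arguments, and your code invokes the matching handler and returns the result. The sketch below is a generic illustration; the tool-call shape, the HANDLERS registry, and the get_weather handler are hypothetical and do not reflect the actual AI Assistant API payload format.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Hypothetical handler; a real one might call a third-party API."""
    return f"Sunny in {city}"


# Registry mapping function names the model may request to local handlers.
HANDLERS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(tool_call: Dict[str, Any]) -> str:
    """Invoke the handler named in `tool_call` and return a JSON result
    that could be passed back to the model."""
    name = tool_call["name"]
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown function: {name}"})
    return json.dumps({"result": handler(**tool_call["arguments"])})


ok = dispatch({"name": "get_weather", "arguments": {"city": "Paris"}})
missing = dispatch({"name": "book_flight", "arguments": {}})
```

Returning an explicit error for unknown function names, rather than raising, lets the calling code pass the problem back to the model in the same way as a normal result.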
Note
The retention period for assistants, threads, search indexes, and users is limited. You can set this limit when creating an object using the ExpirationConfig parameter. By default, a file not used for seven days is deleted.
Once an object is created, you cannot change its retention period or policy.
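The difference a retention policy makes can be shown with simple date arithmetic. This is a sketch under assumptions: the expires_at function and the policy names "static" and "since_last_active" are hypothetical labels for "count from creation" and "count from last use"; only the seven-day default comes from the text above.

```python
from datetime import datetime, timedelta


def expires_at(created: datetime, last_used: datetime,
               ttl_days: int = 7, policy: str = "since_last_active") -> datetime:
    """Return the moment an object would be deleted under the given policy.

    Policy names are illustrative assumptions, not API constants.
    """
    ttl = timedelta(days=ttl_days)
    if policy == "static":
        # Lifetime is counted from creation, regardless of later use.
        return created + ttl
    if policy == "since_last_active":
        # Every use pushes the deadline forward.
        return last_used + ttl
    raise ValueError(f"unknown policy: {policy}")


created = datetime(2025, 1, 1)
last_used = datetime(2025, 1, 5)

static_deadline = expires_at(created, last_used, policy="static")
active_deadline = expires_at(created, last_used)
```

Because the period and policy cannot be changed after creation, it is worth working out which deadline you actually need before creating the object.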
Use cases
- Creating a simple assistant
- Creating a RAG assistant with the VectorStore tool
- Creating a RAG assistant with intermediate response generation results
- Creating an AI assistant with RAG from PDF files with complex formatting
- Creating an AI assistant for RAG with source file and index metadata preserved
- Creating an assistant for RAG with query rephrasing