Retrieval
Note
We do not recommend using the AI Assistant API in new projects. To create AI agents, use the Responses API.
Retrieval enables the AI assistant to search your own files (knowledge base) for information to use in its response. Retrieval comes with a specially trained paraphrase model, which rephrases users' queries to improve search quality.
To enable your AI assistant to use your knowledge base:
- Upload the knowledge base files using the API or ML SDK.
- Create a search index for your files. After that, you can delete the files you uploaded if you do not need citations.
- Connect the search index to your assistant. You can enable citations if the search index files are not deleted.
- Optionally, configure a strategy for using search indexes so that the assistant accesses them only when required.
- Optionally, enable the paraphrase model as an additional layer in your assistant.
By default, the Retrieval tool accesses the index on each user request to the assistant. The tool finds and returns relevant extracts from source files, and the model uses this information to generate a response.
AI assistants do not always need a search index to respond to a user’s query: the general knowledge available to the model is often enough. You can configure a Retrieval access strategy so that the model itself decides when to search the index.
For your AI assistant to be able to use Retrieval based on an access strategy:
In the ML SDK, provide a search index access instruction for the model in the call_strategy parameter when creating the Retrieval tool. Then, when creating the AI assistant, pass the resulting tool object in the tools parameter.
...
tool = sdk.tools.search_index(
    search_index,
    call_strategy={
        "type": "function",
        "function": {"name": "search-function-name", "instruction": "<search_usage_instructions>"},
    },
)
assistant = sdk.assistants.create(
    "yandexgpt",
    instruction="You are an internal corporate documentation assistant. Answer politely. If the information is not in the documents below, don't make up your answer.",
    tools=[tool],
)
thread = sdk.threads.create()
...
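The call_strategy value in the SDK example above is a plain dictionary, so it can be assembled separately before being passed to sdk.tools.search_index. The following is a minimal sketch, assuming only the dictionary shape shown above; the helper name make_call_strategy and its empty-instruction check are hypothetical and not part of the ML SDK.

```python
from typing import Any


def make_call_strategy(function_name: str, instruction: str) -> dict[str, Any]:
    """Build a call_strategy dict in the shape used in the SDK example.

    Hypothetical helper: it is not part of the ML SDK and only mirrors
    the dictionary layout shown in the documentation.
    """
    # Assumption: an empty instruction gives the model nothing to decide on,
    # so this sketch rejects it early.
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    return {
        "type": "function",
        "function": {"name": function_name, "instruction": instruction},
    }


strategy = make_call_strategy(
    "search-function-name",
    "Search through the knowledge base only if the user has specifically asked you to do so.",
)
```

The resulting strategy value would then be passed as call_strategy=strategy when creating the Retrieval tool.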
In the API, when creating or updating an AI assistant, provide a search index access instruction for the model in the callStrategy object within the tools array of the request body.
{
  ...
  "tools": [
    {
      "searchIndex": {
        "searchIndexIds": [
          "<search_index_ID>"
        ],
        "maxNumResults": "<maximum_number_of_returned_fragments>",
        "callStrategy": {
          "autoCall": {
            "instruction": "<search_usage_instruction>"
          }
        }
      }
    }
  ]
}
Where:
- searchIndexIds: Array with IDs of search indexes the assistant will use. Currently, you can specify only one index.
- maxNumResults: Maximum number of results a search can return.
- instruction: Search usage instructions with guidelines for the assistant on when it should access the search index.
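The tools fragment of the request body can also be generated programmatically. The sketch below assumes only the field names shown in the JSON above; the function build_search_index_tool is hypothetical, and passing maxNumResults as a string is an assumption that mirrors the quoted placeholder in the example. The rest of the request body (elided in the example) is not reproduced here.

```python
import json
from typing import Any


def build_search_index_tool(
    index_id: str, max_num_results: int, instruction: str
) -> dict[str, Any]:
    """Build one entry of the tools array with an autoCall strategy.

    Hypothetical helper mirroring the JSON layout from the documentation.
    It accepts a single index ID because only one index can be specified.
    """
    if max_num_results < 1:
        raise ValueError("max_num_results must be a positive integer")
    return {
        "searchIndex": {
            "searchIndexIds": [index_id],
            # Assumption: sent as a string, as in the quoted placeholder above.
            "maxNumResults": str(max_num_results),
            "callStrategy": {"autoCall": {"instruction": instruction}},
        }
    }


tools_fragment = {
    "tools": [
        build_search_index_tool(
            "<search_index_ID>",
            5,
            "Search through the knowledge base only if the user has specifically asked you to do so.",
        )
    ]
}

# Serialize the fragment to see the JSON that would go into the request body.
print(json.dumps(tools_fragment, indent=2))
```

Taking a single index ID rather than a list keeps the one-index limit visible in the function signature.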
The search usage instruction you provide in a strategy is essentially a prompt telling the assistant when it should access the search index. For example:
"instruction": "Search through the knowledge base only if the user has specifically asked you to do so."