Creating a RAG assistant with intermediate response generation results
The AI Assistant API feature is at the Preview stage.
AI Assistant API is an AI Studio tool for creating AI assistants. You can use it to create personalized assistants, implement a generative response scenario with access to information from external sources (known as retrieval augmented generation, or RAG), and save the model's request context. When making requests to the assistant, you can get intermediate generation results while the model is still generating its response.
Getting started
To use the examples:
- Create a service account and assign the ai.assistants.editor and ai.languageModels.user roles to it.
- Get the service account API key and save it.
  The following examples use API key authentication. Yandex Cloud ML SDK also supports IAM token and OAuth token authentication. For more information, see Authentication in Yandex Cloud ML SDK.
- Use the pip package manager to install the ML SDK library:
  pip install yandex-cloud-ml-sdk
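The examples on this page place the folder ID and API key directly in the source file. As a minimal sketch, assuming you would rather keep the key out of the file, the values could be read from environment variables and passed to the same `YCloudML(folder_id=..., auth=...)` constructor used in the example on this page. The variable names `ASSISTANT_FOLDER_ID` and `ASSISTANT_API_KEY` and both helper functions are hypothetical and chosen for this illustration only.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical variable names chosen for this sketch; they are not defined by the SDK.
FOLDER_ID_VAR = "ASSISTANT_FOLDER_ID"
API_KEY_VAR = "ASSISTANT_API_KEY"


def load_credentials(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (folder_id, api_key) from the environment, or raise if one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in (FOLDER_ID_VAR, API_KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[FOLDER_ID_VAR], env[API_KEY_VAR]


def make_sdk():
    """Build the SDK client as the example does, without hardcoding the key."""
    # Imported here so load_credentials() can be used without the SDK installed.
    from yandex_cloud_ml_sdk import YCloudML

    folder_id, api_key = load_credentials()
    return YCloudML(folder_id=folder_id, auth=api_key)
```

With both variables exported in the shell, `sdk = make_sdk()` could replace the `YCloudML(...)` call in the example.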
Create an assistant
This example shows how to create an assistant to store your conversations with YandexGPT Pro RC. When you access the model, your assistant will output intermediate generation results as the model is generating a response.
- Create a file named simple-assistant.py and paste the following code into it:

#!/usr/bin/env python3

from __future__ import annotations

from yandex_cloud_ml_sdk import YCloudML


def main() -> None:
    sdk = YCloudML(
        folder_id="<folder_ID>",
        auth="<API_key>",
    )
    sdk.setup_default_logging()

    assistant = sdk.assistants.create(
        "yandexgpt",
        temperature=0.5,
        max_prompt_tokens=50,
        ttl_days=6,
        expiration_policy="static",
    )

    thread = sdk.threads.create(
        name="foo",
        ttl_days=6,
        expiration_policy="static",
    )

    # Basic cycle of user interaction with the assistant
    input_text = input('Ask a question to the assistant (or "exit" to exit): ')
    while input_text.lower() != "exit":
        thread.write(input_text)

        # This way you can give the whole thread contents to the model
        run = assistant.run_stream(thread)

        # This way you can see the intermediate results as the model generates a response
        for event in run:
            print(event._message.parts)

        # This way you can see all fields of the final result
        print(f"run {event=}")

        input_text = input('Ask the assistant your question (or "exit" to exit): ')

    # Displaying the entire history of messages in the chat room
    print("Outputting the whole message history when exiting the chat:")
    for message in thread:
        print(f" {message=}")
        print(f" {message.text=}\n")

    # Deleting everything you do not need anymore
    for assistant in sdk.assistants.list():
        assistant.delete()
    thread.delete()


if __name__ == "__main__":
    main()

  Where:
  - <folder_ID>: ID of the folder in which the service account was created.
  - <API_key>: Service account API key you got earlier. It is required for authentication in the API.
- Run the file you created:
  python3 simple-assistant.py
The example implements the simplest chat possible: enter your requests to the assistant from your keyboard and get answers. To end the dialog, enter exit.
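On exit, the example loops over `sdk.assistants.list()` and deletes every assistant it returns, which appears to include assistants the script did not create. A minimal sketch, assuming you only want to delete the objects created in the current run and want cleanup to happen even if the chat loop raises an error: the `cleanup` helper below is hypothetical and relies only on each object having a `delete()` method, as `assistant` and `thread` do in the example.

```python
from contextlib import contextmanager


@contextmanager
def cleanup(*resources):
    """Delete the given resources on exit, even if the body raises."""
    try:
        yield resources
    finally:
        # Delete in reverse creation order; one failure does not stop the rest.
        for resource in reversed(resources):
            try:
                resource.delete()
            except Exception as error:
                print(f"Could not delete {resource!r}: {error}")


# Possible use inside main(), after creating `assistant` and `thread`:
#
#     with cleanup(assistant, thread):
#         ...  # chat loop from the example
```

This keeps the scope of deletion limited to what the script itself created, at the cost of leaving behind any assistants from earlier runs that were interrupted before cleanup.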
Approximate result
Ask a question to the assistant (or "exit" to exit): Hi!
('Hello',)
('Hello! What can I do for you?',)
('Hello! What can I do for you?',)
run event=RunStreamEvent(status=<StreamEvent.DONE: 3>, error=None, _message=Message(id='fvtbkt2tbf7a********', parts=('Hello! What can I do for you?',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 22, 146833), labels=None, author=Author(id='fvtle31p6lv2********', role='ASSISTANT'), citations=()))
Ask the assistant your question (or "exit" to exit): How many planets are there in the Solar System?
('In',)
('In the Solar System, there are **eight planets**: Mercury, Venus, Earth, Mars, Jupiter, Saturn',)
('In the Solar System, there are **eight planets**: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.',)
('In the Solar System, there are **eight planets**: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.',)
run event=RunStreamEvent(status=<StreamEvent.DONE: 3>, error=None, _message=Message(id='fvt4f3p6ddue********', parts=('In the Solar System, there are **eight planets**: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 33, 36643), labels=None, author=Author(id='fvtle31p6lv2********', role='ASSISTANT'), citations=()))
Ask the assistant your question (or "exit" to exit): How far is it from the Sun to the Earth?
('The average',)
('The average distance from the Earth to the Sun is approximately 149.6 million kilometers. This distance',)
('The average distance from the Earth to the Sun is approximately 149.6 million kilometers. This distance is also known as the astronomical unit (AU).',)
('The average distance from the Earth to the Sun is approximately 149.6 million kilometers. This distance is also known as the astronomical unit (AU).',)
run event=RunStreamEvent(status=<StreamEvent.DONE: 3>, error=None, _message=Message(id='fvtees4295mr********', parts=('The average distance from the Earth to the Sun is approximately 149.6 million kilometers. This distance is also known as the astronomical unit (AU).',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 44, 33797), labels=None, author=Author(id='fvtle31p6lv2********', role='ASSISTANT'), citations=()))
Ask the assistant your question (or "exit" to exit): Exit
Outputting the whole message history when exiting the chat:
 message=Message(id='fvtees4295mr********', parts=('The average distance from the Earth to the Sun is approximately 149.6 million kilometers. This distance is also known as the astronomical unit (AU).',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 44, 33798), labels=None, author=Author(id='fvtle31p6lv2********', role='ASSISTANT'), citations=())
 message.text='The average distance from the Earth to the Sun is approximately 149.6 million kilometers. This distance is also known as the astronomical unit (AU).'

 message=Message(id='fvto6b4rdg0o********', parts=('How far is it from the Sun to the Earth?',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 42, 941742), labels=None, author=Author(id='fvtjnthkl0g5********', role='USER'), citations=())
 message.text='How far is it from the Sun to the Earth?'

 message=Message(id='fvt4f3p6ddue********', parts=('In the Solar System, there are **eight planets**: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 33, 36644), labels=None, author=Author(id='fvtle31p6lv2********', role='ASSISTANT'), citations=())
 message.text='In the Solar System, there are **eight planets**: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.'

 message=Message(id='fvtme86dsuju********', parts=('How many planets are there in the Solar System?',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 32, 139), labels=None, author=Author(id='fvtjnthkl0g5********', role='USER'), citations=())
 message.text='How many planets are there in the Solar System?'

 message=Message(id='fvtbkt2tbf7a********', parts=('Hello! What can I do for you?',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 22, 146834), labels=None, author=Author(id='fvtle31p6lv2********', role='ASSISTANT'), citations=())
 message.text='Hello! What can I do for you?'

 message=Message(id='fvtemh1qqc50********', parts=('Hi!',), thread_id='fvt50ma5302n********', created_by='ajegtlf2q28a********', created_at=datetime.datetime(2025, 3, 13, 17, 51, 21, 359885), labels=None, author=Author(id='fvtjnthkl0g5********', role='USER'), citations=())
 message.text='Hi!'
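
In the sample output, each intermediate result repeats the full text generated so far rather than only the newly generated part. A minimal sketch, assuming you want to print the answer incrementally like a typing effect: the hypothetical helpers below take the `parts` tuples as they appear in the output and yield only the text added since the previous event. They are plain Python and do not call the SDK.

```python
from typing import Iterable, Iterator, Tuple


def text_delta(previous: str, current: str) -> str:
    """Return the part of `current` that was added after `previous`."""
    if current.startswith(previous):
        return current[len(previous):]
    # The text was rewritten rather than extended: fall back to the full text.
    return current


def stream_deltas(partials: Iterable[Tuple[str, ...]]) -> Iterator[str]:
    """Yield only the newly generated text for a sequence of cumulative parts."""
    previous = ""
    for parts in partials:
        current = "".join(parts)
        delta = text_delta(previous, current)
        if delta:
            yield delta
        previous = current


if __name__ == "__main__":
    # Partial results copied from the first exchange in the sample output.
    sample = [
        ("Hello",),
        ("Hello! What can I do for you?",),
        ("Hello! What can I do for you?",),
    ]
    for chunk in stream_deltas(sample):
        print(chunk, end="", flush=True)
    print()
```

In the example script, the same idea could be applied by passing `event._message.parts` from each streamed event through `text_delta` instead of printing the whole tuple.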