© 2025 Direct Cursus Technology L.L.C.
In this article:

  • Configuring OpenAI to work with AI Studio
  • Use cases
  • Text generation
  • Function calling
  • Embeddings
  • Models

Compatibility with OpenAI

Written by
Yandex Cloud
Updated at October 24, 2025

The AI Studio API is compatible with the OpenAI API: it fully supports the Responses API, Realtime API, and Vector Store API, and partially supports the Completions API.

You can quickly adapt your applications designed to work with OpenAI by changing a few parameters in the query.

Use the Yandex Cloud ML SDK API and library to access all AI Studio features.

Configuring OpenAI to work with AI Studio

To use AI Studio's text generation models with the OpenAI libraries, change the base endpoint (base URL) and specify the service account's API key and the ID of its home folder.

Python

import openai

client = openai.OpenAI(
    api_key="<API_key_value>",
    base_url="<API_endpoint>",
    project="<folder_ID>"
)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<API_key_value>",
  project: "<folder_ID>",
  baseURL: "<API_endpoint>",
});

To use the Completions API, specify https://llm.api.cloud.yandex.net/v1.
For requests to the Responses API or Vector Store API, use https://rest-assistant.api.cloud.yandex.net/v1.
To create a voice agent and use the Realtime API via web sockets, specify wss://rest-assistant.api.cloud.yandex.net/v1/realtime/openai?model=gpt://<folder_ID>/speech-realtime-250923.
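
The endpoints above can be kept in one place so that the rest of the code picks a base URL by API name. The following is a minimal sketch: the dictionary keys and the `base_url_for`, `realtime_url`, and `client_kwargs` helpers are illustrative names introduced here, not part of AI Studio or the OpenAI SDK.

```python
# Endpoints copied from the text above; the dictionary keys are illustrative.
BASE_URLS = {
    "completions": "https://llm.api.cloud.yandex.net/v1",
    "responses": "https://rest-assistant.api.cloud.yandex.net/v1",
    "vector_store": "https://rest-assistant.api.cloud.yandex.net/v1",
}

# Model name taken from the Realtime API URL above.
REALTIME_MODEL = "speech-realtime-250923"


def base_url_for(api: str) -> str:
    """Return the base URL for an API name (hypothetical helper)."""
    try:
        return BASE_URLS[api]
    except KeyError:
        raise ValueError(f"Unknown API: {api!r}") from None


def realtime_url(folder_id: str) -> str:
    """Build the Realtime API WebSocket URL for a folder (hypothetical helper)."""
    return (
        "wss://rest-assistant.api.cloud.yandex.net/v1/realtime/openai"
        f"?model=gpt://{folder_id}/{REALTIME_MODEL}"
    )


def client_kwargs(api: str, api_key: str, folder_id: str) -> dict:
    """Collect the keyword arguments for openai.OpenAI(**kwargs)."""
    return {
        "api_key": api_key,
        "base_url": base_url_for(api),
        "project": folder_id,
    }
```

With the `openai` package installed, the result can be passed straight to the client constructor, for example `openai.OpenAI(**client_kwargs("completions", "<API_key_value>", "<folder_ID>"))`.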

How to get an API key for AI Studio.

Use cases

Before sending a request, specify in the model URI the ID of the folder in which you got the API key.
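
Going by the examples in this article, a model URI has the form `<scheme>://<folder_ID>/<model>/<version>`, such as `gpt://<folder_ID>/yandexgpt/latest` or `emb://<folder_ID>/text-search-doc/latest`. A small helper can build it and catch a forgotten placeholder early. This is a sketch only: `model_uri` is a hypothetical name, and the placeholder check is an assumption about how the `<folder_ID>` template is typically left unfilled.

```python
def model_uri(folder_id: str, model: str, version: str = "latest",
              scheme: str = "gpt") -> str:
    """Build a model URI such as gpt://<folder_ID>/yandexgpt/latest.

    Hypothetical helper; the URI layout is inferred from this article's examples.
    """
    # Reject an empty value or an unreplaced "<folder_ID>" placeholder.
    if not folder_id or folder_id.startswith("<"):
        raise ValueError("Specify a real folder ID instead of the placeholder")
    return f"{scheme}://{folder_id}/{model}/{version}"


# Text generation model and embedding model URIs for a made-up folder ID.
print(model_uri("my-folder", "yandexgpt"))
print(model_uri("my-folder", "text-search-doc", scheme="emb"))
```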

For examples of using the Responses API and Realtime API, refer to our Step-by-step guides.

Text generation

In OpenAI compatibility mode, the Completions API supports the following parameters: temperature, max_tokens, stream, and response_format.
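
When porting existing OpenAI code, it can help to see which request arguments fall outside this list before sending anything. The sketch below splits a dictionary of arguments into supported and unsupported ones. `split_params` is a hypothetical helper. Treating `model` and `messages` as always allowed is an assumption, and `tools` and `tool_choice` are included only because the function calling example in this article uses them.

```python
# Parameters listed above as supported in OpenAI compatibility mode.
SUPPORTED_PARAMS = {"temperature", "max_tokens", "stream", "response_format"}
# Assumption: these are always needed for a chat completion request.
REQUIRED_PARAMS = {"model", "messages"}
# Used by the function calling example in this article.
TOOL_PARAMS = {"tools", "tool_choice"}


def split_params(params: dict) -> tuple:
    """Return (kept, dropped): arguments to send and names left out."""
    allowed = SUPPORTED_PARAMS | REQUIRED_PARAMS | TOOL_PARAMS
    kept = {name: value for name, value in params.items() if name in allowed}
    dropped = sorted(name for name in params if name not in allowed)
    return kept, dropped


kept, dropped = split_params({
    "model": "gpt://my-folder/yandexgpt/latest",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.3,
    "top_p": 0.9,  # not in the supported list above
})
print("Not sent:", dropped)
```

The `kept` dictionary can then be passed as `client.chat.completions.create(**kept)`.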

Python
  • Streaming response processing:

    # Install OpenAI SDK using pip
    # pip install openai 
    import openai
    
    YANDEX_CLOUD_FOLDER = "<folder_ID>"
    YANDEX_CLOUD_API_KEY = "<API_key_value>"
    
    client = openai.OpenAI(
        api_key=YANDEX_CLOUD_API_KEY,
        base_url="https://llm.api.cloud.yandex.net/v1",
        project=YANDEX_CLOUD_FOLDER
    )
    
    response = client.chat.completions.create(
        model=f"gpt://{YANDEX_CLOUD_FOLDER}/yandexgpt/latest",
        messages=[
            {"role": "system", "content": "You are a very smart assistant."},
            {"role": "user", "content": "What can large language models do?"}
        ],
        max_tokens=2000,
        temperature=0.3,
        stream=True
    )
    
    for chunk in response:
        # Skipping chunks that carry no choices or no text
        if chunk.choices and chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")
    
  • Structured response:

    import openai
    
    YANDEX_CLOUD_FOLDER = "<folder_ID>"
    YANDEX_CLOUD_API_KEY = "<API_key_value>"
    
    client = openai.OpenAI(
        api_key=YANDEX_CLOUD_API_KEY,
        base_url="https://llm.api.cloud.yandex.net/v1",
        project=YANDEX_CLOUD_FOLDER
    )
    
    json_schema = {
        "type": "object",
        "properties": {
            "skyscraper_name": {"type": "string", "description": "Skyscraper name."},
            "skyscraper_height": {"type": "integer", "description": "Skyscraper height in meters."},
        },
        "required": ["skyscraper_name", "skyscraper_height"]
    }
    
    response = client.chat.completions.create(
        model=f"gpt://{YANDEX_CLOUD_FOLDER}/yandexgpt/rc",
        messages=[
            {"role": "user", "content": "Shanghai Tower (Shanghai, China): 632 meters, 127 floors."}
        ],
        max_tokens=200,
        temperature=0.3,
        stream=False,
        response_format={"type": "json_schema", "json_schema": json_schema}
    )
    print(response)
    
Node.js
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<API_key_value>",
  project: "<folder_ID>",
  baseURL: "https://llm.api.cloud.yandex.net/v1",
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt://<folder_ID>/yandexgpt/latest",
    messages: [
      { role: "system", content: "You are a very smart assistant." },
      { role: "user", content: "What can large language models do?" },
    ],
  });

  console.log(completion.choices[0]);
}

main();

cURL
curl https://llm.api.cloud.yandex.net/v1/chat/completions \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <API_key>" \
  --header "OpenAI-Project: <folder_ID>" \
  --data '{
    "model": "gpt://<folder_ID>/yandexgpt/latest",
    "messages": [
      {
        "role": "system",
        "content": "You are a very smart assistant."
      },
      {
        "role": "user",
        "content": "What can large language models do?"
      }
    ]
  }'
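
The structured response example in the Python tab prints the whole response object. To work with the extracted fields, the JSON text has to be parsed and checked. The sketch below assumes that the model returns the JSON object as a string in `response.choices[0].message.content`. `parse_skyscraper` is a hypothetical helper, and the sample string is made up for illustration.

```python
import json

# Keys marked as required in the JSON schema of the structured response example.
REQUIRED_KEYS = ("skyscraper_name", "skyscraper_height")


def parse_skyscraper(content: str) -> dict:
    """Parse the model's JSON text and check the required fields."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    height = data["skyscraper_height"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(height, bool) or not isinstance(height, int):
        raise ValueError("skyscraper_height must be an integer")
    return data


# Made-up sample of what the message content might look like.
sample = '{"skyscraper_name": "Shanghai Tower", "skyscraper_height": 632}'
info = parse_skyscraper(sample)
print(info["skyscraper_name"], info["skyscraper_height"])
```

In the example, the call would be `parse_skyscraper(response.choices[0].message.content)`.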

Function calling

Before running the example, specify the folder ID and Yandex Cloud API key. The tool_choice parameter with auto and none values is supported.

Python
import openai
import json

YANDEX_CLOUD_FOLDER = "<folder_ID>"
YANDEX_CLOUD_API_KEY = "<API_key_value>"

client = openai.OpenAI(
    api_key=YANDEX_CLOUD_API_KEY,
    base_url="https://llm.api.cloud.yandex.net/v1",
    project=YANDEX_CLOUD_FOLDER
)

# Weather function
def get_current_weather(location):
    return {"location": location, "temperature": -22, "weather_condition": "Sunny"}

# Calculator function
def calculator(a, b):
    return a + b

def run_conversation(user_input):
    selected_model = f"gpt://{YANDEX_CLOUD_FOLDER}/yandexgpt/rc"

    # Defining functions
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Getting current weather for the specified location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "Location"
                        }
                    },
                    "required": ["location"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "calculator",
                "description": "Adding two numbers",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "a": {
                            "type": "integer",
                            "description": "First number"
                        },
                        "b": {
                            "type": "integer",
                            "description": "Second number"
                        }
                    },
                    "required": ["a", "b"]
                }
            }
        }
    ]

    # Running a query
    response = client.chat.completions.create(
        model=selected_model,
        messages=[
            {"role": "user", "content": user_input}
        ],
        tool_choice="auto",
        tools=tools
    )

    # Model response
    message = response.choices[0].message
    print(message)

    # Calling model-requested functions
    if message.tool_calls:
        # Array of messages to send execution results
        new_messages = [
            {"role": "user", "content": user_input},
            message
        ]

        # Populating the result for each function call
        for tool_call in message.tool_calls:

            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)

            if function_name == "get_weather":
                # The tool schema declares a single "location" argument
                function_response = get_current_weather(function_args.get("location"))
                new_messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(function_response)
                })

            if function_name == "calculator":
                function_response = calculator(function_args.get("a"), function_args.get("b"))
                new_messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(function_response)
                })

        second_response = client.chat.completions.create(
            model=selected_model,
            messages=new_messages,
            tools=tools
        )

        # Model response with information from function calls
        return second_response.choices[0].message.content

    # No functions were called, returning the original response
    return message.content


if __name__ == "__main__":
    result = run_conversation("2+2 and weather in moscow")
    print(result)
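
As the number of tools grows, the chain of `if function_name == ...` checks in the example can be replaced with a lookup table. The sketch below is one possible way to do that, not the method the service requires. `HANDLERS` and `tool_messages` are hypothetical names, and `SimpleNamespace` objects stand in for the `message.tool_calls` items so that the code runs without a network call.

```python
import json
from types import SimpleNamespace


def get_current_weather(location):
    return {"location": location, "temperature": -22, "weather_condition": "Sunny"}


def calculator(a, b):
    return a + b


# Maps the tool name declared in `tools` to a callable taking the parsed arguments.
HANDLERS = {
    "get_weather": lambda args: get_current_weather(args.get("location")),
    "calculator": lambda args: calculator(args.get("a"), args.get("b")),
}


def tool_messages(tool_calls):
    """Run each requested function and build the "tool" messages to send back."""
    messages = []
    for call in tool_calls:
        args = json.loads(call.function.arguments)
        handler = HANDLERS.get(call.function.name)
        if handler is None:
            # Report unknown functions to the model instead of raising.
            result = {"error": f"Unknown function: {call.function.name}"}
        else:
            result = handler(args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return messages


# Stand-ins for message.tool_calls, for illustration only.
fake_calls = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(
        name="calculator", arguments='{"a": 2, "b": 2}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(
        name="get_weather", arguments='{"location": "Moscow"}')),
]
for item in tool_messages(fake_calls):
    print(item)
```

In `run_conversation`, the loop over `message.tool_calls` could then become `new_messages.extend(tool_messages(message.tool_calls))`.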

Embeddings

AI Studio supports embeddings for single strings with encoding_format set to float.

Python
import openai
import numpy as np
from scipy.spatial.distance import cdist

YANDEX_CLOUD_FOLDER = "<folder_ID>"
YANDEX_CLOUD_API_KEY = "<API_key_value>"

client = openai.OpenAI(
    api_key=YANDEX_CLOUD_API_KEY,
    base_url="https://llm.api.cloud.yandex.net/v1",
    project=YANDEX_CLOUD_FOLDER
)

# Function for getting the embedding of a single text
def get_embedding(text, model):
    # Collapsing line breaks and repeated whitespace into single spaces
    fixed_text = get_trimmed_text(text)
    response = client.embeddings.create(
        input=fixed_text,
        model=model,
        encoding_format="float",
    )
    return response.data[0].embedding

# Method for getting document embeddings
def get_doc_embeddings(texts):
    doc_embeddings = []
    for text in texts:
        embedding = get_embedding(text, model=f"emb://{YANDEX_CLOUD_FOLDER}/text-search-doc/latest")
        doc_embeddings.append(embedding)
    return doc_embeddings

# Method for getting query embeddings
def get_query_embedding(text):
    embedding = get_embedding(text, model=f"emb://{YANDEX_CLOUD_FOLDER}/text-search-query/latest")
    return np.array(embedding)

# Helper method for removing line breaks
def get_trimmed_text(text):
    return ' '.join(text.split())

def main():
    # Document for search as an array of texts
    doc_texts = [
        """Alexander Sergeyevich Pushkin (May 26 [June 6], 1799, Moscow – January 29 [February 10], 1837, St. Petersburg)
        was a Russian poet, playwright, and novelist, the progenitor of Russian realism,
        a literary critic and theorist, historian, essay writer, and journalist.""",
        """Pushkin repeatedly wrote about his ancestry in poems and prose, seeing in his ancestors an example of true
        aristocracy, an ancient lineage that faithfully served the fatherland yet never gained rulers' favor and was
        persecuted. He often referred to, also through literary forms, the image of his maternal great-grandfather
        of African origin, Abraham Petrovich Gannibal, who became a servant and ward of Peter I, later a military engineer and
        general""",
    ]

    # Search query text
    query_text = "when is Pushkin's birthday?"

    # Getting document embeddings
    doc_embedding = get_doc_embeddings(doc_texts)
    # Getting a query embedding
    query_embedding = get_query_embedding(query_text)
    # Calculating cosine distance
    cosine_distance = cdist([query_embedding], doc_embedding, metric="cosine")
    # Calculating similarity
    cosine_similarity = 1 - cosine_distance
    # Calculating the index of the most relevant text
    argmax = np.argmax(cosine_similarity)
    # Getting text by index
    result = doc_texts[argmax]

    print(get_trimmed_text(result))


if __name__ == "__main__":
    main()
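
The ranking step in the example relies on `scipy.spatial.distance.cdist`. If SciPy is not available, the same cosine similarity can be computed with NumPy alone. The sketch below uses small made-up vectors in place of real embeddings; `cosine_similarities` and `most_relevant` are hypothetical helper names.

```python
import numpy as np


def cosine_similarities(query, docs):
    """Cosine similarity between one query vector and each document vector."""
    q = np.asarray(query, dtype=float)
    d = np.asarray(docs, dtype=float)
    # Dot product of each row with the query, divided by the product of norms.
    return (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))


def most_relevant(query, docs):
    """Index of the document whose embedding is closest to the query."""
    return int(np.argmax(cosine_similarities(query, docs)))


# Toy two-dimensional vectors standing in for real embeddings.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [2.0, 0.1]]
print(most_relevant(toy_query, toy_docs))
```

In the example, `most_relevant(query_embedding, doc_embedding)` would take the place of the `cdist` and `np.argmax` lines.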

Models

AI Studio supports a method for getting a list of available models:

Python
import openai

YANDEX_CLOUD_FOLDER = "<folder_ID>"
YANDEX_CLOUD_API_KEY = "<API_key_value>"

client = openai.OpenAI(
    api_key=YANDEX_CLOUD_API_KEY,
    base_url="https://llm.api.cloud.yandex.net/v1",
    project=YANDEX_CLOUD_FOLDER
)
models = client.models.list()
print(models.data)
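
`models.data` is a list of model objects, each with an `id` attribute in the OpenAI SDK. To print only the identifiers, possibly narrowed to one kind of model, a helper along these lines may be enough. It is a sketch under an assumption: that the returned IDs are model URIs such as `gpt://...` or `emb://...`, which this article does not confirm. `model_ids` is a hypothetical name, and the sample objects are made up.

```python
from types import SimpleNamespace


def model_ids(models, scheme=None):
    """Return sorted model IDs, optionally only those starting with `<scheme>://`."""
    ids = sorted(model.id for model in models)
    if scheme is not None:
        ids = [model_id for model_id in ids if model_id.startswith(f"{scheme}://")]
    return ids


# Made-up stand-ins for the objects in models.data.
sample_models = [
    SimpleNamespace(id="gpt://my-folder/yandexgpt/latest"),
    SimpleNamespace(id="emb://my-folder/text-search-doc/latest"),
]
print(model_ids(sample_models, scheme="gpt"))
```

With a real client, the call would be `model_ids(client.models.list().data)`.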
