© 2026 Direct Cursus Technology L.L.C.

Sending a request to a multimodal model

Written by
Yandex Cloud
Updated on January 29, 2026

In AI Studio, you can send requests to multimodal models that analyze images and respond with text. Images must be Base64-encoded.
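
Base64-encoded images are passed to the model as data URLs of the form data:<MIME_type>;base64,<payload>. The following is a minimal sketch of one way to build such a URL, assuming the MIME type can be inferred from the file extension. The image_to_data_url name is hypothetical and not part of AI Studio or the OpenAI library.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path: str) -> str:
    """Return the image file as a Base64 data URL (hypothetical helper)."""
    # Assumption: the file extension reliably identifies the image format.
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {image_path!r}")
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

Deriving the MIME type from the file name keeps the declared type consistent with the actual image format, for example image/png for a .png file.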

Note

AI Playground does not support multimodal models yet.

To complete the steps from this example, create a service account with the ai.languageModels.user role and get an API key with the yc.ai.foundationModels.execute scope.
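
Once you have the API key and folder ID, you may prefer not to hardcode them in the script. A minimal sketch, assuming the values are stored in environment variables named YC_API_KEY and YC_FOLDER_ID; both the variable names and the load_credentials function are illustrative assumptions, not something AI Studio requires.

```python
import os
from typing import Mapping


def load_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (api_key, folder_id) read from the given environment mapping."""
    # Assumption: these variable names are chosen for illustration only.
    required = ("YC_API_KEY", "YC_FOLDER_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["YC_API_KEY"], env["YC_FOLDER_ID"]
```

Calling load_credentials() with no arguments reads the current process environment; passing a dictionary makes the function easy to check without touching real credentials.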

Python
from openai import OpenAI
import base64

YC_API_KEY = "<API_key>"
YC_FOLDER_ID = "<folder_ID>"

client = OpenAI(
    api_key=YC_API_KEY,
    base_url="https://ai.api.cloud.yandex.net/v1",
)


# Helper function that encodes an image file as a Base64 string
def image_to_base64(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')


# Images for comparison
image1_base64 = image_to_base64("image1.png")
image2_base64 = image_to_base64("image2.png")

# In this example, we use Gemma 3 27B it
response = client.chat.completions.create(
    model=f"gpt://{YC_FOLDER_ID}/gemma-3-27b-it",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Compare these two images"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        # The MIME type must match the file format (PNG here)
                        "url": f"data:image/png;base64,{image1_base64}"
                    }
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image2_base64}"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Where:

  • YC_API_KEY: Service account API key you obtained.
  • YC_FOLDER_ID: ID of the folder that contains the service account.
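
The request above puts one text part and two image parts into a single user message. A minimal sketch of how the same structure could be assembled for any number of images; build_image_message is a hypothetical helper, and the dictionary layout simply mirrors the one used in the example.

```python
from typing import Any


def build_image_message(prompt: str, data_urls: list[str]) -> dict[str, Any]:
    """Build one user message with a text part followed by image parts."""
    # The text part comes first, as in the example request.
    content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
    # Each image is added as a separate image_url part holding a data URL.
    for url in data_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}
```

The result can be passed as the only element of the messages list, for example messages=[build_image_message("Compare these two images", urls)], where urls holds the Base64 data URLs of the images.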
Model response example
**First image:**

*   **Object:** Little penguin.
*   **Properties:** A fluffy little penguin with a cute face. It is holding a laptop.
*   **Background:** A white snow-covered landscape.

**Second image:**

*   **Object:** A raccoon wrapped in a white bath towel.
*   **Properties:** The raccoon looks thoughtful, slightly saddened.
*   **Background:** Resembles a bathroom or another utility room.

**Main differences:**

*   **Animal species:** Penguins and raccoons are completely different species living in different habitats.
*   **Scenery:** One image was taken outdoors (snow), the other indoors.
*   **Actions:** The little penguin seems to be working or just looking at its laptop, while the raccoon seems to be resting after a bath.

On the whole, both images are really cute and emotionally positive; however, they depict completely different scenes and animals.
