© 2025 Direct Cursus Technology L.L.C.

In this article:

  • reasoning_options parameter
  • reasoning_effort parameter

Reasoning mode in generative models

Written by
Yandex Cloud
Updated at October 24, 2025

Generative models do not always cope well with tasks that require reasoning, i.e., breaking a task into steps and performing a chain of successive computations in which the result of each step provides the input for the next.

You can improve the accuracy of the model's responses by forcing the model to reason, generating its answer from such chains of intermediate computations. You can enable this either with a prompt or with a dedicated generation parameter.
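As a minimal sketch of the prompt-based approach, the helper below builds a completion request whose system prompt instructs the model to work through the task step by step. The request shape mirrors the API example later in this article; the helper name and the prompt wording are illustrative, not part of the API.

```python
# Sketch: inducing reasoning through the prompt alone, without any
# special generation parameter. The payload mirrors the completion
# request format used elsewhere in this article; <folder_ID> is a
# placeholder for your folder ID.

def build_step_by_step_request(folder_id: str, question: str) -> dict:
    """Build a completion request that asks the model to reason step by step."""
    return {
        "modelUri": f"gpt://{folder_id}/yandexgpt",
        "completionOptions": {
            "stream": False,
            "temperature": 0.1,
            "maxTokens": "1000",
        },
        "messages": [
            {
                "role": "system",
                "text": (
                    "Solve the task step by step. Write out every "
                    "intermediate computation before giving the final answer."
                ),
            },
            {"role": "user", "text": question},
        ],
    }

payload = build_step_by_step_request(
    "<folder_ID>", "A train covers 120 km in 1.5 hours. What is its average speed?"
)
print(payload["modelUri"])  # → gpt://<folder_ID>/yandexgpt
```

Sending this payload to the completion endpoint works exactly like the parameter-based example below, only the reasoning is requested in the prompt text itself.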

reasoning_options parameter

You can configure the reasoning mode using the reasoning_options parameter when accessing models that support it through the API or SDK. The reasoning_options parameter can take the following values:

  • DISABLED: Reasoning mode is disabled. This is the default: if the reasoning_options parameter is omitted from the request, the model does not use reasoning.
  • ENABLED_HIDDEN: Reasoning mode is enabled. Each model decides on its own whether to apply reasoning to a particular request. Even if the model does reason while generating a response, the response will not include the model's actual chain of reasoning.

Example of a request configured with reasoning mode enabled:

SDK
model = sdk.models.completions('yandexgpt')
result = model.configure(
    reasoning_mode='enabled_hidden',
).run("Request text")

API
{
  "modelUri": "gpt://<folder_ID>/yandexgpt",
  "completionOptions": {
    "stream": false,
    "temperature": 0.1,
    "maxTokens": "1000",
    "reasoningOptions": {
      "mode": "ENABLED_HIDDEN"
    }
  },
  "messages": [...]
}

Reasoning mode may increase the amount of computation and the total number of tokens spent on the request: if reasoning was used, the model's response will contain a reasoningTokens field with a non-zero value.
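As a sketch of how you might check for this, the helper below reads the reasoning token count from a parsed completion response. The exact placement of the reasoningTokens field within the usage statistics is an assumption here; verify it against the API reference.

```python
# Sketch: detecting whether reasoning was used, based on the usage
# statistics of a completion response. The response layout below is an
# assumption based on this article; check the API reference for the
# authoritative schema.

def reasoning_tokens(response: dict) -> int:
    """Return the reasoning token count reported in a completion response, or 0."""
    usage = response.get("result", {}).get("usage", {})
    # Look for the field both at the top level of usage and nested in
    # completion token details, since the exact placement may vary.
    details = usage.get("completionTokensDetails", {})
    return int(details.get("reasoningTokens", usage.get("reasoningTokens", 0)))

# Example with a hypothetical (hand-written) response:
sample = {
    "result": {
        "usage": {
            "inputTextTokens": "21",
            "completionTokens": "80",
            "totalTokens": "101",
            "completionTokensDetails": {"reasoningTokens": "15"},
        }
    }
}
print(reasoning_tokens(sample))  # → 15
```

A zero result means the model answered without reasoning even though the mode was enabled, which is allowed under ENABLED_HIDDEN.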

The reasoning mode is available in the YandexGPT Pro model via the reasoning_options parameter.

reasoning_effort parameter

The reasoning_effort parameter determines how many reasoning tokens the model should generate before producing the final response to a query.

Supported values:

  • low: Prioritizes speed and saves tokens.
  • medium: Balances reasoning speed and accuracy.
  • high: Prioritizes complete and thorough reasoning.

Example of using the reasoning_effort parameter:

Python
# Install the OpenAI SDK using pip:
# pip install openai
from openai import OpenAI

YANDEX_CLOUD_FOLDER = "<folder_ID>"
YANDEX_CLOUD_API_KEY = "<API_key_value>"

def run():
    client = OpenAI(
        api_key=YANDEX_CLOUD_API_KEY,
        base_url="https://llm.api.cloud.yandex.net/v1",
        project=YANDEX_CLOUD_FOLDER,
    )

    response = client.chat.completions.create(
        model=f"gpt://{YANDEX_CLOUD_FOLDER}/gpt-oss-120b",
        # or
        # model=f"gpt://{YANDEX_CLOUD_FOLDER}/gpt-oss-20b",
        messages=[
            {
                "role": "developer",
                "content": "You are a very smart assistant.",
            },
            {
                "role": "user",
                "content": "What is under the hood of LLM?",
            },
        ],
        reasoning_effort="low",
    )

    print(response.choices[0].message.content)

if __name__ == "__main__":
    run()
