© 2026 Direct Cursus Technology L.L.C.

Creating a Telegram bot for text recognition in images, speech synthesis, and audio recognition

Written by
Yandex Cloud
Updated at April 28, 2026
  • Getting started
    • Required paid resources
  • Set up the required resources
  • Register your Telegram bot
  • Create a function
  • Create an API gateway
  • Configure a link between the function and the Telegram bot
  • Test your bot
  • How to delete the resources you created

In this tutorial, you will learn how to create a Telegram bot that can:

  • Convert text messages to speech and transcribe voice messages using the Yandex SpeechKit Python SDK.
  • Recognize text in images with Yandex Vision OCR.

The bot authenticates with Yandex Cloud services as a service account, using an IAM token. The handler function, which manages user interaction with the bot, reads the token from its invocation context.
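
As a minimal sketch of that step, the snippet below reads the token the same way the tutorial's handler does, from context.token["access_token"], and turns it into an Authorization header. The auth_headers helper and the stand-in context object are illustrative assumptions so the snippet can run locally; they are not part of the tutorial code.

```python
from types import SimpleNamespace


def auth_headers(context):
    """Build an Authorization header from the IAM token in the function context."""
    iam_token = context.token["access_token"]
    return {"Authorization": f"Bearer {iam_token}"}


# Stand-in for the context object that Cloud Functions passes to the handler.
fake_context = SimpleNamespace(token={"access_token": "t1.example"})
print(auth_headers(fake_context))
```

In the deployed function, the real context takes the place of the stand-in, and the token belongs to the service account linked to the function version.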

An API gateway created in Yandex API Gateway will accept webhook requests from Telegram and forward them to the handler function in Yandex Cloud Functions for processing.

To create a bot:

  1. Get your cloud ready.
  2. Set up required resources.
  3. Register your Telegram bot.
  4. Create a function.
  5. Create an API gateway.
  6. Bind the handler function to the bot.
  7. Test the bot.

If you no longer need the resources you created, delete them.

Getting started

Sign up for Yandex Cloud and create a billing account:

  1. Navigate to the management console and log in to Yandex Cloud or create a new account.
  2. On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.

If you have an active billing account, you can create or select a folder for your infrastructure on the cloud page.

Learn more about clouds and folders here.

Required paid resources

The cost of Telegram bot support includes:

  • Fee for using SpeechKit (see SpeechKit pricing).
  • Fee for using Vision OCR (see Vision OCR pricing).
  • Fees based on the number of function calls, computing resources allocated for function execution, and outbound traffic (see Cloud Functions pricing).
  • Fees based on the API gateway request count and outbound traffic (see API Gateway pricing).

Set up the required resources

  1. Create a service account named recognizer-bot-sa and assign it the ai.editor and functions.editor roles for your folder.

  2. Download the FFmpeg package archive to ensure the SpeechKit Python SDK works correctly in the function runtime environment.

  3. Extract the ffmpeg and ffprobe binary files from the archive and make them executable by running the following commands:

    chmod +x ffmpeg
    chmod +x ffprobe
    
  4. Create a ZIP archive containing the function code:

    1. Create a file named index.py and paste the following code into it.

      index.py
      import logging
      import requests
      import telebot
      import json
      import os
      import base64
      from speechkit import model_repository, configure_credentials, creds
      from speechkit.stt import AudioProcessingType
      
      
      folder_id = ""
      iam_token = ''
      
      # Image recognition service endpoint and authentication data
      
      API_TOKEN = os.environ['TELEGRAM_TOKEN']
      vision_url = 'https://ocr.api.cloud.yandex.net/ocr/v1/recognizeText'
      
      # Adding the ffmpeg directory to the system PATH
      
      path = os.environ.get("PATH")
      os.environ["PATH"] = path + ':/function/code'
      
      logger = telebot.logger
      telebot.logger.setLevel(logging.INFO)
      bot = telebot.TeleBot(API_TOKEN, threaded=False)
      
      # Getting the folder ID
      
      def get_folder_id(iam_token, version_id):
          headers = {'Authorization': f'Bearer {iam_token}'}
          function_id_req = requests.get(f'https://serverless-functions.api.cloud.yandex.net/functions/v1/versions/{version_id}',
                                         headers=headers)
          function_id_data = function_id_req.json()
          function_id = function_id_data['functionId']
          folder_id_req = requests.get(f'https://serverless-functions.api.cloud.yandex.net/functions/v1/functions/{function_id}',
                                       headers=headers)
          folder_id_data = folder_id_req.json()
          folder_id = folder_id_data['folderId']
          return folder_id
      
      def process_event(event):
          request_body_dict = json.loads(event['body'])
          update = telebot.types.Update.de_json(request_body_dict)
      
          bot.process_new_updates([update])
      
      def handler(event, context):
          global iam_token, folder_id
          iam_token = context.token["access_token"]
          version_id = context.function_version
          folder_id = get_folder_id(iam_token, version_id)
      
          # Authenticating in SpeechKit with an IAM token
          configure_credentials(
              yandex_credentials=creds.YandexCredentials(
                  iam_token=iam_token
              )
          )
      
          process_event(event)
          return {
              'statusCode': 200
          }
      
      # Command and message handlers
      
      @bot.message_handler(commands=['help', 'start'])
      def send_welcome(message):
          bot.reply_to(message,
                       "The bot can do the following:\n* Recognize text in images.\n* Generate voice messages from text.\n* Convert voice messages to text.")
      
      @bot.message_handler(func=lambda message: True, content_types=['text'])
      def echo_message(message):
          export_path = '/tmp/audio.ogg'
          synthesize(message.text, export_path)
          with open(export_path, 'rb') as voice:
              bot.send_voice(message.chat.id, voice)
      
      @bot.message_handler(func=lambda message: True, content_types=['voice'])
      def echo_audio(message):
          file_id = message.voice.file_id
          file_info = bot.get_file(file_id)
          downloaded_file = bot.download_file(file_info.file_path)
          response_text = audio_analyze(downloaded_file)
          bot.reply_to(message, response_text)
      
      @bot.message_handler(func=lambda message: True, content_types=['photo'])
      def echo_photo(message):
          file_id = message.photo[-1].file_id
          file_info = bot.get_file(file_id)
          downloaded_file = bot.download_file(file_info.file_path)
          image_data = base64.b64encode(downloaded_file).decode('utf-8')
          response_text = image_analyze(vision_url, iam_token, folder_id, image_data)
          bot.reply_to(message, response_text)
      
      # Image recognition
      
      def image_analyze(vision_url, iam_token, folder_id, image_data):
          response = requests.post(vision_url, headers={'Authorization': 'Bearer '+iam_token, 'x-folder-id': folder_id}, json={
              "mimeType": "image",
              "languageCodes": ["en", "ru"],
              "model": "page",
              "content": image_data
              })
          blocks = response.json()['result']['textAnnotation']['blocks']
          text = ''
          for block in blocks:
              for line in block['lines']:
                  for word in line['words']:
                      text += word['text'] + ' '
                  text += '\n'
          return text
      
      # Speech recognition
      
      def audio_analyze(audio_data):
          model = model_repository.recognition_model()
      
          # Recognition settings
          model.model = 'general'
          model.language = 'ru-RU'
          model.audio_processing_type = AudioProcessingType.Full
      
          result = model.transcribe(audio_data)
          speech_text = [res.normalized_text for res in result]
          return ' '.join(speech_text)
      
      # Speech synthesis
      
      def synthesize(text, export_path):
          model = model_repository.synthesis_model()
      
          # Synthesis settings
          model.voice = 'kirill'
      
          result = model.synthesize(text, raw_format=False)
          result.export(export_path, 'ogg')
      
    2. Create a file named requirements.txt. In this file, specify the bot library and the Python SDK library:

      pyTelegramBotAPI==4.27
      yandex-speechkit==1.5.0
      
    3. Add the index.py, requirements.txt, ffmpeg, and ffprobe files to index.zip.

  5. Create an Object Storage bucket and upload your ZIP archive to it.
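
The archive from step 4 can also be assembled with a short script. The sketch below is one possible way to do it with the standard zipfile module; the build_archive helper name is an assumption for illustration. ZipFile.write stores each file's mode in the archive, so the executable bit set with chmod +x is recorded, although whether it is restored depends on the tool that later unpacks the archive.

```python
import os
import zipfile


def build_archive(src_dir, archive_path,
                  names=("index.py", "requirements.txt", "ffmpeg", "ffprobe")):
    """Pack the listed files from src_dir into a ZIP archive, keeping permission bits."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # arcname places each file at the archive root, next to index.py.
            zf.write(os.path.join(src_dir, name), arcname=name)
    return archive_path
```

For example, calling build_archive(".", "index.zip") in the directory that holds the four files produces the archive to upload to the bucket.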

Register your Telegram bot

Register your bot in Telegram and get its token.

  1. Launch BotFather and send it the following command:

    /newbot
    
  2. In the name field, specify the new bot’s name. This is the name users will see when chatting with the bot.

  3. In the username field, specify the new bot’s username. You can use it to find the bot in Telegram. The username must end with ...Bot or ..._bot.

    Once the bot is created, BotFather will send you its token. Save it, as you will need it later.
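
Before sending a username to BotFather, you can check it locally. The sketch below encodes the rule stated above (the username ends with Bot or _bot) together with an assumption about Telegram's general username format: 5 to 32 characters, Latin letters, digits, and underscores only. The helper name is illustrative, and BotFather remains the final authority on whether a username is accepted.

```python
import re

# Assumption: 5-32 characters, Latin letters, digits, and underscores only.
_USERNAME_RE = re.compile(r"[A-Za-z0-9_]{5,32}")


def looks_like_bot_username(username):
    """Return True if the username has the expected shape and ends with 'bot'."""
    return bool(_USERNAME_RE.fullmatch(username)) and username.lower().endswith("bot")


print(looks_like_bot_username("recognizer_bot"))
```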

Create a function

Create a function that will handle user actions in the chat.

Management console
  1. In the management console, select the folder where you want to create your function.

  2. Go to Cloud Functions.

  3. Create a function:

    1. Click Create function.
    2. Specify the function name: for-recognizer-bot.
    3. Click Create.
  4. Create a function version:

    1. Select Python as the runtime environment, disable Add files with code examples, and click Continue.

    2. Specify the upload method Object Storage and select the bucket you created earlier. In the Object field, specify the file name: index.zip.

    3. Specify the entry point: index.handler.

    4. Under Parameters, specify:

      • Timeout: 30 seconds.

      • Memory: 256 MB.

      • Service account: recognizer-bot-sa.

      • Environment variables:

        • TELEGRAM_TOKEN: Your Telegram bot token.
    5. Click Save changes.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.

  1. Create a function named for-recognizer-bot:

    yc serverless function create --name=for-recognizer-bot
    

    Result:

    id: b09bhaokchn9********
    folder_id: aoek49ghmknn********
    created_at: "2023-03-21T10:03:37.475Z"
    name: for-recognizer-bot
    log_group_id: eolm8aoq9vcp********
    http_invoke_url: https://functions.yandexcloud.net/b09bhaokchn9********
    status: ACTIVE
    
  2. Create a version of the for-recognizer-bot function:

    yc serverless function version create \
      --function-name for-recognizer-bot \
      --memory=256m \
      --execution-timeout=30s \
      --runtime=python312 \
      --entrypoint=index.handler \
      --service-account-id=<service_account_ID> \
      --environment TELEGRAM_TOKEN=<bot_token> \
      --package-bucket-name=<bucket_name> \
      --package-object-name=index.zip
    

    Where:

    • --function-name: Name of the function whose version you are creating.
    • --memory: Amount of RAM.
    • --execution-timeout: Maximum function runtime before timeout.
    • --runtime: Runtime environment.
    • --entrypoint: Entry point.
    • --service-account-id: recognizer-bot-sa service account ID.
    • --environment: Environment variables.
    • --package-bucket-name: Bucket name.
    • --package-object-name: Key of the index.zip file in the bucket.

    Result:

    done (1s)
    id: d4e6qqlh53nu********
    function_id: d4emc80mnp5n********
    created_at: "2025-03-22T16:49:41.800Z"
    runtime: python312
    entrypoint: index.handler
    resources:
      memory: "268435456"
    execution_timeout: 30s
    service_account_id: aje20nhregkc********
    image_size: "4096"
    status: ACTIVE
    tags:
      - $latest
    log_group_id: ckgmc3l93cl0********
    environment:
      TELEGRAM_TOKEN: <bot_token>
    log_options:
      folder_id: b1g86q4m5vej********
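
The --memory=256m flag and the memory: "268435456" value in the result describe the same amount, with the result expressed in bytes. A small sketch of the conversion follows; the helper name is illustrative.

```python
def megabytes_to_bytes(megabytes):
    """Convert binary megabytes (1 MB = 1024 * 1024 bytes) to bytes."""
    return megabytes * 1024 * 1024


print(megabytes_to_bytes(256))  # 268435456
```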
    

Terraform

With Terraform, you can quickly create a cloud infrastructure in Yandex Cloud and manage it using configuration files. These files store the infrastructure description written in HashiCorp Configuration Language (HCL). If you change the configuration files, Terraform automatically detects which part of your configuration is already deployed, and what should be added or removed.

Terraform is distributed under the Business Source License. The Yandex Cloud provider for Terraform is distributed under the MPL-2.0 license.

For more information about the provider resources, see the relevant documentation on the Terraform website or its mirror.

If you do not have Terraform yet, install it and configure the Yandex Cloud provider.

  1. Describe your function parameters in the configuration file:

    resource "yandex_function" "for-recognizer-bot-function" {
      name               = "for-recognizer-bot"
      user_hash          = "first function"
      runtime            = "python312"
      entrypoint         = "index.handler"
      memory             = "256"
      execution_timeout  = "30"
      service_account_id = "aje20nhregkcvu******"
      environment = {
        TELEGRAM_TOKEN = "<bot_token>"
      }
      package {
        bucket_name = "<bucket_name>"
        object_name = "index.zip"
      }
    }
    

    Where:

    • name: Function name.
    • user_hash: User-defined string that identifies the function version.
    • runtime: Function runtime environment.
    • entrypoint: Entry point.
    • memory: Amount of memory allocated for the function, in MB.
    • execution_timeout: Function runtime timeout.
    • service_account_id: recognizer-bot-sa service account ID.
    • environment: Environment variables.
    • package: Bucket name and object name of the previously uploaded index.zip archive with the function source code.

    For more information about yandex_function resource properties, see this provider guide.

  2. Validate your configuration files.

    1. In the terminal, navigate to the directory where you created your configuration file.

    2. Run a check using the following command:

      terraform plan
      

    If your configuration is correct, the terminal will display a list of the resources to be created and their settings. Otherwise, Terraform will show any detected errors.

  3. Deploy the cloud resources.

    1. If the configuration is correct, run this command:

      terraform apply
      
    2. To confirm the function creation, type yes in the terminal and press Enter.

API

To create a function, use the create REST API method for the Function resource or the FunctionService/Create gRPC API call.

To create a function version, use the createVersion REST API method for the Function resource or the FunctionService/CreateVersion gRPC API call.

Create an API gateway

The Telegram server will notify your bot of new messages via a webhook. The API gateway will receive these requests and forward them to the for-recognizer-bot function for processing.
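
To make the data flow concrete, the sketch below shows the shape of the request the function ends up handling: an event whose body field holds the JSON-encoded Telegram update, which is what the tutorial's process_event function passes to json.loads. The sample event is hand-written for illustration and carries far fewer fields than a real update; the extract_message helper is an assumption and is not part of the tutorial code.

```python
import json


def extract_message(event):
    """Return (chat_id, text) from an event whose body is a JSON-encoded Telegram update."""
    update = json.loads(event["body"])
    message = update.get("message", {})
    chat_id = message.get("chat", {}).get("id")
    return chat_id, message.get("text")


# Hypothetical event for illustration only.
sample_event = {
    "body": json.dumps({
        "update_id": 1,
        "message": {
            "message_id": 10,
            "chat": {"id": 42, "type": "private"},
            "text": "hello",
        },
    })
}
print(extract_message(sample_event))  # (42, 'hello')
```

In the tutorial itself, this parsing is delegated to telebot.types.Update.de_json, which builds a typed object from the same dictionary.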

Management console
  1. In the management console, select the folder where you want to create an API gateway.

  2. Go to API Gateway.

  3. Click Create API gateway.

  4. In the Name field, specify recognizer-bot-api-gw.

  5. Under Specification, add the following specification:

    openapi: 3.0.0
    info:
      title: Sample API
      version: 1.0.0
    paths:
      /for-recognizer-bot-function:
        post:
          x-yc-apigateway-integration:
            type: cloud_functions
            function_id: <function_ID>
            service_account_id: <service_account_ID>
          operationId: for-recognizer-bot-function
    

    Where:

    • function_id: for-recognizer-bot function ID.
    • service_account_id: recognizer-bot-sa service account ID.
  6. Click Create.

  7. Select the previously created API gateway. Save the Default domain value, as you will need it later.

CLI

  1. Save the following specification to spec.yaml:

    openapi: 3.0.0
    info:
      title: Sample API
      version: 1.0.0
    paths:
      /for-recognizer-bot-function:
        post:
          x-yc-apigateway-integration:
            type: cloud_functions
            function_id: <function_ID>
            service_account_id: <service_account_ID>
          operationId: for-recognizer-bot-function
    

    Where:

    • function_id: for-recognizer-bot function ID.
    • service_account_id: recognizer-bot-sa service account ID.
  2. Run this command:

    yc serverless api-gateway create --name recognizer-bot-api-gw --spec=spec.yaml
    

    Where:

    • --name: API gateway name.
    • --spec: Specification file.

    Result:

    done (5s)
    id: d5d1ud9bli1e********
    folder_id: b1gc1t4cb638********
    created_at: "2023-09-25T16:01:48.926Z"
    name: recognizer-bot-api-gw
    status: ACTIVE
    domain: d5dm1lba80md********.i9******.apigw.yandexcloud.net
    log_group_id: ckgefpleo5eg********
    connectivity: {}
    log_options:
      folder_id: b1gc1t4cb638********
    

Terraform

To create an API gateway:

  1. Specify the yandex_api_gateway resource parameters in the configuration file:

    resource "yandex_api_gateway" "recognizer-bot-api-gw" {
      name        = "recognizer-bot-api-gw"
      spec = <<-EOT
        openapi: 3.0.0
        info:
          title: Sample API
          version: 1.0.0
    
        paths:
          /for-recognizer-bot-function:
            post:
              x-yc-apigateway-integration:
                type: cloud_functions
                function_id: <function_ID>
                service_account_id: <service_account_ID>
              operationId: for-recognizer-bot-function
      EOT
    }
    

    Where:

    • name: API gateway name.
    • spec: API gateway specification.

    For more information about Terraform resource parameters, see this provider guide.

  2. Validate your configuration files.

    1. In the terminal, navigate to the directory where you created your configuration file.

    2. Run a check using the following command:

      terraform plan
      

    If your configuration is correct, the terminal will display a list of the resources to be created and their settings. Otherwise, Terraform will show any detected errors.

  3. Deploy the cloud resources.

    1. If the configuration is correct, run this command:

      terraform apply
      
    2. To confirm resource creation, type yes and press Enter.

API

To create an API gateway, use the create REST API method for the ApiGateway resource or the ApiGatewayService/Create gRPC API call.
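
Whichever method you choose, the specification needs the real function and service account IDs in place of the placeholders. The sketch below fills them in with plain string formatting; the render_spec helper and the example IDs are assumptions for illustration.

```python
SPEC_TEMPLATE = """\
openapi: 3.0.0
info:
  title: Sample API
  version: 1.0.0
paths:
  /for-recognizer-bot-function:
    post:
      x-yc-apigateway-integration:
        type: cloud_functions
        function_id: {function_id}
        service_account_id: {service_account_id}
      operationId: for-recognizer-bot-function
"""


def render_spec(function_id, service_account_id):
    """Return the gateway specification with both IDs substituted."""
    return SPEC_TEMPLATE.format(
        function_id=function_id,
        service_account_id=service_account_id,
    )


print(render_spec("example-function-id", "example-service-account-id"))
```

Writing the returned text to spec.yaml gives a file you can pass to the CLI command shown above.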

Configure a link between the function and the Telegram bot

Set up a webhook for your Telegram bot:

curl --request POST \
  --url 'https://api.telegram.org/bot<bot_token>/setWebhook' \
  --header 'content-type: application/json' \
  --data '{"url": "<API_gateway_domain>/for-recognizer-bot-function"}'

Where:

  • <bot_token>: Telegram bot token.
  • <API_gateway_domain>: Default domain of the recognizer-bot-api-gw API gateway.

Result:

{"ok":true,"result":true,"description":"Webhook was set"}

Test your bot

Chat with the bot:

  1. Open Telegram and find the bot by its username.

  2. Send /start to the chat.

    The bot should respond with:

    The bot can do the following:
    
    * Recognize text in images.
    * Generate voice messages from text.
    * Convert voice messages to text.
    
  3. Send a text message to the chat. The bot will respond with a voice message generated from your text.

  4. Send a voice message to the chat. The bot will respond with a text message transcribed from your speech.

  5. Send an image containing text to the chat. The bot will respond with a message containing the recognized text.

    Note

    The image must meet the Vision OCR requirements for input images.
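
The text in the bot's reply is assembled from the nested blocks, lines, and words in the recognition response, as done in the image_analyze function. The sketch below is a variant of that loop which joins words without trailing spaces. The sample response is hand-written to mirror only the fields image_analyze reads and is not real service output; the annotation_to_text helper is illustrative.

```python
def annotation_to_text(response_json):
    """Join recognized words into text, one output line per recognized line."""
    lines = []
    for block in response_json["result"]["textAnnotation"]["blocks"]:
        for line in block["lines"]:
            lines.append(" ".join(word["text"] for word in line["words"]))
    return "\n".join(lines)


# Hand-written sample for illustration only.
sample = {"result": {"textAnnotation": {"blocks": [
    {"lines": [
        {"words": [{"text": "Hello"}, {"text": "world"}]},
        {"words": [{"text": "again"}]},
    ]},
]}}}
print(annotation_to_text(sample))
```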

How to delete the resources you created

To avoid incurring charges for resources you no longer need, delete them:

  • Delete the API gateway.
  • Delete the function in Cloud Functions.
