© 2025 Direct Cursus Technology L.L.C.

Regular asynchronous recognition of audio files in Object Storage buckets

Written by
Yandex Cloud
Updated on May 7, 2025
  • Getting started
  • Create a cloud function
  • Create a trigger
  • Test the function

The SpeechKit asynchronous recognition API is integrated with Yandex Object Storage, so you can set up automatic recognition of audio files in supported formats that are regularly uploaded to an Object Storage bucket. A cloud function in Yandex Cloud Functions periodically checks the bucket for audio files and sends them to the SpeechKit API for recognition. The recognition status and result are saved to the same Object Storage bucket.

To set up automatic recognition of audio files using SpeechKit:

  1. Create a cloud function to read files from your Object Storage bucket, send them to the API, and check the file recognition status.
  2. Create a trigger to regularly invoke your cloud function.
  3. Test the function.
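
As a rough illustration of the "send them to the API" step, the sketch below only builds a recognition request for one bucket object and does not send it. The endpoint URL, the storage host, and the request body layout are assumptions based on the public SpeechKit API v2 description and should be checked against the API reference; `build_recognition_request` is a hypothetical helper, not part of the script used in this tutorial.

```python
import json

# Assumed endpoint of the SpeechKit asynchronous recognition API v2.
STT_URL = "https://transcribe.api.cloud.yandex.net/speech/stt/v2/longRunningRecognize"
# Assumed public host of Object Storage.
STORAGE_HOST = "https://storage.yandexcloud.net"


def build_recognition_request(bucket, key, lang, api_secret):
    """Return (url, headers, body) for one recognition request without sending it."""
    headers = {"Authorization": f"Api-Key {api_secret}"}
    body = {
        # Assumed body layout: language in config.specification, file link in audio.uri.
        "config": {"specification": {"languageCode": lang}},
        "audio": {"uri": f"{STORAGE_HOST}/{bucket}/{key}"},
    }
    return STT_URL, headers, json.dumps(body)
```

A caller could pass the returned values to an HTTP client, for example `requests.post(url, headers=headers, data=body)`, and keep the operation ID from the response to check the status later.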

Getting started

  1. Create a service account named asr-batch-sa.

  2. Assign the storage.editor, functions.functionInvoker, and ai.speechkit-stt.user roles to the service account for the folder in which it was created.

  3. Create a static access key for the service account.

  4. Create an API key for the service account.

  5. Create an Object Storage bucket named asr-batch-bucket in the service account folder.

  6. Open asr-batch-bucket, click Create folder, and specify input in the Folder name field.

  7. Upload a config.json file that specifies the recognition language to the bucket's input folder. The file contains a single setting:

    {
      "lang": "<language_code>"
    }
    

    Note

    If there is no config.json file in the bucket, the recognition language will be Russian.
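
The fallback described in the note can be sketched as follows. This is a minimal illustration, not the logic of the actual script: `resolve_language` is a hypothetical helper, the `ru-RU` code for Russian is an assumption, and treating a malformed file the same way as a missing one is a design choice made only for this example.

```python
import json

# Assumption: Russian is expressed with the "ru-RU" language code.
DEFAULT_LANG = "ru-RU"


def resolve_language(config_text=None):
    """Return the language from config.json content, or the default if unavailable."""
    if not config_text:
        return DEFAULT_LANG  # no config.json in the bucket
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        return DEFAULT_LANG  # malformed file: fall back instead of failing
    if not isinstance(config, dict):
        return DEFAULT_LANG
    return config.get("lang") or DEFAULT_LANG
```

For example, `resolve_language('{"lang": "en-US"}')` returns the configured code, while `resolve_language(None)` returns the default.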

Create a cloud function

  1. In the management console, navigate to the folder with the new service account.

  2. Select Cloud Functions.

  3. Click Create function and specify asr-batch-function as the function name.

  4. Click Create.

  5. Under Editor, select the Python 3.8 runtime environment and click Continue.

  6. Download a script file from the Yandex Cloud repository.

  7. Under Function code, clear the contents of the index.py file and paste the downloaded script.

  8. Under Function code, create a file named requirements.txt and add the following dependencies to it:

    boto3
    botocore
    requests
    
  9. Specify the function run settings:

    • Entry point: index.handler
    • Timeout: 60 seconds
    • Service account: asr-batch-sa
  10. Add these environment variables:

    • S3_BUCKET: asr-batch-bucket
    • S3_PREFIX: input
    • S3_PREFIX_LOG: log
    • S3_PREFIX_OUT: out
    • S3_KEY: Static access key ID
    • S3_SECRET: Static access key secret
    • API_KEY: API key ID
    • API_SECRET: API key secret
  11. Click Save changes.
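
To show how a function might consume these variables, here is a hedged sketch that reads them and fails early if a required one is missing. The variable names come from the list above; `load_settings`, the split into required and optional variables, and the prefix defaults are assumptions made for illustration, and the downloaded script may handle its settings differently.

```python
import os

# Variables without which the function cannot reach the bucket or the API.
REQUIRED = ("S3_BUCKET", "S3_KEY", "S3_SECRET", "API_KEY", "API_SECRET")
# Assumed defaults that mirror the values used in this tutorial.
DEFAULTS = {"S3_PREFIX": "input", "S3_PREFIX_LOG": "log", "S3_PREFIX_OUT": "out"}


def load_settings(env=os.environ):
    """Collect settings from a mapping of environment variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name) or default
    return settings
```

Passing the mapping as an argument keeps the helper easy to check locally with a plain dictionary before deploying the function.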

Create a trigger

  1. In the management console, select Cloud Functions.
  2. Select Triggers.
  3. Click Create trigger.
  4. Specify the trigger settings:
    • Name: asr-batch-cron.
    • Type: Timer.
    • Launched resource: Function.
    • Cron expression: Every minute.
    • Function: asr-batch-function.
    • Function version tag: $latest.
    • Service account: asr-batch-sa.
  5. Click Create trigger.

The trigger you created will fire once a minute and invoke the cloud function.
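
Because the function runs every minute, each run should skip files that were already sent for recognition. The sketch below shows one way to do that by comparing object keys in the input folder with status files in the log folder. It is an assumption about a reasonable approach, not a description of the downloaded script, and `pending_audio` is a hypothetical helper.

```python
def pending_audio(input_keys, log_keys, input_prefix="input", log_prefix="log"):
    """Return input keys that have no <audio_file_name>.json status file yet."""
    # Names of status files, relative to the log folder.
    submitted = {
        key[len(log_prefix) + 1:]
        for key in log_keys
        if key.startswith(log_prefix + "/")
    }
    pending = []
    for key in input_keys:
        if not key.startswith(input_prefix + "/"):
            continue
        name = key[len(input_prefix) + 1:]
        if not name or name == "config.json":
            continue  # skip the folder marker and the settings file
        if name + ".json" in submitted:
            continue  # already sent for recognition
        pending.append(key)
    return pending
```

The key lists could come from any S3-compatible listing call, such as `list_objects_v2` in boto3.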

Test the function

  1. In the management console, select Object Storage and open asr-batch-bucket.
  2. Upload audio files of any supported format to the input folder.
  3. Wait a few minutes and make sure the bucket now contains the log and out folders.
  4. Check the recognition status in the log folder. The status of each audio file sent for recognition is saved to an auxiliary file named <audio_file_name>.json (e.g., audio.mp3.json). A "done": "false" value in this file means that recognition has not finished yet.
  5. Check the recognition result in the out folder. The result is saved to a JSON file named <audio_file_name>.json (e.g., audio.mp3.json). To learn more about the recognition result format, see Asynchronous recognition API.
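
The checks in steps 4 and 5 can also be scripted. The sketch below derives the status and result keys from an audio file name and interprets the "done" field. The helper names are hypothetical, the folder names are the ones used in this tutorial, and accepting both a string and a boolean for "done" is a precaution added for this example rather than a documented guarantee.

```python
import json


def status_key(audio_name, log_prefix="log"):
    """Key of the auxiliary status file for an audio file."""
    return f"{log_prefix}/{audio_name}.json"


def result_key(audio_name, out_prefix="out"):
    """Key of the recognition result file for an audio file."""
    return f"{out_prefix}/{audio_name}.json"


def is_done(status_text):
    """Return True if the status file content marks recognition as finished."""
    status = json.loads(status_text)
    done = status.get("done", False)
    if isinstance(done, str):
        # The tutorial shows the value as the string "false".
        return done.strip().lower() == "true"
    return bool(done)
```

For audio.mp3, the status file would be expected at `status_key("audio.mp3")` and the result at `result_key("audio.mp3")`.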

Note

You can monitor the progress of the script in the logs of asr-batch-function.

See also

  • Asynchronous recognition API v2
  • Asynchronous recognition of LPCM audio files using the API v2
  • Asynchronous recognition of OggOpus audio files using the API v2
