Regular asynchronous recognition of audio files from Yandex Object Storage
The SpeechKit asynchronous recognition API is integrated with Yandex Object Storage. This enables you to set up automatic recognition of audio files of supported formats that are regularly uploaded to an Object Storage bucket. A cloud function in Yandex Cloud Functions regularly checks the bucket for audio files and sends them to the SpeechKit API for recognition. The recognition result and status are saved to the same Object Storage bucket.
To set up automatic audio file recognition using SpeechKit:
- Create a cloud function to read files from your Object Storage bucket, send them to the API, and check the file recognition status.
- Create a trigger to regularly invoke your cloud function.
- Test the function.
Getting started
- Create a service account named `asr-batch-sa`.
- Assign the service account the `storage.editor`, `functions.functionInvoker`, and `ai.speechkit-stt.user` roles for the folder in which it was created.
- Create a static access key for the service account.
- Create an API key for the service account.
- Create an Object Storage bucket named `asr-batch-bucket` in the service account folder.
- Open `asr-batch-bucket`, click Create folder, and specify `input` in the Folder name field.
- Upload the `config.json` file with the specified recognition language to the bucket's `input` folder; you can also upload it programmatically, as in the sketch after this list. The file contains only one parameter:

  ```json
  {
    "lang": "<language_code>"
  }
  ```

  Note

  If there is no `config.json` file in the bucket, the recognition language defaults to Russian.
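If you prefer not to create `config.json` through the console, the following is a minimal sketch of uploading it with boto3. It assumes the bucket and folder names from this tutorial (`asr-batch-bucket`, `input`), the standard Object Storage endpoint `storage.yandexcloud.net`, and that the static access key is exported in the hypothetical `S3_KEY` and `S3_SECRET` environment variables (the same names the cloud function uses later); the `ru-RU` language code is only an example.

```python
# Minimal sketch: upload input/config.json to the bucket with boto3.
# Assumes the static access key is exported as S3_KEY / S3_SECRET.
import json
import os

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://storage.yandexcloud.net",  # Object Storage endpoint
    aws_access_key_id=os.environ["S3_KEY"],
    aws_secret_access_key=os.environ["S3_SECRET"],
)

config = {"lang": "ru-RU"}  # replace with your recognition language code
s3.put_object(
    Bucket="asr-batch-bucket",
    Key="input/config.json",
    Body=json.dumps(config).encode("utf-8"),
    ContentType="application/json",
)
print("Uploaded input/config.json")
```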
Create a cloud function
- In the management console, navigate to the folder with the new service account.
- Select Cloud Functions.
- Click Create function and specify `asr-batch-function` as the function name.
- Click Create.
- Under Editor, select the `Python 3.8` runtime environment and click Continue.
- Download a script file from the Yandex Cloud repository.
- Under Function code, clear the contents of the `index.py` file and paste the downloaded script.
- Under Function code, create a file named `requirements.txt` and add the following code to it:

  ```
  boto3
  botocore
  requests
  ```

- Specify the function run parameters:
  - Entry point: `index.handler`
  - Timeout, sec: `60`
  - Service account: `asr-batch-sa`
- Add these environment variables (a sketch of how the script may use them follows this list):
  - `S3_BUCKET`: `asr-batch-bucket`
  - `S3_PREFIX`: `input`
  - `S3_PREFIX_LOG`: `log`
  - `S3_PREFIX_OUT`: `out`
  - `S3_KEY`: static access key ID
  - `S3_SECRET`: static access key secret
  - `API_KEY`: API key ID
  - `API_SECRET`: API key secret
- Click Save changes.
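The downloaded script is what actually runs; the sketch below is not that script, only an illustration of the flow implied by the settings above. Under the assumptions of this tutorial, the handler at `index.handler` reads the environment variables, lists audio files under `input/`, submits each one to the SpeechKit v2 asynchronous recognition endpoint, and writes the returned operation JSON to `log/`. Using `API_SECRET` in the `Api-Key` header and the log file naming are assumptions; the real script also tracks already submitted files and checks operation statuses, and for a private bucket it may pass a presigned URL instead of the plain storage URL used here.

```python
# Sketch of a handler matching the entry point and environment variables above.
# This is NOT the downloaded script, just an illustration of the flow.
import json
import os

import boto3
import requests
from botocore.exceptions import ClientError

STT_URL = "https://transcribe.api.cloud.yandex.net/speech/stt/v2/longRunningRecognize"


def handler(event, context):
    bucket = os.environ["S3_BUCKET"]
    prefix = os.environ["S3_PREFIX"]          # input
    log_prefix = os.environ["S3_PREFIX_LOG"]  # log

    s3 = boto3.client(
        "s3",
        endpoint_url="https://storage.yandexcloud.net",
        aws_access_key_id=os.environ["S3_KEY"],
        aws_secret_access_key=os.environ["S3_SECRET"],
    )

    # Recognition language: taken from input/config.json if present, Russian otherwise.
    try:
        cfg = s3.get_object(Bucket=bucket, Key=f"{prefix}/config.json")
        lang = json.loads(cfg["Body"].read())["lang"]
    except ClientError:
        lang = "ru-RU"

    listing = s3.list_objects_v2(Bucket=bucket, Prefix=f"{prefix}/")
    for obj in listing.get("Contents", []):
        key = obj["Key"]
        if key.endswith("config.json") or key.endswith("/"):
            continue  # skip the config file and the folder placeholder

        # Submit the audio file for asynchronous recognition by its storage URI.
        body = {
            "config": {"specification": {"languageCode": lang}},
            "audio": {"uri": f"https://storage.yandexcloud.net/{bucket}/{key}"},
        }
        resp = requests.post(
            STT_URL,
            json=body,
            headers={"Authorization": f"Api-Key {os.environ['API_SECRET']}"},
        )
        resp.raise_for_status()

        # Save the operation JSON so a later run (or you) can poll its status.
        log_key = f"{log_prefix}/{key[len(prefix) + 1:]}.json"
        s3.put_object(Bucket=bucket, Key=log_key, Body=resp.content)

    return {"statusCode": 200}
```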
Create a trigger
- In the management console, select Cloud Functions.
- Select Triggers.
- Click Create trigger.
- Specify the trigger parameters:
  - Name: `asr-batch-cron`
  - Type: `Timer`
  - Launched resource: `Function`
  - Cron expression: `Every minute`
  - Function: `asr-batch-function`
  - Function version tag: `$latest`
  - Service account: `asr-batch-sa`
- Click Create trigger.
The trigger you created will fire once a minute and invoke the cloud function.
Test the function
- In the management console, select Object Storage and open `asr-batch-bucket`.
- Upload audio files in any supported format to the `input` folder.
- Wait a few minutes and make sure the bucket now contains the `log` and `out` folders.
- Check the recognition status in the `log` folder. The status of each audio file sent for recognition is saved to an auxiliary file named `<audio_file_name>.json`, e.g., `audio.mp3.json`. The `"done": "false"` parameter in this file indicates that recognition is not complete yet.
- Check the recognition result in the `out` folder. The result is saved to a JSON file named `<audio_file_name>.json`, e.g., `audio.mp3.json`. For more information about the recognition result format, see Asynchronous recognition API. You can also fetch statuses and results from code, as in the sketch after this list.
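To check statuses and results without opening the console, here is a minimal sketch that lists the `log/` and `out/` objects with boto3 and prints them. It reuses the bucket name and static key variables from this tutorial; the exact JSON layout of the auxiliary files is produced by the script, so the `done` field access is an assumption.

```python
# Sketch: inspect recognition statuses (log/) and results (out/) with boto3.
import json
import os

import boto3

BUCKET = "asr-batch-bucket"

s3 = boto3.client(
    "s3",
    endpoint_url="https://storage.yandexcloud.net",
    aws_access_key_id=os.environ["S3_KEY"],
    aws_secret_access_key=os.environ["S3_SECRET"],
)


def read_json(key):
    """Download an object from the bucket and parse it as JSON."""
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    return json.loads(obj["Body"].read())


# Statuses: one <audio_file_name>.json per submitted file.
for obj in s3.list_objects_v2(Bucket=BUCKET, Prefix="log/").get("Contents", []):
    if obj["Key"].endswith(".json"):
        status = read_json(obj["Key"])
        print(obj["Key"], "done:", status.get("done"))  # stays false until finished

# Results appear under out/ once recognition is complete.
for obj in s3.list_objects_v2(Bucket=BUCKET, Prefix="out/").get("Contents", []):
    if obj["Key"].endswith(".json"):
        result = read_json(obj["Key"])
        print(obj["Key"], json.dumps(result, ensure_ascii=False)[:200])
```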
Note
You can monitor the progress of the script in the logs of `asr-batch-function`.