Regular recognition of audio files from Yandex Object Storage
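This tutorial uses the SpeechKit asynchronous recognition API to recognize audio files stored in a Yandex Object Storage bucket on a regular schedule.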
To set up automatic recognition of audio files using SpeechKit:
- Create a cloud function to read files from your Object Storage bucket, send them to the API, and check the file recognition status.
- Create a trigger to regularly invoke your cloud function.
- Test the function.
Getting started
- Create a service account named asr-batch-sa.
- Assign the storage.editor, functions.functionInvoker, and ai.speechkit-stt.user roles to the service account for the folder in which it was created.
- Create a static access key for the service account.
- Create an API key to access the service account.
- Create an Object Storage bucket named asr-batch-bucket in the service account folder.
- Open asr-batch-bucket, click Create folder, and specify input in the Folder name field.
- Upload the config.json file with the specified recognition language to the bucket's input folder. The file only contains one setting:
  { "lang": "<language_code>" }

Note
If there is no config.json file in the bucket, the recognition language will be Russian.
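The fallback described in the note can be sketched in a few lines of Python. This is an illustration only, not the code of the downloaded script: the function name resolve_language is hypothetical, the default code "ru-RU" is an assumption about how Russian is spelled as a language code, and falling back on a malformed file is a choice made for this sketch.

```python
import json
from typing import Optional

# Assumption: "ru-RU" is the language code used for Russian.
DEFAULT_LANG = "ru-RU"


def resolve_language(config_text: Optional[str]) -> str:
    """Return the "lang" value from config.json text, or the default.

    Pass None when the bucket has no config.json file.
    """
    if not config_text:
        return DEFAULT_LANG
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        # Sketch-level choice: treat an unreadable file like a missing one.
        return DEFAULT_LANG
    if not isinstance(config, dict):
        return DEFAULT_LANG
    lang = config.get("lang")
    return lang if isinstance(lang, str) and lang else DEFAULT_LANG
```

For example, resolve_language('{ "lang": "en-US" }') yields the configured code, while resolve_language(None) yields the default.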
Create a cloud function
- In the management console, navigate to the folder with the new service account.
- Select Cloud Functions.
- Click Create function and specify asr-batch-function as the function name.
- Click Create.
- Under Editor, select the Python 3.8 runtime environment and click Continue.
- Download a script file from the Yandex Cloud repository.
- Under Function code, clear the contents of the index.py file and paste the downloaded script.
- Under Function code, create a file named requirements.txt and add the following code to it:
  boto3
  botocore
  requests
- Specify the function run settings:
  - Entry point: index.handler
  - Timeout: 60
  - Service account: asr-batch-sa
- Add these environment variables:
  - S3_BUCKET: asr-batch-bucket
  - S3_PREFIX: input
  - S3_PREFIX_LOG: log
  - S3_PREFIX_OUT: out
  - S3_KEY: Static access key ID
  - S3_SECRET: Static access key secret
  - API_KEY: API key ID
  - API_SECRET: API key secret
- Click Save changes.
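To show how a function might consume the environment variables listed above, here is a minimal sketch that reads them into a typed object and fails early if any are missing. The Settings class and load_settings function are hypothetical names introduced for illustration; the downloaded script may read its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names as listed in the function settings, in field order.
REQUIRED = (
    "S3_BUCKET", "S3_PREFIX", "S3_PREFIX_LOG", "S3_PREFIX_OUT",
    "S3_KEY", "S3_SECRET", "API_KEY", "API_SECRET",
)


@dataclass(frozen=True)
class Settings:
    bucket: str
    prefix_in: str
    prefix_log: str
    prefix_out: str
    s3_key: str
    s3_secret: str
    api_key: str
    api_secret: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, listing any missing variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(*(env[name] for name in REQUIRED))
```

Checking all variables at once means a misconfigured function reports every missing name in a single log entry instead of failing on the first one.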
Create a trigger
- In the management console, select Cloud Functions.
- Select Triggers.
- Click Create trigger.
- Specify the trigger settings:
  - Name: asr-batch-cron.
  - Type: Timer.
  - Launched resource: Function.
  - Cron expression: Every minute.
  - Function: asr-batch-function.
  - Function version tag: $latest.
  - Service account: asr-batch-sa.
- Click Create trigger.
The trigger you created will fire once a minute and invoke the cloud function.
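For orientation, the entry point index.handler refers to a function named handler in index.py that takes an event and a context. The sketch below shows what such a function might look like when invoked by the timer. It is not the downloaded script: the event layout (a "messages" list whose items carry "details" with a "trigger_id") is an assumption about the timer message format, and the code reads it defensively so it still runs if the layout differs.

```python
from typing import Any, Dict, List


def trigger_ids(event: Dict[str, Any]) -> List[str]:
    """Collect trigger IDs from an event, tolerating missing fields.

    Assumption: each message has the shape {"details": {"trigger_id": ...}}.
    """
    ids = []
    for message in event.get("messages", []):
        trigger_id = message.get("details", {}).get("trigger_id")
        if trigger_id:
            ids.append(trigger_id)
    return ids


def handler(event, context):
    """Hypothetical entry point: log which trigger fired, then do the work."""
    ids = trigger_ids(event or {})
    print("Invoked by triggers:", ids)
    # A real function would list the input folder and submit new files here.
    return {"statusCode": 200, "body": "ok"}
```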
Test the function
- In the management console, select Object Storage and open asr-batch-bucket.
- Upload audio files of any supported format to the input folder.
- Wait a few minutes and make sure the bucket now contains the log and out folders.
- Check the recognition status in the log folder. The status of each audio file sent for recognition is saved to an auxiliary file named <audio_file_name>.json (e.g., audio.mp3.json). The "done": "false" parameter in the file indicates the recognition process is not completed.
- Check the recognition result in the out folder. The result is saved to a JSON file named <audio_file_name>.json (e.g., audio.mp3.json). To learn more about the recognition result format, see Asynchronous recognition API.
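The file naming and the status check described in the steps above can be sketched as follows. Both helper names are hypothetical, and the code assumes only what the steps state: the status and result files are named <audio_file_name>.json, and the status file carries a "done" field. Because the field may appear as the string "false" or as a JSON boolean, the check accepts both forms.

```python
import json
import posixpath


def derived_key(audio_key: str, folder: str) -> str:
    """Map an input key such as input/audio.mp3 to <folder>/audio.mp3.json."""
    return f"{folder}/{posixpath.basename(audio_key)}.json"


def is_done(status_text: str) -> bool:
    """Return True if the status file marks recognition as completed."""
    status = json.loads(status_text)
    done = status.get("done", False)
    if isinstance(done, str):
        # Handle "done": "false" / "true" written as strings.
        return done.strip().lower() == "true"
    return bool(done)
```

For instance, derived_key("input/audio.mp3", "log") gives the key of the status file to download, and is_done can then be applied to its contents.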
Note
You can monitor the progress of the script in the logs of asr-batch-function.