Using Yandex API Gateway to set up speech synthesis in Yandex SpeechKit
Using serverless technology, you can create your own integration with Yandex Cloud services.
This guide shows you how to create a custom integration based on an OpenAPI 3.0 specification.
User speech synthesis requests go to the API gateway, which calls the SpeechKit API through an HTTP integration and returns the synthesized speech to the user.
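To set up SpeechKit speech synthesis using Yandex API Gateway:
- Get your cloud ready (see Getting started).
- Create a service account.
- Create an API gateway.
- Check the result.
If you no longer need the resources you created, delete them.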
Getting started
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one.
If you have an active billing account, you can go to the cloud page to create or select a folder for your infrastructure.
Learn more about clouds and folders.
Required paid resources
The cost of support for the new infrastructure includes:
- Fee for the number of requests to the API gateway and outgoing traffic (see Yandex API Gateway pricing).
- Fee for using SpeechKit (see SpeechKit pricing).
Create a service account
Create a service account named speechkit-sa with the ai.speechkit-tts.user role for the folder where you are creating your infrastructure:
- In the management console, select the folder where you want to create a service account.
- Go to the Service accounts tab.
- Click Create service account.
- Enter the service account name: speechkit-sa.
- Click Add role and select the ai.speechkit-tts.user role.
- Click Create.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
- Create a service account named speechkit-sa:

    yc iam service-account create speechkit-sa

  Result:

    id: nfersamh4sjq********
    folder_id: b1gc1t4cb638********
    created_at: "2023-09-21T10:36:29.726397755Z"
    name: speechkit-sa

  Save the id of the speechkit-sa service account and the ID of the folder where it was created (folder_id).

  For more information about the yc iam service-account create command, see the CLI reference.

- Assign the ai.speechkit-tts.user role for the folder to the service account by specifying the folder and service account IDs you previously saved:

    yc resource-manager folder add-access-binding <folder_ID> \
      --role ai.speechkit-tts.user \
      --subject serviceAccount:<service_account_ID>

  For more information about the yc resource-manager folder add-access-binding command, see the CLI reference.
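If you script these steps, the two IDs can be read from the command output instead of being copied by hand. The following is a minimal sketch, assuming you add the CLI's `--format json` flag to the create command and that the JSON keys mirror the id and folder_id fields shown in the result above. `extract_ids` is a hypothetical helper name, not part of the CLI.

```python
import json


def extract_ids(cli_output: str) -> tuple[str, str]:
    """Return (service_account_id, folder_id) from JSON printed by the CLI.

    Assumes output such as that printed by:
        yc iam service-account create speechkit-sa --format json
    """
    account = json.loads(cli_output)
    return account["id"], account["folder_id"]


# Sample input built from the masked IDs in the result above (assumed JSON shape).
sample = (
    '{"id": "nfersamh4sjq********", '
    '"folder_id": "b1gc1t4cb638********", '
    '"name": "speechkit-sa"}'
)
sa_id, folder_id = extract_ids(sample)
print(f"serviceAccount:{sa_id} in folder {folder_id}")
```

The returned values are the ones to substitute for <service_account_ID> and <folder_ID> in the role assignment command.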
To create a service account, use the create method for the ServiceAccount resource or the ServiceAccountService/Create gRPC API call.
To assign the ai.speechkit-tts.user
role for the folder to the service account, use the setAccessBindings method for the ServiceAccount resource or the ServiceAccountService/SetAccessBindings gRPC API call.
Create an API gateway
- In the management console, select the folder where you want to create an API gateway.
- In the list of services, select API Gateway.
- Click Create API gateway.
- In the Name field, enter speechkit-api-gw.
- In the Specification section, add the following specification and provide the speechkit-sa service account ID in the service_account_id parameter:

    openapi: 3.0.0
    info:
      title: Sample API
      version: 1.0.0
    paths:
      /synthesis:
        post:
          requestBody:
            description: "/synthesis"
            content:
              application/json:
                schema:
                  type: object
                x-yc-schema-mapping:
                  type: static
                  template:
                    text: "${.text}"
                    hints:
                      - voice: "lera"
                      - role: "friendly"
                      - audioTemplate:
                          audio:
                            audioSpec:
                              containerAudio:
                                containerAudioType: "MP3"
          responses:
            200:
              description: "/synthesis"
              content:
                application/json:
                  schema:
                    type: object
                  x-yc-schema-mapping:
                    type: static
                    template:
                      data: "${.result.audioChunk.data}"
          x-yc-apigateway-integration:
            http_method: post
            type: http
            url: https://tts.api.cloud.yandex.net/tts/v3/utteranceSynthesis
            service_account_id: "<service_account_ID>"

- Click Create.
- Wait until the status of the API gateway you just created switches to running, and then click the row with the gateway name.
- In the window that opens, copy the Default domain field value. You will need it later to check how the API gateway works.
- Save the following specification to the speechkit-gw.yaml file and provide the speechkit-sa service account ID in the service_account_id parameter:

    openapi: 3.0.0
    info:
      title: Sample API
      version: 1.0.0
    paths:
      /synthesis:
        post:
          requestBody:
            description: "/synthesis"
            content:
              application/json:
                schema:
                  type: object
                x-yc-schema-mapping:
                  type: static
                  template:
                    text: "${.text}"
                    hints:
                      - voice: "lera"
                      - role: "friendly"
                      - audioTemplate:
                          audio:
                            audioSpec:
                              containerAudio:
                                containerAudioType: "MP3"
          responses:
            200:
              description: "/synthesis"
              content:
                application/json:
                  schema:
                    type: object
                  x-yc-schema-mapping:
                    type: static
                    template:
                      data: "${.result.audioChunk.data}"
          x-yc-apigateway-integration:
            http_method: post
            type: http
            url: https://tts.api.cloud.yandex.net/tts/v3/utteranceSynthesis
            service_account_id: "<service_account_ID>"

- Run this command:

    yc serverless api-gateway create \
      --name speechkit-api-gw \
      --spec=speechkit-gw.yaml

  Where:
  - --name: API gateway name.
  - --spec: Path to the specification file.

  Result:

    done (2s)
    id: d5ddbmungf72********
    folder_id: b1gt6g8ht345********
    created_at: "2024-08-19T18:58:32.101Z"
    name: speechkit-api-gw
    status: ACTIVE
    domain: d5ddbmungf72********.apigw.yandexcloud.net
    connectivity: {}
    log_options:
      folder_id: b1gt6g8ht345********
    execution_timeout: 300s

  Save the service domain (the domain field value) of the API gateway you created. You will need it later to check how the API gateway works.

  For more information about the yc serverless api-gateway create command, see the CLI reference.
To create an API gateway, use the create REST API method for the ApiGateway resource or the ApiGatewayService/Create gRPC API call.
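To make the two x-yc-schema-mapping templates in the specification easier to follow, here is a rough Python model of what they do to the request and response bodies. It is an illustration only, not how API Gateway is implemented: the function names are hypothetical, the nesting of the hints follows the specification as laid out in this guide, and it assumes that `${.text}` and `${.result.audioChunk.data}` are resolved as simple paths into the incoming JSON.

```python
def map_request(client_body: dict) -> dict:
    """Model of the request template: wrap the client's text in a SpeechKit request."""
    return {
        "text": client_body["text"],  # corresponds to "${.text}"
        "hints": [
            {"voice": "lera"},
            {"role": "friendly"},
            {
                "audioTemplate": {
                    "audio": {
                        "audioSpec": {
                            "containerAudio": {"containerAudioType": "MP3"}
                        }
                    }
                }
            },
        ],
    }


def map_response(speechkit_body: dict) -> dict:
    """Model of the response template: keep only the base64-encoded audio chunk."""
    # Corresponds to "${.result.audioChunk.data}".
    return {"data": speechkit_body["result"]["audioChunk"]["data"]}


# Placeholder values for illustration; "AAAA" stands in for real base64 audio.
request = map_request({"text": "Hello!"})
response = map_response({"result": {"audioChunk": {"data": "AAAA"}}})
print(request["text"], response)
```

In this model the client only ever sends a text field and only ever receives a data field; the voice, role, and audio format are fixed in the gateway specification.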
Check the result
Send a request to your API gateway by providing the service domain value you previously saved:
curl -v \
https://<service_domain>/synthesis \
-d '{"text": "Hello! S+erverless Api G+ateway now has a new feature: converting HTTP request or response body!"}' \
| jq -r '.data' | while read chunk; do base64 -d <<< "$chunk" >> audio.mp3; done
Once you run the above command, the synthesized speech will be saved to the audio.mp3 file in the current directory. You can listen to the file you created in your browser, e.g., Yandex Browser.
To learn more about the format of the text provided in the -d parameter, see the Yandex SpeechKit documentation.
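The same check can be run without curl and jq. The sketch below uses only the Python standard library and mirrors the shell pipeline: it posts the text to the /synthesis path and decodes each base64 chunk found in the data field. It assumes the gateway returns one or more JSON objects of the form {"data": "<base64>"}, as the jq and base64 loop above implies; the function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_request(service_domain: str, text: str) -> urllib.request.Request:
    """Build a POST request to the gateway's /synthesis path with the text as JSON."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"https://{service_domain}/synthesis", data=body, method="POST"
    )


def decode_audio(response_text: str) -> bytes:
    """Decode every {"data": "<base64>"} object in the response into audio bytes."""
    decoder = json.JSONDecoder()
    audio = bytearray()
    position = 0
    while position < len(response_text):
        # raw_decode does not accept leading whitespace, so skip it between objects.
        if response_text[position].isspace():
            position += 1
            continue
        chunk, position = decoder.raw_decode(response_text, position)
        audio += base64.b64decode(chunk["data"])
    return bytes(audio)


def synthesize(service_domain: str, text: str, path: str = "audio.mp3") -> None:
    """Send the text to the gateway and save the synthesized speech to a file."""
    with urllib.request.urlopen(build_request(service_domain, text)) as response:
        audio = decode_audio(response.read().decode("utf-8"))
    with open(path, "wb") as output:
        output.write(audio)
```

To try it, call `synthesize("<service_domain>", "Hello!")` with the service domain you saved earlier. Unlike the shell loop, which appends to audio.mp3, this sketch overwrites the file on each run.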
How to delete the resources you created
If you no longer need the resources you created: