Using Yandex API Gateway to set up speech synthesis in Yandex SpeechKit
With serverless technology, you can create your own integration with Yandex Cloud services.
In this tutorial, you will create a custom setup with an OpenAPI 3.0
The users' speech synthesis requests run through the API gateway that uses HTTP integration to call the SpeechKit API and retrieve the synthesized speech from SpeechKit.
To set up SpeechKit speech synthesis using Yandex API Gateway:
- Get started.
- Create a service account.
- Create an API gateway.
- Check the result.
If you no longer need the resources you created, delete them.
Getting started
Sign up in Yandex Cloud and create a billing account:
- Navigate to the management console and log in to Yandex Cloud or register a new account.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.
If you have an active billing account, you can navigate to the cloud page
Learn more about clouds and folders.
Required paid resources
The new infrastructure support cost includes:
- Fee for the number of requests to the API gateway and outbound traffic (see Yandex API Gateway pricing).
- Fee for using SpeechKit (see SpeechKit pricing).
Create a service account
Create a service account named speechkit-sa with the ai.speechkit-tts.user role for the folder where you are creating your infrastructure:
- In the management console, select the folder where you want to create a service account.
- In the list of services, select Identity and Access Management.
- Click Create service account.
- Enter the service account name: speechkit-sa.
- Click Add role and select ai.speechkit-tts.user.
- Click Create.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
- Create a service account named speechkit-sa:

    yc iam service-account create speechkit-sa

  Result:

    id: nfersamh4sjq********
    folder_id: b1gc1t4cb638********
    created_at: "2023-09-21T10:36:29.726397755Z"
    name: speechkit-sa

  Save the ID of the speechkit-sa service account (id) and the ID of the folder where you created it (folder_id).

  For more information about the yc iam service-account create command, see the CLI reference.

- Assign the ai.speechkit-tts.user role for the folder to the service account by specifying the folder and service account IDs you previously saved:

    yc resource-manager folder add-access-binding <folder_ID> \
      --role ai.speechkit-tts.user \
      --subject serviceAccount:<service_account_ID>

  For more information about the yc resource-manager folder add-access-binding command, see the CLI reference.
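The two CLI steps above can be chained so that the IDs do not have to be copied by hand. The following is a minimal sketch, not an official script: it assumes the yc CLI is installed and initialized, and that yc prints the YAML-style output shown above. The function names yaml_field and create_tts_service_account are hypothetical helpers introduced for this example.

```shell
# Sketch: create the service account and grant it the role in one call.
# Assumption: yc prints top-level "id:" and "folder_id:" lines, as in the
# sample result above.

# Print the value of a top-level field from YAML-style text on stdin.
yaml_field() {
  # $1: field name, e.g. id or folder_id
  awk -v key="$1:" '$1 == key && !/^[ \t]/ { print $2; exit }'
}

# Create a service account, assign ai.speechkit-tts.user for its folder,
# and print the new service account ID.
create_tts_service_account() {
  local name="$1" out sa_id folder_id
  out="$(yc iam service-account create "$name")" || return 1
  sa_id="$(printf '%s\n' "$out" | yaml_field id)"
  folder_id="$(printf '%s\n' "$out" | yaml_field folder_id)"
  # Send the command's own output to stderr so stdout carries only the ID.
  yc resource-manager folder add-access-binding "$folder_id" \
    --role ai.speechkit-tts.user \
    --subject "serviceAccount:$sa_id" >&2 || return 1
  printf '%s\n' "$sa_id"
}

# Usage (not run here; needs an initialized yc profile):
#   SA_ID="$(create_tts_service_account speechkit-sa)"
```

Keeping the ID in a shell variable such as SA_ID makes it easy to reuse when you fill in the service_account_id parameter of the API gateway specification.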
To create a service account, use the create method for the ServiceAccount resource or the ServiceAccountService/Create gRPC API call.
To assign the ai.speechkit-tts.user role for a folder to a service account, use the setAccessBindings method for the ServiceAccount resource or the ServiceAccountService/SetAccessBindings gRPC API call.
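As an illustration of the REST variant, the sketch below builds the request body for the ServiceAccount create method and shows, in comments, how it might be sent with curl. The endpoint URL, the folderId and name body fields, and the use of yc iam create-token to obtain an IAM token are stated here as assumptions, so check them against the API reference. The helper name sa_create_body is hypothetical.

```shell
# Sketch: JSON body for the ServiceAccount create REST method.
# Assumption: the folder ID and name contain no characters that need
# JSON escaping (quotes, backslashes).
sa_create_body() {
  # $1: folder ID, $2: service account name
  printf '{"folderId": "%s", "name": "%s"}\n' "$1" "$2"
}

# Usage (not run here; needs network access and a valid IAM token):
#   curl --request POST \
#     --header "Authorization: Bearer $(yc iam create-token)" \
#     --header "Content-Type: application/json" \
#     --data "$(sa_create_body <folder_ID> speechkit-sa)" \
#     https://iam.api.cloud.yandex.net/iam/v1/serviceAccounts
```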
Create an API gateway
- In the management console, select the folder where you want to create an API gateway.
- In the list of services, select API Gateway.
- Click Create API gateway.
- In the Name field, enter speechkit-api-gw.
- Under Specification, add the following specification and provide the speechkit-sa service account ID in the service_account_id parameter:

    openapi: 3.0.0
    info:
      title: Sample API
      version: 1.0.0
    paths:
      /synthesis:
        post:
          requestBody:
            description: "/synthesis"
            content:
              application/json:
                schema:
                  type: object
                x-yc-schema-mapping:
                  type: static
                  template:
                    text: "${.text}"
                    hints:
                      - voice: "lera"
                      - role: "friendly"
                      - audioTemplate:
                          audio:
                            audioSpec:
                              containerAudio:
                                containerAudioType: "MP3"
          responses:
            200:
              description: "/synthesis"
              content:
                application/json:
                  schema:
                    type: object
                  x-yc-schema-mapping:
                    type: static
                    template:
                      data: "${.result.audioChunk.data}"
          x-yc-apigateway-integration:
            http_method: post
            type: http
            url: https://tts.api.cloud.yandex.net/tts/v3/utteranceSynthesis
            service_account_id: "<service_account_ID>"

- Click Create.
- Wait until the status of the API gateway you just created switches to running, and then click the row with the gateway name.
- In the window that opens, copy the Default domain field value. You will need it later to test the API gateway.
- Save the following specification to speechkit-gw.yaml and provide the speechkit-sa service account ID in the service_account_id parameter:

    openapi: 3.0.0
    info:
      title: Sample API
      version: 1.0.0
    paths:
      /synthesis:
        post:
          requestBody:
            description: "/synthesis"
            content:
              application/json:
                schema:
                  type: object
                x-yc-schema-mapping:
                  type: static
                  template:
                    text: "${.text}"
                    hints:
                      - voice: "lera"
                      - role: "friendly"
                      - audioTemplate:
                          audio:
                            audioSpec:
                              containerAudio:
                                containerAudioType: "MP3"
          responses:
            200:
              description: "/synthesis"
              content:
                application/json:
                  schema:
                    type: object
                  x-yc-schema-mapping:
                    type: static
                    template:
                      data: "${.result.audioChunk.data}"
          x-yc-apigateway-integration:
            http_method: post
            type: http
            url: https://tts.api.cloud.yandex.net/tts/v3/utteranceSynthesis
            service_account_id: "<service_account_ID>"

- Run this command:

    yc serverless api-gateway create \
      --name speechkit-api-gw \
      --spec=speechkit-gw.yaml

  Where:
    --name: API gateway name.
    --spec: Path to the specification file.

  Result:

    done (2s)
    id: d5ddbmungf72********
    folder_id: b1gt6g8ht345********
    created_at: "2024-08-19T18:58:32.101Z"
    name: speechkit-api-gw
    status: ACTIVE
    domain: d5dm1lba80md********.i9******.apigw.yandexcloud.net
    connectivity: {}
    log_options:
      folder_id: b1gt6g8ht345********
    execution_timeout: 300s

  Save the service domain (the domain field value) of the API gateway you created. You will need it later to test the API gateway.

  For more information about the yc serverless api-gateway create command, see the CLI reference.
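If you keep the <service_account_ID> placeholder in speechkit-gw.yaml as a template, the substitution can be scripted instead of edited by hand. The sketch below is one possible approach under that assumption. The function name render_spec and the output file name speechkit-gw.rendered.yaml are hypothetical.

```shell
# Sketch: fill in the <service_account_ID> placeholder of a spec template.
# Assumption: the service account ID is alphanumeric, so it is safe to use
# as a sed replacement string without escaping.
render_spec() {
  # $1: template path, $2: service account ID, $3: output path
  sed "s/<service_account_ID>/$2/g" "$1" > "$3"
}

# Usage (not run here; needs an initialized yc profile):
#   render_spec speechkit-gw.yaml "$SA_ID" speechkit-gw.rendered.yaml
#   yc serverless api-gateway create \
#     --name speechkit-api-gw \
#     --spec=speechkit-gw.rendered.yaml
```

Writing to a separate output file leaves the template untouched, so the same file can be reused with another service account.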
To create an API gateway, use the create REST API method for the ApiGateway resource or the ApiGatewayService/Create gRPC API call.
Check the result
Send a request to your API gateway, providing the service domain value you previously saved:
curl --verbose \
https://<service_domain>/synthesis \
--data '{"text": "Hi! S+erverless Api G+ateway now has a new feature: converting HTTP request or response body!"}' \
| jq -r '.data' | while read chunk; do base64 -d <<< "$chunk" >> audio.mp3; done
After you run the above command, the system will save the synthesized speech to the audio.mp3 file in the current directory. You can listen to the output file in your browser, e.g., Yandex Browser.
To learn more about the format of the text provided in the --data parameter, see this Yandex SpeechKit article.
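The decoding loop in the command above appends to audio.mp3, so running it a second time adds new audio to the end of the old file. A small helper can truncate the file first and stop on a chunk that fails to decode. This is a sketch under the same assumptions as the original command (one Base64-encoded chunk per line, base64 -d available). The function name decode_chunks is hypothetical.

```shell
# Sketch: decode Base64 chunks (one per line on stdin) into a single file.
decode_chunks() {
  # $1: output path. Truncate first so reruns do not append to stale audio.
  : > "$1"
  while IFS= read -r chunk; do
    # Skip empty lines rather than passing them to base64.
    [ -n "$chunk" ] || continue
    printf '%s' "$chunk" | base64 -d >> "$1" || return 1
  done
}

# Usage (not run here; needs network access and the gateway domain):
#   curl --verbose \
#     https://<service_domain>/synthesis \
#     --data '{"text": "Hi!"}' \
#     | jq -r '.data' | decode_chunks audio.mp3
```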
How to delete the resources you created
If you no longer need the resources you created: