Speech synthesis in the REST API v3
Updated at October 24, 2024
You can use the REST API v3 in SpeechKit to synthesize speech if you do not need the benefits of the gRPC API.
The example uses the following synthesis parameters:
- Voice: marina
- Role: friendly
- Audio format: WAV (default)
Authentication is performed with an IAM token for a Yandex account or a federated account. The request must also include the ID of a folder in which the user has the ai.speechkit-tts.user role or higher. If you use a service account, you do not need to include the folder ID in the request. To learn more about SpeechKit API authentication, see Authentication with the SpeechKit API.
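The header rules above can be sketched as a small shell helper. This is only an illustration: the function name speechkit_headers is hypothetical and not part of SpeechKit. It prints the Authorization header for any IAM token and adds x-folder-id only when a folder ID is passed, which covers the service account case where the folder ID is omitted.

```shell
# Hypothetical helper: print the HTTP headers for a SpeechKit request, one per line.
# Usage: speechkit_headers <IAM_token> [<folder_ID>]
speechkit_headers() {
  token=$1
  folder_id=${2:-}
  printf 'Authorization: Bearer %s\n' "$token"
  # Send x-folder-id only for a user or federated account.
  if [ -n "$folder_id" ]; then
    printf 'x-folder-id: %s\n' "$folder_id"
  fi
}
```

One way to use it is to save the output to a file (the name headers.txt is an assumption) with `speechkit_headers "$IAM_TOKEN" "$FOLDER_ID" > headers.txt` and pass that file to curl as `--header @headers.txt`, which curl supports starting with version 7.55.0.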
cURL
To reproduce this example, you will need the jq utility.
- Create the request.json file with the following request parameters:
{"text": "Hello! I'm Yandex Speech+Kit. I can turn any text into speech. Now y+ou can, too!", "hints": [{"voice": "marina"}, {"role": "friendly"}]}
Where:
- text: Text to synthesize
- hints: List of synthesis parameters:
  - voice: Voice for synthesis
  - role: Role
- Get the folder ID and the IAM token for the account you will use with SpeechKit, and include them in the request headers.
export FOLDER_ID=<folder_ID>
export IAM_TOKEN=<IAM_token>
curl \
--header "Authorization: Bearer $IAM_TOKEN" \
--header "x-folder-id: $FOLDER_ID" \
--data @request.json https://tts.api.cloud.yandex.net:443/tts/v3/utteranceSynthesis | \
jq -r '.result.audioChunk.data' | \
while read chunk; do base64 -d <<< "$chunk" >> audio.wav; done
Where:
- FOLDER_ID: ID of the folder for which your account has the ai.speechkit-tts.user role or higher.
- IAM_TOKEN: IAM token of your Yandex account or federated account.
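The request body from the first step can also be generated instead of written by hand, which helps when the text contains quotes. The following is a minimal sketch that relies on jq, which this example already requires. The function name make_tts_request is hypothetical; the field names are the ones shown in request.json above.

```shell
# Hypothetical helper: print a request body for the given text, voice, and role.
# jq --arg passes each value as a JSON string, so quotes in the text are escaped.
# Usage: make_tts_request <text> <voice> <role>
make_tts_request() {
  jq -n --arg text "$1" --arg voice "$2" --arg role "$3" \
    '{text: $text, hints: [{voice: $voice}, {role: $role}]}'
}
```

For example, `make_tts_request "Hello!" marina friendly > request.json` writes a body with the same structure as the file created in the first step.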
The response contains the synthesized speech as Base64-encoded chunks; the command decodes them and saves the audio to a file named audio.wav.
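If the request fails, for example because of an expired token, audio.wav may be empty or may not contain audio. A quick sanity check is sketched below. It assumes the service returns a standard RIFF/WAVE container, which is an assumption about the output rather than something this page states, and the function name is_wav is hypothetical.

```shell
# Hypothetical helper: succeed if the file starts with a RIFF/WAVE header.
# Bytes 0-3 of a WAV file are "RIFF" and bytes 8-11 are "WAVE".
is_wav() {
  [ "$(head -c 4 "$1")" = "RIFF" ] &&
    [ "$(head -c 12 "$1" | tail -c 4)" = "WAVE" ]
}
```

Running `is_wav audio.wav && echo "audio.wav looks like a WAV file"` after the synthesis command confirms only the container header, not the audio content.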