Speech synthesis from SSML text using API v1
Updated at October 24, 2024
With API v1, you can synthesize speech from text marked up with SSML and save it to an OggOpus file.
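The example uses the following synthesis parameters:
- Language: Russian (lang=ru-RU).
- Voice: jane.
- All other parameters keep their default values.

The text file is read using the cat utility.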
A Yandex account or federated account is authenticated using an IAM token. If you use a service account, you do not need to include the folder ID in the request. To learn more about SpeechKit API authentication, see Authentication with the SpeechKit API.
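The folder ID rule above can be sketched as a small shell helper. This is a minimal sketch, not part of the SpeechKit tooling: the function name build_tts_data is hypothetical, and it assumes the folder ID, when needed, is stored in the FOLDER_ID environment variable. It prints the form fields for the request and appends folderId only when FOLDER_ID is set.

```shell
# Hypothetical helper (not part of SpeechKit): prints the form fields
# for a synthesis request. folderId is appended only when FOLDER_ID is
# set, i.e., for a Yandex account or federated account; with a service
# account, leave FOLDER_ID unset or empty.
build_tts_data() {
  lang="$1"
  voice="$2"
  data="lang=${lang}&voice=${voice}"
  if [ -n "${FOLDER_ID:-}" ]; then
    data="${data}&folderId=${FOLDER_ID}"
  fi
  printf '%s\n' "$data"
}
```

The printed string is meant to be passed to the --data option of curl, for example --data "$(build_tts_data ru-RU jane)".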
Bash
- Create a file, e.g., text.xml, and add the following text in SSML format to it:

  <speak>
    Here are some examples of how you can use SSML.
    You can add a custom pause to your text:<break time="2s"/> Ta-daaah!
    Or mark up your text into paragraphs and sentences. Pauses between paragraphs are longer.
    <p><s>Sentence one</s><s>Sentence two</s></p>
    You can also substitute phrases. For example, you can use this feature to pronounce abbreviations, <sub alias="et cetera">etc.</sub>
  </speak>
- Send a request with the text to the server:

  export FOLDER_ID=<folder_ID>
  export IAM_TOKEN=<IAM_token>
  curl \
    --request POST \
    --header "Authorization: Bearer ${IAM_TOKEN}" \
    --data-urlencode "ssml=`cat text.xml`" \
    --data "lang=ru-RU&voice=jane&folderId=${FOLDER_ID}" \
    "https://tts.api.cloud.yandex.net/speech/v1/tts:synthesize" > speech.ogg
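Where:
- FOLDER_ID: ID of the folder, passed in the folderId field.
- IAM_TOKEN: IAM token used to authenticate the request.
- ssml: Text to synthesize in SSML format, read from text.xml.
- lang: Language of the text.
- voice: Voice used for speech synthesis.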
The synthesized speech will be written to the speech.ogg file in the directory you sent your request from.
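Because the curl output is redirected straight into speech.ogg, whatever the server returns ends up in that file, including an error message if the request fails. A minimal sketch of a sanity check follows; the function name is_ogg_file is hypothetical. It relies only on the fact that every Ogg stream begins with the four-byte capture pattern OggS, and it does not verify that the audio is valid Opus.

```shell
# Hypothetical helper: succeeds if the file starts with the Ogg
# capture pattern "OggS". This is a quick sanity check only; it does
# not decode or validate the audio stream.
is_ogg_file() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "OggS" ]
}
```

For example, running is_ogg_file speech.ogg || cat speech.ogg prints the contents of the file when it does not look like Ogg data, which is usually the quickest way to see what the server actually returned.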