Speech synthesis using the Python SDK
Below is an example of speech synthesis from text in TTS markup into a WAV file.
The other speech synthesis settings keep their default values.
To use the Python SDK, the yandex-speechkit package is required.
Authentication is performed under a service account using an API key or IAM token. Learn more about authentication in the SpeechKit API.
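The application code on this page passes the API key to the SDK as a string literal. As a minimal sketch, the key can instead be read from an environment variable so that it does not end up in the source file. The variable name SPEECHKIT_API_KEY and the helper names load_api_key and configure_from_env are assumptions made for this illustration; only configure_credentials and creds.YandexCredentials(api_key=...) come from the example on this page.

```python
import os


def load_api_key(env=os.environ, var_name="SPEECHKIT_API_KEY"):
    """Return the API key stored in an environment variable.

    SPEECHKIT_API_KEY is a hypothetical name chosen for this sketch,
    not a variable that the SDK reads on its own.
    """
    api_key = env.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to the service account API key"
        )
    return api_key


def configure_from_env():
    """Pass the key from the environment to the SDK."""
    # Imported lazily so that load_api_key() works without the SDK installed.
    from speechkit import configure_credentials, creds

    configure_credentials(
        yandex_credentials=creds.YandexCredentials(api_key=load_api_key())
    )
```

With this sketch, calling configure_from_env() once at startup would take the place of the configure_credentials call that contains the literal key.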
Getting started
- Create a service account and assign the ai.speechkit-tts.user role to it.
- Get an API key for the service account and save it.
Create an application for speech synthesis
- Install the yandex-speechkit package using the pip package manager:

      pip install yandex-speechkit

  The installation was tested on Python 3.9. For the minimum allowed Python version, see the SDK website.

  If a grpcio-tools package version conflict occurs, see Resolving version conflicts during the installation of Python SDK.

- Create a file named test.py and add the following code to it:

      from argparse import ArgumentParser

      from speechkit import model_repository, configure_credentials, creds

      # Authentication with an API key.
      configure_credentials(
          yandex_credentials=creds.YandexCredentials(
              api_key='<API_key>'
          )
      )


      def synthesize(text, export_path):
          model = model_repository.synthesis_model()

          # Specify the synthesis settings.
          model.voice = 'jane'
          model.role = 'good'

          # Performing speech synthesis and creating the output audio file.
          result = model.synthesize(text, raw_format=False)
          result.export(export_path, 'wav')


      if __name__ == '__main__':
          parser = ArgumentParser()
          parser.add_argument('--text', type=str, help='text to synthesize', required=True)
          parser.add_argument('--export', type=str, help='export path for synthesized audio', required=False)

          args = parser.parse_args()

          synthesize(args.text, args.export)
  Where:

  - api_key: Service account API key.
  - voice: Voice for speech synthesis.
  - role: Role for the specified voice.
  - text: Text for synthesis in TTS markup.
  - export_path: Path to the file to save the audio to.
- Enter the text to be converted into speech:

      export TEXT='I'\''m Yandex Speech+Kit. I can turn any text into speech. Now y+ou can, too!'
- Run the created file:

      python3 test.py --text "${TEXT}" --export speech.wav
  Where:

  - --text: Text for synthesis in TTS markup.
  - --export: Path to the file to save the audio to.
  This will create the speech.wav file with synthesized speech.
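To confirm that the run produced a valid audio file, the result can be inspected with Python's standard wave module. The sketch below is not part of the SDK: describe_wav is a hypothetical helper name, and it assumes the exported file is uncompressed PCM WAV, which is the only variant the wave module reads.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file.

    Hypothetical helper for this sketch; it relies only on the
    standard library and does not call the SpeechKit SDK.
    """
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": sample_rate,
            # Duration is the number of frames divided by frames per second.
            "duration_s": frame_count / sample_rate,
        }
```

Calling describe_wav('speech.wav') after the script has run would return the channel count, sample width, sample rate, and duration of the synthesized audio. A duration of zero or a wave.Error exception would suggest that the export did not complete as expected.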