© 2025 Direct Cursus Technology L.L.C.

Text markup for speech synthesis

Written by
Yandex Cloud
Updated on April 11, 2025

You can control pronunciation during speech synthesis by marking up the text you want to synthesize. Yandex SpeechKit provides full markup support only for texts in Russian. For other languages, only some pronunciation control features are supported.

Warning

When using pattern-based synthesis, the markup outside the variable part is ignored.

For Russian and Kazakh, Yandex SpeechKit supports the synthesis of normalized text:

  • Abbreviations do not need to be represented phonetically.
  • You can write numbers as Arabic numerals. During speech synthesis, they are converted to words and pronounced accordingly.

Note

SpeechKit is designed for natural speech synthesis. Marking up text for speech synthesis helps you adjust the pronunciation of individual words, phrases, and sentences. However, it is not intended for generating isolated sounds or silence.

Markup in the text serves as a cue for synthesis, not as a direct instruction.

SpeechKit supports two markup formats:

  • TTS: For API v1 and API v3.
  • SSML: For API v1 only.
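
The compatibility between markup formats and API versions listed above can be sketched as a small lookup. This is only an illustration: the names `SUPPORTED_MARKUP`, `markup_formats_for`, and `check_markup` are hypothetical helpers, not part of the SpeechKit API or SDK, and the version labels "v1" and "v3" are simply the ones used on this page.

```python
# Hypothetical helper: encodes which markup format works with which API
# version, as stated in the list above. Not part of any SpeechKit SDK.
SUPPORTED_MARKUP = {
    "TTS": {"v1", "v3"},   # TTS markup: API v1 and API v3
    "SSML": {"v1"},        # SSML markup: API v1 only
}


def markup_formats_for(api_version: str) -> list[str]:
    """Return the markup formats usable with the given API version, sorted by name."""
    return sorted(
        fmt for fmt, versions in SUPPORTED_MARKUP.items()
        if api_version in versions
    )


def check_markup(api_version: str, markup_format: str) -> None:
    """Raise ValueError if the markup format is not listed for the API version."""
    if api_version not in SUPPORTED_MARKUP.get(markup_format, set()):
        raise ValueError(
            f"{markup_format} markup is not listed as supported for API {api_version}"
        )


if __name__ == "__main__":
    print(markup_formats_for("v1"))
    print(markup_formats_for("v3"))
    check_markup("v3", "TTS")  # passes silently
```

A check like this could run before a request is built, so that text with SSML markup is not sent to API v3 by mistake.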
