Supported audio formats
SpeechKit can recognize and synthesize audio in the following formats:
- LPCM
- OggOpus
- MP3
LPCM
Linear pulse-code modulation
Audio in this format has the following characteristics:
- Sampling frequency:

  API version                  Acceptable values
  Speech synthesis API v1      8, 16, or 48 kHz
  Speech synthesis API v3      Any value between 8 and 48 kHz
  Speech recognition API v2    8, 16, or 48 kHz
  Speech recognition API v3    8, 16, or 48 kHz

- Bit depth: 16 bit.
- Byte order: Reversed (little-endian).
- Audio data is stored as signed integers.
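The LPCM parameters above can be illustrated with a short Python sketch. It packs floating-point samples into raw 16-bit signed little-endian bytes and checks a sampling frequency against the fixed values listed for synthesis API v1 and recognition API v2 and v3. The function names and the test tone are assumptions made for illustration; they are not part of SpeechKit.

```python
import math
import struct

# Fixed rates listed above for synthesis API v1 and recognition API v2/v3.
FIXED_RATES_HZ = (8000, 16000, 48000)


def is_fixed_rate(sample_rate_hz: int) -> bool:
    """Return True if the rate is one of the fixed values 8, 16, or 48 kHz."""
    return sample_rate_hz in FIXED_RATES_HZ


def floats_to_lpcm(samples) -> bytes:
    """Pack floats in [-1.0, 1.0] as 16-bit signed little-endian integers."""
    out = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip out-of-range input
        # "<h" means little-endian, signed 16-bit integer.
        out += struct.pack("<h", int(round(s * 32767)))
    return bytes(out)


def sine_tone(freq_hz: float, duration_s: float, sample_rate_hz: int) -> bytes:
    """Generate a hypothetical test tone as raw LPCM bytes (no header)."""
    n = int(duration_s * sample_rate_hz)
    return floats_to_lpcm(
        0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(n)
    )
```

Each sample occupies two bytes, so half a second of audio at 16 kHz produced by `sine_tone(440, 0.5, 16000)` is 16,000 bytes long.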
OggOpus
SpeechKit recognizes and synthesizes OggOpus audio with no restrictions on file quality or headers.
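Before sending a file, it can be useful to confirm that it really is OggOpus. The sketch below is a heuristic based on the Ogg and Opus container specifications rather than on anything SpeechKit exposes: it looks for the `OggS` capture pattern that starts every Ogg page and the `OpusHead` magic that starts the first Opus packet. The function name is hypothetical.

```python
def looks_like_ogg_opus(header: bytes) -> bool:
    """Heuristic check on the first bytes of a file or stream."""
    # Every Ogg page starts with the capture pattern "OggS".
    if not header.startswith(b"OggS"):
        return False
    # The first packet of an Opus stream begins with "OpusHead";
    # it normally sits right after the first page header.
    return b"OpusHead" in header[:64]
```

A caller would typically pass the first 64 bytes of the file, for example `looks_like_ogg_opus(open(path, "rb").read(64))`, where `path` is whatever file is being checked.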
MP3
SpeechKit recognizes MP3 audio with no restrictions on file quality or headers.
Warning
MP3 is not supported by API v1 synchronous recognition or API v2 streaming recognition.
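A client could guard against this restriction before choosing a recognition mode. The following sketch is an assumption-laden illustration: the labels in `MP3_UNSUPPORTED` are made-up strings, not SpeechKit identifiers, and the table encodes only the two exclusions stated in the warning, so it says nothing about other combinations. The MP3 check itself relies on general properties of the format (an `ID3` tag or an MPEG frame sync at the start of the data).

```python
# Combinations named in the warning as not supporting MP3.
# The labels are illustrative, not SpeechKit identifiers.
MP3_UNSUPPORTED = {
    ("v1", "synchronous recognition"),
    ("v2", "streaming recognition"),
}


def looks_like_mp3(header: bytes) -> bool:
    """Heuristic: ID3v2 tag, or an MPEG audio frame sync (11 set bits)."""
    if header.startswith(b"ID3"):
        return True
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0


def mp3_known_unsupported(api_version: str, mode: str) -> bool:
    """Return True only for the combinations listed in the warning."""
    return (api_version, mode) in MP3_UNSUPPORTED
```

For example, if `looks_like_mp3(data[:4])` is true and `mp3_known_unsupported("v2", "streaming recognition")` is also true, the caller would need to convert the audio to LPCM or OggOpus, or use a different recognition mode.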