How to improve the accuracy of translations
To increase the accuracy of translations:
- Specify the source language. Some words are written the same in different languages, but have different meanings. If the model detects the wrong source language, these words are translated incorrectly.
- Specify your translation glossary. A word can be translated in different ways. For example, the English word oil can be translated into Russian as масло or нефть. You can use a glossary to indicate the proper translation of a word or phrase. You can learn more about glossaries here.
Getting started
To use the examples, install cURL.
The examples below are intended to be run on macOS and Linux. To run them on Windows, see how to work with Bash in Microsoft Windows.
The Translate API requires you to send your authentication credentials in each request. The authentication method depends on the type of account used to send your request:
Yandex account or federated account:
- Get an IAM token to authenticate your Yandex account or federated account. Pass the token in the Authorization header of each request in the following format: Authorization: Bearer <IAM token>
- Get the ID of any folder for which your account has the ai.translate.user role or higher. Make sure to include your folder ID in the folderId field in the body of each request.
Service account:
- Choose one of the authentication methods:
  - Get an IAM token. Include the IAM token in the Authorization header in the following format: Authorization: Bearer <IAM token>
  - Create an API key. Include the API key in the Authorization header in the following format: Authorization: Api-Key <API key>
- Assign the service account the ai.translate.user role or higher for the folder where it was created. Do not specify the folder ID in your requests, as the service uses the folder in which the service account was created.
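The two header formats listed above can be captured in a small helper. The following is a minimal Python sketch, not part of the Translate API: the function name build_auth_header and its parameters are hypothetical, and only the Bearer and Api-Key header formats are taken from the text above.

```python
from typing import Optional


def build_auth_header(iam_token: Optional[str] = None,
                      api_key: Optional[str] = None) -> dict:
    """Return the Authorization header for exactly one credential type.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Exactly one credential must be supplied; reject both-or-neither.
    if (iam_token is None) == (api_key is None):
        raise ValueError("Pass exactly one of iam_token or api_key")
    if iam_token is not None:
        # IAM token: user accounts, federated accounts, or service accounts.
        return {"Authorization": f"Bearer {iam_token}"}
    # API key: service accounts only, per the list above.
    return {"Authorization": f"Api-Key {api_key}"}
```

Keeping the choice in one place means the rest of a client does not need to know which credential type is in use.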
Specify the source language
Words are sometimes written the same in different languages but translated differently. For example, the word angel means a spiritual being in English, while in German it means a fishing rod. If the text you provide contains such words, Translate may detect the wrong source language.
To avoid mistakes, specify the source language in the sourceLanguageCode field:
{
  "folderId": "<folder_ID>",
  "texts": ["angel"],
  "targetLanguageCode": "ru",
  "sourceLanguageCode": "de"
}
Where:
- folderId: Folder ID received before starting.
- texts: Text to translate as a list of strings.
- targetLanguageCode: Target language. You can get the language code together with a list of supported languages.
- sourceLanguageCode: Source language.
Save the request body to a file (for example, body.json) and submit the file using the translate method:
export IAM_TOKEN=<IAM_token>
curl \
--request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${IAM_TOKEN}" \
--data '@<path_to_JSON_file>' \
"https://translate.api.cloud.yandex.net/translate/v2/translate"
Where IAM_TOKEN is the IAM token received before starting.
This returns a translation from the correct language:
{
  "translations": [
    {
      "text": "fishing rod"
    }
  ]
}
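For readers who prefer a script to cURL, the same request can be assembled with the Python standard library. This is a minimal sketch under stated assumptions: the function name build_translate_request and its parameters are hypothetical, while the endpoint URL, headers, and body fields are the ones used in the example above. The sketch only builds the request; the commented lines at the end show how it might be sent.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
TRANSLATE_URL = "https://translate.api.cloud.yandex.net/translate/v2/translate"


def build_translate_request(iam_token, folder_id, texts, target, source=None):
    """Build (but do not send) a POST request for the translate method.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = {
        "folderId": folder_id,
        "texts": texts,
        "targetLanguageCode": target,
    }
    # Only include the source language when the caller specifies it.
    if source is not None:
        body["sourceLanguageCode"] = source
    return urllib.request.Request(
        TRANSLATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {iam_token}",
        },
        method="POST",
    )


request = build_translate_request(
    "<IAM_token>", "<folder_ID>", ["angel"], "ru", source="de"
)

# Sending requires valid credentials and network access:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["translations"][0]["text"])
```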
Specify your translation glossary
A word can be translated in different ways. For example, the English word oil can be translated into Russian as масло or нефть. To improve the accuracy of translations, use a glossary that gives a single translation for each of your terms and phrases.
Specify the glossary in the glossaryConfig field. Currently, you can only pass a glossary as an array of text pairs.
In the sourceLanguageCode field, specify the source language. This field is required when you use glossaries:
{
  "sourceLanguageCode": "tr",
  "targetLanguageCode": "ru",
  "texts": [
    "cırtlı çocuk spor ayakkabı"
  ],
  "folderId": "<folder_ID>",
  "glossaryConfig": {
    "glossaryData": {
      "glossaryPairs": [
        {
          "sourceText": "spor ayakkabı",
          "translatedText": "sneakers"
        }
      ]
    }
  }
}
Where:
- sourceLanguageCode: Source language. You can get the language code together with a list of supported languages.
- targetLanguageCode: Target language.
- texts: Text to translate as a list of strings.
- folderId: Folder ID received before starting.
- glossaryConfig: Glossary, passed as an array of text pairs. Each pair consists of a source term (sourceText) and its translation (translatedText).
Save the request body to a file (for example, body.json) and submit the file using the translate method:
export IAM_TOKEN=<IAM_token>
curl \
--request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${IAM_TOKEN}" \
--data '@<path_to_JSON_file>' \
"https://translate.api.cloud.yandex.net/translate/v2/translate"
Where IAM_TOKEN is the IAM token received before starting.
The response will contain a translation based on the terms from your glossary:
{
  "translations": [
    {
      "text": "Children's sneakers with velcro"
    }
  ]
}
Without the glossary, the translation would be:
{
  "translations": [
    {
      "text": "Children's sport shoes with velcro"
    }
  ]
}
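When a glossary holds many terms, writing the nested glossaryConfig structure by hand is error-prone. The sketch below builds the request body shown in this section from a plain Python mapping of source terms to translations. The function name build_glossary_body and its parameters are hypothetical; the field names are the ones used in the example above.

```python
def build_glossary_body(folder_id, texts, source, target, glossary):
    """Build a translate request body with a glossary.

    Hypothetical helper: `glossary` maps each source term to its
    required translation.
    """
    # The source language is required whenever a glossary is used.
    if not source:
        raise ValueError("sourceLanguageCode is required with a glossary")
    pairs = [
        {"sourceText": term, "translatedText": translation}
        for term, translation in glossary.items()
    ]
    return {
        "sourceLanguageCode": source,
        "targetLanguageCode": target,
        "texts": texts,
        "folderId": folder_id,
        "glossaryConfig": {"glossaryData": {"glossaryPairs": pairs}},
    }


body = build_glossary_body(
    "<folder_ID>",
    ["cırtlı çocuk spor ayakkabı"],
    "tr",
    "ru",
    {"spor ayakkabı": "sneakers"},
)
```

The resulting dictionary can be serialized with json.dumps and saved to body.json for the cURL command above.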
Escaping text
To skip translation of certain text fragments, specify the HTML text format in the request body and wrap the fragments that do not require translation in a <span> tag with the translate=no attribute. For example:
{
  "format": "HTML",
  "texts": [
    "The e-mail has been changed. The new password is <span translate=no>%$Qvd14aa2NMc</span>"
  ]
}
Where:
- format: Text format.
- texts: Text to translate as a list of strings.
The response will contain untranslated text inside the <span> tag:
{
  "translations": [
    {
      "text": "L'e-mail a été modifié. Le nouveau mot de passe est <span translate=\"no\">%$Qvd14aa2NMc</span>"
    }
  ]
}
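If the protected fragments come from variables (passwords, product codes, user names), the wrapping can be automated. The following is a minimal sketch: the helper name no_translate is hypothetical, and escaping the fragment with html.escape is an assumption made here so that characters such as < and & are not read as markup. The documentation above does not state how the service treats such characters.

```python
import html


def no_translate(fragment: str) -> str:
    """Wrap a fragment in a <span translate=no> tag.

    Hypothetical helper. Escaping the fragment is an assumption of this
    sketch, not a documented requirement of the service.
    """
    return f"<span translate=no>{html.escape(fragment, quote=False)}</span>"


# Request body in the same shape as the example above.
body = {
    "format": "HTML",
    "texts": [
        "The e-mail has been changed. The new password is "
        + no_translate("%$Qvd14aa2NMc")
    ],
}
```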
Checking words for typos
Misspelled words may be translated incorrectly or transliterated. For example, the word hellas is translated as эллада. If the same word is misspelled, for example as helas, it will be translated as хелас. To check spelling, use the speller parameter:
{
  "sourceLanguageCode": "en",
  "targetLanguageCode": "ru",
  "texts": [
    "helas"
  ],
  "folderId": "<folder_ID>",
  "speller": true
}
Where:
- sourceLanguageCode: Source language. You can get the language code together with a list of supported languages.
- targetLanguageCode: Target language.
- texts: Text to translate as a list of strings.
- folderId: Folder ID received before starting.
- speller: Parameter that enables a spelling check.
Save the request body to a file (for example, body.json) and submit the file using the translate method:
export IAM_TOKEN=<IAM_token>
curl \
--request POST \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${IAM_TOKEN}" \
--data '@<path_to_JSON_file>' \
"https://translate.api.cloud.yandex.net/translate/v2/translate"
Where IAM_TOKEN is the IAM token received before starting.
The response will contain a translation of the word checked for spelling:
{
  "translations": [
    {
      "text": "эллада"
    }
  ]
}
If the spelling check is disabled ("speller": false), the word will be translated as follows:
{
  "translations": [
    {
      "text": "хелас"
    }
  ]
}
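To compare results with and without the spelling check in a script, it helps to build the body with the speller flag as a parameter and to pull the translated strings out of the response. This is a minimal sketch: the function names build_speller_body and extract_translations are hypothetical, and the request and response shapes are the ones shown in this section.

```python
import json


def build_speller_body(folder_id, texts, source, target, speller=True):
    """Build a translate request body with the speller flag set.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "sourceLanguageCode": source,
        "targetLanguageCode": target,
        "texts": texts,
        "folderId": folder_id,
        "speller": speller,
    }


def extract_translations(response_text: str) -> list:
    """Return the translated strings from a translate response body."""
    payload = json.loads(response_text)
    return [item["text"] for item in payload.get("translations", [])]


body = build_speller_body("<folder_ID>", ["helas"], "en", "ru")

# Sample response copied from the example above, not a live API call.
sample_response = '{"translations": [{"text": "эллада"}]}'
translated = extract_translations(sample_response)
```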