Detecting the text language using Translate
To detect the language of a text, use the detectLanguage method.
Note
The detectLanguage method returns the language code of the source text. If the language cannot be detected, the language code field in the response will be empty.
Getting started
To use the examples, install cURL.
The examples below are intended to be run on macOS and Linux. To run them on Windows, see how to work with Bash in Microsoft Windows.
The Translate API requires you to send your authentication credentials in each request. The authentication method depends on the type of account used to send your request:
Yandex account or federated account:
- Get an IAM token to authenticate your Yandex account or federated account. Pass the token in the Authorization header of each request in the following format: Authorization: Bearer <IAM token>
- Get the ID of any folder for which your account has the ai.translate.user role or higher. Make sure to include your folder ID in the folderId field in the body of each request.
Service account:
- Choose one of the authentication methods:
  - Get an IAM token. Include the IAM token in the Authorization header in the following format: Authorization: Bearer <IAM token>
  - Create an API key. Include the API key in the Authorization header in the following format: Authorization: Api-Key <API key>
- Assign the service account the ai.translate.user role or higher for the folder where it was created. Do not specify the folder ID in your requests, as Translate uses the folder in which the service account was created.
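The choice between the two Authorization header formats described above can be sketched as a small shell helper. This is only an illustration: the auth_header function and the API_KEY variable name are assumptions made for this sketch, while IAM_TOKEN matches the variable used in the examples in this article.

```shell
# Hypothetical helper (not part of the Translate API): prints the
# Authorization header for whichever credential is set.
# IAM_TOKEN is the variable used in this article; API_KEY is an assumed name.
auth_header() {
  if [ -n "${IAM_TOKEN:-}" ]; then
    # IAM token: "Authorization: Bearer <IAM token>"
    printf 'Authorization: Bearer %s\n' "${IAM_TOKEN}"
  elif [ -n "${API_KEY:-}" ]; then
    # API key (service accounts only): "Authorization: Api-Key <API key>"
    printf 'Authorization: Api-Key %s\n' "${API_KEY}"
  else
    echo "Set IAM_TOKEN or API_KEY first" >&2
    return 1
  fi
}

# Usage: curl ... --header "$(auth_header)" ...
```

If both variables are set, this sketch prefers the IAM token; that ordering is an arbitrary choice and not a service requirement.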
Detect the language of a phrase
In this example, we will detect the language that the phrase Hello, world is written in.
To detect the language of the text, pass it in the detectLanguage request body:
export FOLDER_ID=<folder_ID>
export IAM_TOKEN=<IAM_token>
export TEXT="Hello, world"
curl \
  --request POST \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${IAM_TOKEN}" \
  --data "{\"folderId\": \"${FOLDER_ID}\", \"text\": \"${TEXT}\"}" \
  "https://translate.api.cloud.yandex.net/translate/v2/detect"
Where:
FOLDER_ID: Folder ID you got before you started.
IAM_TOKEN: IAM token you got before you started.
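The request above interpolates TEXT directly into the JSON body, which produces invalid JSON if the text contains double quotes or backslashes. A minimal sketch of one way to guard against that is shown here; json_escape and build_body are hypothetical helper names, and the escaping covers only backslashes and double quotes, not newlines or other control characters.

```shell
# Hypothetical helpers, assuming the text contains no newlines or
# control characters (those would need additional escaping).

# Escape backslashes first, then double quotes, for use in a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a detectLanguage request body for a folder ID and a text.
build_body() {
  printf '{"folderId": "%s", "text": "%s"}\n' "$1" "$(json_escape "$2")"
}

# Usage (same endpoint and headers as the example above):
# curl \
#   --request POST \
#   --header "Content-Type: application/json" \
#   --header "Authorization: Bearer ${IAM_TOKEN}" \
#   --data "$(build_body "${FOLDER_ID}" "${TEXT}")" \
#   "https://translate.api.cloud.yandex.net/translate/v2/detect"
```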
The service will respond with the language code of the source text:
{
  "languageCode": "en"
}
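Since the language code field is empty when the language cannot be detected, a script that consumes the response may want to check for that case. The following is a rough sketch under the assumption that the response has the flat shape shown above; extract_language_code is a hypothetical helper, and a JSON-aware tool would be more robust than this sed expression for anything beyond simple responses.

```shell
# Hypothetical helper: reads a JSON response on stdin and prints the
# value of languageCode (nothing if the field is missing or empty).
# Assumes the flat response shape shown above; not a general JSON parser.
extract_language_code() {
  tr -d '\n' | sed -n 's/.*"languageCode"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response copied from the example above.
response='{
  "languageCode": "en"
}'

code="$(printf '%s' "$response" | extract_language_code)"
if [ -n "$code" ]; then
  echo "Detected language: $code"
else
  echo "Language could not be detected"
fi
```

In a real script, the response variable would hold the output of the curl command rather than a hard-coded sample.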
Specify the most likely languages
Some words are spelled the same in different languages. For example, the English word hand is also written as hand in German, Swedish, and Dutch. If the text you provide contains such words, Translate may detect the wrong source language.
To avoid mistakes, you can use the languageCodeHints field to specify which languages should be prioritized when determining the text language:
{
  "folderId": "<folder_ID>",
  "languageCodeHints": ["ru", "de"],
  "text": "hand"
}
Where:
folderId: Folder ID you got before you started.
languageCodeHints: Languages to prioritize when determining the language of the text.
text: Text whose language you want to detect, as a string.
Save the request body to a file (e.g., body.json) and pass the file in a detectLanguage request:
export IAM_TOKEN=<IAM_token>
curl \
  --request POST \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${IAM_TOKEN}" \
  --data '@body.json' \
  "https://translate.api.cloud.yandex.net/translate/v2/detect"
Where IAM_TOKEN is the IAM token you got before you started.
The service will respond with the language code of the source text:
{
  "languageCode": "de"
}
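The body.json file used in this section can also be generated from the command line instead of being written by hand. The sketch below assumes a hypothetical helper named hints_body that takes a folder ID, a text, and any number of language codes. It does not escape the text, so it assumes the text contains no double quotes or backslashes.

```shell
# Hypothetical helper: prints a detectLanguage request body with
# languageCodeHints built from the remaining arguments.
# Usage: hints_body <folder_id> <text> [<language_code>...]
# Assumes the text needs no JSON escaping.
hints_body() {
  hb_folder_id="$1"
  hb_text="$2"
  shift 2
  hb_hints=""
  for hb_code in "$@"; do
    # Append each code as a quoted string, comma-separated.
    hb_hints="${hb_hints:+$hb_hints, }\"$hb_code\""
  done
  printf '{"folderId": "%s", "languageCodeHints": [%s], "text": "%s"}\n' \
    "$hb_folder_id" "$hb_hints" "$hb_text"
}

# Usage: write the body to a file, then send it with --data '@body.json'.
# hints_body "<folder_ID>" "hand" ru de > body.json
```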