How to get started with the Search API using API v2
API v2 is the latest and recommended interface for Search API. It is fully integrated into the Yandex Cloud ecosystem and supports both API key authentication and the more secure authentication based on short-lived IAM tokens.
Getting started
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud, or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and that its status is ACTIVE or TRIAL_ACTIVE. If you do not have a billing account, create one.
If you have an active billing account, you can go to the cloud page. Learn more about clouds and folders.
To use the examples, install cURL. The gRPC examples also use gRPCurl, and the decoding step uses jq.
Prepare your cloud
- To authenticate with API v2 as a service account, create a service account.
- Assign the search-api.webSearch.user role to the user or service account you will use to run queries.
- Get an IAM token, which is required for authentication.
The following examples use IAM token authentication. To authenticate with a service account's API key instead, edit the Authorization header in the query examples. For more information, see Authentication in API v2.
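The difference between the two authentication methods comes down to the value of the Authorization header. A minimal Python sketch follows; the `auth_header` helper is hypothetical, and the `Api-Key` scheme for API keys is an assumption that should be checked against Authentication in API v2.

```python
def auth_header(secret: str, use_api_key: bool = False) -> dict[str, str]:
    """Return the Authorization header for an IAM token or an API key."""
    # "Bearer" matches the IAM token examples in this guide.
    # Assumption: API keys are sent with the "Api-Key" scheme instead.
    scheme = "Api-Key" if use_api_key else "Bearer"
    return {"Authorization": f"{scheme} {secret}"}


# Placeholders only; substitute a real token or key.
print(auth_header("<IAM_token>"))
print(auth_header("<API_key>", use_api_key=True))
```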
Create a search query
This request example returns the fifth page of search results for the Yandex query. Search type: Russian. Search region: Novosibirsk Oblast. Notification language: Russian. The family filter is applied to the search results. The number of passages is three. The results are grouped by domain and sorted by relevance. Each group contains three documents, and five groups are returned per page.
For more information about the request body parameters, see Query body format.
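The description above maps onto the request body fields as in the following Python sketch. The `build_search_body` helper is hypothetical. It assumes page numbering starts at zero, which is consistent with the example requesting the fifth page as `"page": "4"`, and it uses the camelCase field names of the REST request body.

```python
import json


def build_search_body(folder_id: str, query_text: str, page_number: int) -> dict:
    """Build a REST request body; page_number is 1-based for readability."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    return {
        "query": {
            "searchType": "SEARCH_TYPE_RU",       # Russian search type
            "queryText": query_text,
            "familyMode": "FAMILY_MODE_STRICT",   # family filter applied
            # Assumption: pages are counted from zero, so page 5 becomes "4".
            "page": str(page_number - 1),
        },
        "sortSpec": {
            "sortMode": "SORT_MODE_BY_RELEVANCE",
            "sortOrder": "SORT_ORDER_DESC",
        },
        "groupSpec": {
            "groupMode": "GROUP_MODE_DEEP",       # group results by domain
            "groupsOnPage": "5",
            "docsInGroup": "3",
        },
        "maxPassages": "3",
        "region": "65",                           # Novosibirsk Oblast
        "l10N": "LOCALIZATION_RU",                # notification language
        "folderId": folder_id,
    }


# Print the body so it can be saved as body.json.
print(json.dumps(build_search_body("<folder_ID>", "Yandex", 5), indent=2))
```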
REST API
- Create a file with the request body, e.g., body.json, and specify the ID of the folder you are going to use to work with Search API in the folderId field:
body.json
{
  "query": {
    "searchType": "SEARCH_TYPE_RU",
    "queryText": "Yandex",
    "familyMode": "FAMILY_MODE_STRICT",
    "page": "4"
  },
  "sortSpec": {
    "sortMode": "SORT_MODE_BY_RELEVANCE",
    "sortOrder": "SORT_ORDER_DESC"
  },
  "groupSpec": {
    "groupMode": "GROUP_MODE_DEEP",
    "groupsOnPage": "5",
    "docsInGroup": "3"
  },
  "maxPassages": "3",
  "region": "65",
  "l10N": "LOCALIZATION_RU",
  "folderId": "<folder_ID>"
}
- Run an HTTP request, specifying the IAM token you got earlier:
curl \
  --request POST \
  --header "Authorization: Bearer <IAM_token>" \
  --data "@body.json" \
  "https://searchapi.api.cloud.yandex.net/v2/web/searchAsync"
Result:
{
  "done": false,
  "id": "sppger465oq1********",
  "description": "WEB search async",
  "createdAt": "2024-10-02T19:51:02Z",
  "createdBy": "bfbud0oddqp4********",
  "modifiedAt": "2024-10-02T19:51:03Z"
}
gRPC API
- Create a file with the request body, e.g., body.json, and specify the ID of the folder you are going to use to work with Search API in the folder_id field:
body.json
{
  "query": {
    "search_type": "SEARCH_TYPE_RU",
    "query_text": "Yandex",
    "family_mode": "FAMILY_MODE_STRICT",
    "page": "4"
  },
  "sort_spec": {
    "sort_mode": "SORT_MODE_BY_RELEVANCE",
    "sort_order": "SORT_ORDER_DESC"
  },
  "group_spec": {
    "group_mode": "GROUP_MODE_DEEP",
    "groups_on_page": "5",
    "docs_in_group": "3"
  },
  "max_passages": "3",
  "region": "65",
  "l10n": "LOCALIZATION_RU",
  "folder_id": "<folder_ID>"
}
- Run a gRPC call, specifying the IAM token you got earlier:
grpcurl \
  -rpc-header "Authorization: Bearer <IAM_token>" \
  -d @ < body.json \
  searchapi.api.cloud.yandex.net:443 yandex.cloud.searchapi.v2.WebSearchAsyncService/Search
Result:
{
  "id": "spp3gp3vhna6********",
  "description": "WEB search async",
  "createdAt": "2024-10-02T19:14:41Z",
  "createdBy": "bfbud0oddqp4********",
  "modifiedAt": "2024-10-02T19:14:42Z"
}
Save the ID of the returned Operation object (the id value) for later use.
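For scripted use, the REST call above can also be made from Python with the standard library alone. This is a sketch under stated assumptions: the `build_search_request` and `submit_search` helpers are hypothetical, and the `Content-Type: application/json` header is an addition that the cURL example does not set.

```python
import json
import urllib.request

SEARCH_URL = "https://searchapi.api.cloud.yandex.net/v2/web/searchAsync"


def build_search_request(body: dict, iam_token: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {iam_token}",
            # Assumption: an explicit JSON content type is accepted.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_search(body: dict, iam_token: str) -> str:
    """Send the request and return the ID of the Operation object."""
    with urllib.request.urlopen(build_search_request(body, iam_token)) as resp:
        operation = json.load(resp)
    return operation["id"]


# Usage (performs a real network call, so it is left commented out):
# with open("body.json", encoding="utf-8") as f:
#     operation_id = submit_search(json.load(f), "<IAM_token>")
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and payload before any network traffic occurs.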
Make sure the request was executed successfully
Wait until Search API executes the request and generates a response. This may take from five minutes to a few hours.
Make sure the query was executed successfully:
Run an HTTP request:
curl \
--request GET \
--header "Authorization: Bearer <IAM_token>" \
https://operation.api.cloud.yandex.net/operations/<query_ID>
Where:
- <IAM_token>: Previously obtained IAM token.
- <query_ID>: The Operation object ID you saved at the previous step.
Result:
{
"done": true,
"response": {
"@type": "type.googleapis.com/yandex.cloud.searchapi.v2.WebSearchResponse",
"rawData": "<Base64_encoded_XML_response_body>"
},
"id": "spp82pc07ebl********",
"description": "WEB search async",
"createdAt": "2024-10-03T08:07:07Z",
"createdBy": "bfbud0oddqp4********",
"modifiedAt": "2024-10-03T08:12:09Z"
}
Run this gRPC call:
grpcurl \
-rpc-header "Authorization: Bearer <IAM_token>" \
-d '{"operation_id": "<query_ID>"}' \
operation.api.cloud.yandex.net:443 yandex.cloud.operation.OperationService/Get
Where:
- <IAM_token>: Previously obtained IAM token.
- <query_ID>: The Operation object ID you saved at the previous step.
Result:
{
"id": "spp82pc07ebl********",
"description": "WEB search async",
"createdAt": "2024-10-03T08:07:07Z",
"createdBy": "bfbud0oddqp4********",
"modifiedAt": "2024-10-03T08:12:09Z",
"done": true,
"response": {
"@type": "type.googleapis.com/yandex.cloud.searchapi.v2.WebSearchResponse",
"rawData": "<Base64_encoded_XML_response_body>"
}
}
If the done field is set to true and the response object is present in the output, the query has completed successfully and you can move on to the next step. Otherwise, repeat the check later.
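The "check, then repeat later" procedure can be automated with a simple polling loop. The Python sketch below is illustrative: `is_complete` and `wait_for_operation` are hypothetical helpers, the `fetch` callable stands in for whichever client performs the Operation GET request, and the five-minute default interval is an assumption based on the minimum wait time mentioned above.

```python
import time
from typing import Callable


def is_complete(operation: dict) -> bool:
    """True when done is true and the response object is present."""
    return operation.get("done") is True and "response" in operation


def wait_for_operation(
    fetch: Callable[[str], dict],
    operation_id: str,
    interval_seconds: float = 300.0,
    max_attempts: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch(operation_id) until the operation completes."""
    for attempt in range(max_attempts):
        operation = fetch(operation_id)
        if is_complete(operation):
            return operation
        # Wait before the next check, except after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"Operation {operation_id} did not complete in time")


# Demonstration with a stub instead of a real API call.
_stub = iter([{"done": False}, {"done": True, "response": {"rawData": ""}}])
result = wait_for_operation(lambda _id: next(_stub), "<query_ID>",
                            interval_seconds=0, sleep=lambda s: None)
print(result["done"])
```

Passing `fetch` and `sleep` as parameters keeps the loop independent of the transport (REST or gRPC) and easy to exercise without waiting.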
Get a response
After Search API has successfully processed the query:
- Get the result:
REST API
curl \
  --request GET \
  --header "Authorization: Bearer <IAM_token>" \
  https://operation.api.cloud.yandex.net/operations/<query_ID> \
  > result.json
gRPC API
grpcurl \
  -rpc-header "Authorization: Bearer <IAM_token>" \
  -d '{"operation_id": "<query_ID>"}' \
  operation.api.cloud.yandex.net:443 yandex.cloud.operation.OperationService/Get \
  > result.json
The search query result will be saved to a file named result.json, which contains a Base64-encoded XML response in the response.rawData field.
Decode the result from
Base64
:echo "$(< result.json)" | \ jq -r .response.rawData | \ base64 --decode > result.xml
The XML response to the query will be saved to a file named
result.xml
.
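The same decoding step can be done in Python where jq and base64 are not available. In this sketch the `extract_xml` helper is hypothetical, and the sample payload is synthetic data used only to show the round trip; it is not a real Search API response.

```python
import base64
import json
import xml.etree.ElementTree as ET


def extract_xml(operation: dict) -> bytes:
    """Decode the Base64 XML body stored in response.rawData."""
    return base64.b64decode(operation["response"]["rawData"])


# Synthetic stand-in for the contents of result.json.
sample = {
    "done": True,
    "response": {
        "rawData": base64.b64encode(b"<root><item>1</item></root>").decode("ascii")
    },
}

xml_bytes = extract_xml(sample)
root = ET.fromstring(xml_bytes)  # confirms the payload is well-formed XML
print(root.tag)

# With a real file, load result.json and write the decoded XML:
# with open("result.json", encoding="utf-8") as f:
#     xml_bytes = extract_xml(json.load(f))
# with open("result.xml", "wb") as f:
#     f.write(xml_bytes)
```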