Text search with API v2
Note
This feature is in the Preview stage.
With Search API v2, you can run queries against the Yandex search database and get responses in deferred (asynchronous) mode. You can submit queries using REST API and gRPC API. The search results depend on the parameters specified in your query.
Queries can be submitted by a user or service account with the search-api.webSearch.user role.
In response to a deferred query, Search API returns an Operation object containing the operation info: status, ID, call time, etc.
With the Operation object ID, you can track the query execution status and get the results once the processing is complete.
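The tracking step above can be sketched as a simple polling loop. This is a minimal illustration, not the service's client: fetch_operation is a hypothetical callable that you supply (for example, a wrapper around your HTTP or gRPC client) and that returns the Operation object as a dictionary; the interval and timeout values are arbitrary assumptions.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    fetch_operation: Callable[[str], Dict[str, Any]],  # hypothetical: returns the Operation object
    operation_id: str,
    poll_interval: float = 1.0,  # assumed value, tune for your workload
    timeout: float = 60.0,       # assumed value, tune for your workload
) -> Dict[str, Any]:
    """Poll the operation until its "done" field is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch_operation(operation_id)
        if operation.get("done"):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {operation_id} did not finish within {timeout} s")
        time.sleep(poll_interval)
```

Passing the fetcher in as a parameter keeps the loop independent of the transport, so the same function can be used with either REST API or gRPC API.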
Query body format
The names of the query body fields differ between REST API and gRPC API: the former uses camelCase, the latter uses snake_case.
REST API query body:
{
"query": {
"searchType": "<search_type>",
"queryText": "<search_query_text>",
"familyMode": "<result_filter_setting_value>",
"page": "<page_number>"
},
"sortSpec": {
"sortMode": "<result_sorting_rule>",
"sortOrder": "<result_sorting_order>"
},
"groupSpec": {
"groupMode": "<result_grouping_method>",
"groupsOnPage": "<number_of_groups_per_page>",
"docsInGroup": "<number_of_documents_per_group>"
},
"maxPassages": "<maximum_number_of_passages>",
"region": "<region_ID>",
"l10N": "<notification_language>",
"folderId": "<folder_ID>"
}
Where:
- searchType: Search type. The possible values are:
  - SEARCH_TYPE_RU: For the Russian search type.
  - SEARCH_TYPE_TR: For the Turkish search type.
  - SEARCH_TYPE_COM: For the International search type.
- queryText: Search query text. The maximum length is 400 characters.
- familyMode: Results filtering. This is an optional parameter. The possible values are:
  - FAMILY_MODE_MODERATE: Moderate filter (default). Documents of the Adult category are excluded from search results unless a query is explicitly made for searching resources of this category.
  - FAMILY_MODE_NONE: Filtering is disabled. Search results include any documents regardless of their contents.
  - FAMILY_MODE_STRICT: Family filter. Regardless of a search query, documents of the Adult category and those with profanity are excluded from search results.
- page: Requested page number. This is an optional parameter. By default, the first page with search results is returned. Page numbering starts from zero (0 stands for page 1).
- sortMode: Rule defining the search results sorting mode. This is an optional parameter. The possible values are:
  - SORT_MODE_BY_RELEVANCE: Sorting by relevance (default).
  - SORT_MODE_BY_TIME: Sorting by document update time.
- sortOrder: Search results sorting order. This is an optional parameter. The possible values are:
  - SORT_ORDER_DESC: Forward sorting order from most recent to oldest (default).
  - SORT_ORDER_ASC: Reverse sorting order from oldest to most recent.
- groupMode: Result grouping method. This is an optional parameter. The possible values are:
  - GROUP_MODE_DEEP: Grouping by domain. Each group contains documents from one domain (default).
  - GROUP_MODE_FLAT: Flat grouping. Each group contains a single document.
- groupsOnPage: Maximum number of groups that can be returned per search results page. This is an optional parameter. The values range from 1 to 100. Default value: 20.
- docsInGroup: Maximum number of documents that can be returned per group. This is an optional parameter. The values range from 1 to 3. Default value: 1.
- maxPassages: Maximum number of passages that can be used when generating a document snippet. This is an optional parameter. The values range from 1 to 5. By default, a maximum of four passages with search query text is returned per document.
- region: Search country or region ID that affects the document ranking rules. Only supported for the Russian and Turkish search types. For a list of frequently used country and region IDs, see Search regions.
- l10N: Search response notifications language. Affects the text in the found-docs-human tag and error messages. This is an optional parameter. Possible values depend on the selected search type:
  - Russian:
    - LOCALIZATION_RU: Russian (default).
    - LOCALIZATION_BE: Belarusian.
    - LOCALIZATION_KK: Kazakh.
    - LOCALIZATION_UK: Ukrainian.
  - Turkish:
    - LOCALIZATION_TR: Turkish.
  - International:
    - LOCALIZATION_EN: English.
- folderId: Folder ID of the user or service account you will use for queries.
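As an illustration of the limits listed above, the following sketch builds a REST API query body and rejects out-of-range values before the query is sent. The function name build_search_body and its defaults are hypothetical; numeric values are emitted as strings only to mirror the quoted placeholders in the template, which is an assumption about the accepted encoding. Other optional fields (familyMode, sortSpec, region, l10N) are left out for brevity.

```python
import json
from typing import Any, Dict


def build_search_body(
    query_text: str,
    folder_id: str,
    search_type: str = "SEARCH_TYPE_RU",
    page: int = 0,             # zero-based: 0 stands for page 1
    groups_on_page: int = 20,  # documented range: 1 to 100
    docs_in_group: int = 1,    # documented range: 1 to 3
    max_passages: int = 4,     # documented range: 1 to 5
) -> Dict[str, Any]:
    """Build a REST API (camelCase) query body, validating the documented limits."""
    if len(query_text) > 400:
        raise ValueError("queryText must not exceed 400 characters")
    if page < 0:
        raise ValueError("page numbering starts from zero")
    if not 1 <= groups_on_page <= 100:
        raise ValueError("groupsOnPage must be between 1 and 100")
    if not 1 <= docs_in_group <= 3:
        raise ValueError("docsInGroup must be between 1 and 3")
    if not 1 <= max_passages <= 5:
        raise ValueError("maxPassages must be between 1 and 5")
    return {
        "query": {
            "searchType": search_type,
            "queryText": query_text,
            "page": str(page),  # assumption: numbers sent as strings, as in the template
        },
        "groupSpec": {
            "groupsOnPage": str(groups_on_page),
            "docsInGroup": str(docs_in_group),
        },
        "maxPassages": str(max_passages),
        "folderId": folder_id,
    }


if __name__ == "__main__":
    # "<folder_ID>" is a placeholder, as in the template above.
    print(json.dumps(build_search_body("yandex cloud", "<folder_ID>"), indent=2))
```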
gRPC API query body:
{
"query": {
"search_type": "<search_type>",
"query_text": "<search_query_text>",
"family_mode": "<result_filter_setting_value>",
"page": "<page_number>"
},
"sort_spec": {
"sort_mode": "<result_sorting_rule>",
"sort_order": "<sort_order_of_results>"
},
"group_spec": {
"group_mode": "<result_grouping_method>",
"groups_on_page": "<number_of_groups_per_page>",
"docs_in_group": "<number_of_documents_per_group>"
},
"max_passages": "<maximum_number_of_passages>",
"region": "<region_ID>",
"l10n": "<notification_language>",
"folder_id": "<folder_ID>"
}
Where:
- search_type: Search type. The possible values are:
  - SEARCH_TYPE_RU: For the Russian search type.
  - SEARCH_TYPE_TR: For the Turkish search type.
  - SEARCH_TYPE_COM: For the International search type.
- query_text: Search query text. The maximum length is 400 characters.
- family_mode: Results filtering. This is an optional parameter. The possible values are:
  - FAMILY_MODE_MODERATE: Moderate filter (default). Documents of the Adult category are excluded from search results unless a query is explicitly made for searching resources of this category.
  - FAMILY_MODE_NONE: Filtering is disabled. Search results include any documents regardless of their contents.
  - FAMILY_MODE_STRICT: Family filter. Regardless of a search query, documents of the Adult category and those with profanity are excluded from search results.
- page: Requested page number. This is an optional parameter. By default, the first page with search results is returned. Page numbering starts from zero (0 stands for page 1).
- sort_mode: Rule defining the search results sorting mode. This is an optional parameter. The possible values are:
  - SORT_MODE_BY_RELEVANCE: Sorting by relevance (default).
  - SORT_MODE_BY_TIME: Sorting by document update time.
- sort_order: Search results sorting order. This is an optional parameter. The possible values are:
  - SORT_ORDER_DESC: Forward sorting order from most recent to oldest (default).
  - SORT_ORDER_ASC: Reverse sorting order from oldest to most recent.
- group_mode: Result grouping method. This is an optional parameter. The possible values are:
  - GROUP_MODE_DEEP: Grouping by domain. Each group contains documents from one domain (default).
  - GROUP_MODE_FLAT: Flat grouping. Each group contains a single document.
- groups_on_page: Maximum number of groups that can be returned per search results page. This is an optional parameter. The values range from 1 to 100. Default value: 20.
- docs_in_group: Maximum number of documents that can be returned per group. This is an optional parameter. The values range from 1 to 3. Default value: 1.
- max_passages: Maximum number of passages that can be used when generating a document snippet. This is an optional parameter. The values range from 1 to 5. By default, a maximum of four passages with search query text is returned per document.
- region: Search country or region ID that affects the document ranking rules. Only supported for the Russian and Turkish search types. For a list of frequently used country and region IDs, see Search regions.
- l10n: Search response notifications language. Affects the text in the found-docs-human tag and error messages. This is an optional parameter. Possible values depend on the selected search type:
  - Russian:
    - LOCALIZATION_RU: Russian (default).
    - LOCALIZATION_BE: Belarusian.
    - LOCALIZATION_KK: Kazakh.
    - LOCALIZATION_UK: Ukrainian.
  - Turkish:
    - LOCALIZATION_TR: Turkish.
  - International:
    - LOCALIZATION_EN: English.
- folder_id: Folder ID of the user or service account you will use for queries.
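Because the two query bodies differ only in field naming, a body written for one API can be converted mechanically to the other. The sketch below renames camelCase keys to snake_case; the helper names are hypothetical. Note the one irregular pair in the field lists above: l10N maps to l10n, which a naive converter would turn into l10_n, so it is handled explicitly. Only keys are renamed; enum values such as SEARCH_TYPE_RU are left untouched.

```python
import re
from typing import Any

# Field names whose snake_case form does not follow the generic rule.
_SPECIAL_NAMES = {"l10N": "l10n"}


def to_snake_case(name: str) -> str:
    """Convert a camelCase field name to snake_case."""
    if name in _SPECIAL_NAMES:
        return _SPECIAL_NAMES[name]
    # Insert an underscore before every uppercase letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def rest_to_grpc(value: Any) -> Any:
    """Recursively rename dictionary keys; leave all values unchanged."""
    if isinstance(value, dict):
        return {to_snake_case(key): rest_to_grpc(item) for key, item in value.items()}
    if isinstance(value, list):
        return [rest_to_grpc(item) for item in value]
    return value
```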
Response format
In response to a deferred query, Search API returns an Operation object in the following format:
{
"done": true,
"response": {
"@type": "type.googleapis.com/yandex.cloud.searchapi.v2.WebSearchResponse",
"rawData": "<Base64_encoded_XML_response_body>"
},
"id": "<operation_object_ID>",
"description": "WEB search async",
"createdAt": "2024-10-03T08:07:07Z",
"createdBy": "<subject_ID>",
"modifiedAt": "2024-10-03T08:12:09Z"
}
{
"id": "<operation_object_ID>",
"description": "WEB search async",
"createdAt": "2024-10-03T08:07:07Z",
"createdBy": "<subject_ID>",
"modifiedAt": "2024-10-03T08:12:09Z",
"done": true,
"response": {
"@type": "type.googleapis.com/yandex.cloud.searchapi.v2.WebSearchResponse",
"rawData": "<Base64_encoded_XML_response_body>"
}
}
The response object within the Operation object becomes available only after the query has been executed on the Search API side and the value of the done (operation status) field changes to true.
The rawData field of the response object contains the Base64-encoded XML response body.
For more information about the Operation object and its fields, see Operation object.
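Once the operation is done, the rawData value has to be Base64-decoded before the XML can be read. A minimal sketch using only the Python standard library follows; the function name decode_raw_data is hypothetical, and the sample XML in the usage part is an invented stand-in, not the actual response schema.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Any, Dict


def decode_raw_data(operation: Dict[str, Any]) -> ET.Element:
    """Decode response.rawData of a finished Operation object into an XML tree."""
    if not operation.get("done"):
        raise ValueError("operation is not finished yet, so no response is available")
    xml_bytes = base64.b64decode(operation["response"]["rawData"])
    return ET.fromstring(xml_bytes)


if __name__ == "__main__":
    # Invented sample payload for illustration only.
    sample_xml = b"<result><found-docs-human>sample text</found-docs-human></result>"
    operation = {
        "done": True,
        "response": {"rawData": base64.b64encode(sample_xml).decode("ascii")},
    }
    root = decode_raw_data(operation)
    print(root.findtext("found-docs-human"))
```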