Filtering by parameters in REST API
Filtering by parameters allows you to find dialogs matching the specified conditions. The query returns IDs of matching dialogs. For details on how to get information about a dialog by its ID, see this guide.
You can use filtering on its own or in addition to full-text search. When combined with full-text search, the response contains only the IDs of dialogs that satisfy both the full-text search criteria and the additional filters.
Getting started
To search data via the Yandex Cloud REST API:
- In the management console, create a service account.
- Add the service account to the space with the Data viewer role. This will allow your service account to work with data in SpeechSense.
- To authenticate to the Yandex Cloud API, create an API key or IAM token for the service account.
- Upload voice call recordings or chat transcripts to SpeechSense.
Filtering by parameters
- Create a file named search.json and specify the required IDs and filters in it. Each entry in filters takes a key and one or more of the filter types shown here:

  {
    "organizationId": "<organization_ID>",
    "spaceId": "<space_ID>",
    "connectionId": "<connection_ID>",
    "projectId": "<project_ID>",
    "filters": [
      {
        "key": "<dialog_feature_used_for_filtering>",
        "channelNumber": "<channel_number>",
        "anyMatch": {
          "values": [
            "<search_query>"
          ]
        },
        "intRange": {
          "fromValue": "<lower_limit>",
          "toValue": "<upper_limit>",
          "boundsInclusive": {
            "fromInclusive": "<enable_lower_limit:_true_or_false>",
            "toInclusive": "<enable_upper_limit:_true_or_false>"
          }
        },
        "doubleRange": {
          "fromValue": "<lower_limit>",
          "toValue": "<upper_limit>",
          "boundsInclusive": {
            "fromInclusive": "<enable_lower_limit:_true_or_false>",
            "toInclusive": "<enable_upper_limit:_true_or_false>"
          }
        },
        "dateRange": {
          "fromValue": "<lower_limit>",
          "toValue": "<upper_limit>",
          "boundsInclusive": {
            "fromInclusive": "<enable_lower_limit:_true_or_false>",
            "toInclusive": "<enable_upper_limit:_true_or_false>"
          }
        },
        "durationRange": {
          "fromValue": "<lower_limit>",
          "toValue": "<upper_limit>",
          "boundsInclusive": {
            "fromInclusive": "<enable_lower_limit:_true_or_false>",
            "toInclusive": "<enable_upper_limit:_true_or_false>"
          }
        },
        "booleanMatch": {
          "value": "<filter_by_true_or_false>"
        }
      }
    ],
    "sort_data": {
      "fields": [
        {
          "field": "<dialog_feature_used_for_sorting>",
          "order": "<sort_order:_ascending_or_descending>",
          "position": "<sort_field_priority>"
        }
      ]
    },
    "pageSize": "<number_of_documents_per_page>",
    "pageToken": "<next_page_token_with_filtering_results>"
  }
  Where:
  - organizationId: ID of the organization the request takes place in. To get the ID, go to Cloud Center and click under the name of the organization in the section.
  - spaceId: ID of the space the request takes place in. To get the ID, go to SpeechSense, open the page of the space you need, and click ID.
  - connectionId: ID of the connection the request takes place in. To get the ID, go to SpeechSense and open the page of the space you need. On the Connection tab, open the page of the connection and click ID.
  - projectId: ID of the project the request takes place in. To get the ID, go to SpeechSense and open the page of the space you need. On the Projects tab, open the page of the project and click ID.
  - filters: Request body for filtering by individual parameters. Supports the following parameters:
    - key: Dialog feature you are filtering by. The possible values are:
      - userMeta.<field_name>: Filtering by metadata. Here, <field_name> is the metadata field specified when uploading the dialog. Example: userMeta.date.
      - talk.classifiers.<classifier_name>.count: Filtering by classifiers. It takes into account the number of times a certain classifier has been triggered in a dialog.
      - talk.summarization.points.<question_ID>: Filtering by dialog summary. You can get question IDs from the dialog summary together with the dialog data.
      - talk.statistics.<statistics_name>: Filtering by statistics (for audio only):
        - talk.statistics.duration_seconds: Dialog duration in seconds.
        - talk.statistics.simultaneous_silence.duration_seconds: Simultaneous silence duration in seconds.
        - talk.statistics.simultaneous_silence.ratio: Ratio of simultaneous silence to total dialog duration.
        - talk.statistics.simultaneous_speech.duration_seconds: Simultaneous speech duration in seconds.
        - talk.statistics.simultaneous_speech.ratio: Ratio of simultaneous speech to total dialog duration.
        - talk.statistics.interrupts.count: Number of dialog partner interruptions.
        - talk.statistics.phrases.count: Number of phrases in a dialog.
        - talk.statistics.words.count: Number of words in a dialog.
        - talk.statistics.letters.count: Number of characters in a dialog.
        - talk.statistics.words.count_per_second: Number of words per second in the channel specified in the channelNumber parameter.
        - talk.statistics.letters.count_per_second: Number of characters per second in the channel specified in the channelNumber parameter.
        - talk.statistics.interrupts.duration_seconds: Duration of speaker interruptions by another speaker, in seconds. The channel of the interrupting speaker is specified in the channelNumber parameter.
    - channelNumber: Channel number. If you specify this number, filtering is applied to metadata, classifier positives, or statistics related to this channel.
      Channel numbering in chat connections:
      - 0: Agent channel.
      - 1: Customer channel.
      - 2: Bot channel.
      Channel numbering for audio is preset at the connection level and is different from channel numbering for chats.
    The following filters are available:
    - anyMatch: Checks whether the metadata, classifier, statistics, or dialog summary field contains any of the values from the filter. For example, a filter with key = userMeta.ticket_id and values = [123, 345] will return dialogs with 123 or 345 in the ticket_id metadata field.
    - intRange: Checks that the given integer value falls within the range specified in the filter. Suitable for filtering by classifiers, integer metadata fields, and statistics with integer values.
    - doubleRange: Checks that the given floating-point number falls within the range specified in the filter. Suitable for filtering by classifiers, metadata fields, and statistics with floating-point values.
    - dateRange: Checks that the given date value falls within the range specified in the filter.
    - durationRange: Checks that the given duration falls within the range specified in the filter. Suitable for filtering by dialog duration, interruptions, simultaneous speech or silence.
    - booleanMatch: Checks that the given boolean value matches the value in the filter, true or false. Suitable for filtering by dialog summary and boolean metadata fields.
    You can set the boundsInclusive parameter for each range filter. It indicates whether the range limits themselves are included in the range:
    - fromInclusive: Lower limit.
    - toInclusive: Upper limit.
  - sort_data: Sorting parameters for the query results.
    - fields: List of dialog features you are sorting by. Supports the following parameters:
      - field: Dialog feature you are sorting by.
      - order: Sort order: ascending or descending.
      - position: Sorting field priority (when sorting by several dialog features at the same time).
  - pageSize: Number of documents per page.
  - pageToken: Token of the next page with query results.
    If the query results are split into multiple pages, each page has a token of its own. The response to each query contains the next_page_token field (if there is a next page). Paste its value into the pageToken parameter of your query to get the next page of results.
For more information about search query parameters, see the API reference.
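If you would rather generate search.json from a script than write it by hand, the following Python sketch assembles a request body from the fields described above. The helper names (any_match_filter, range_filter, build_search_body) and the sample values are illustrative assumptions, not part of the SpeechSense API. The text above does not state whether numeric limits are sent as JSON numbers or strings, or whether the inclusive flags are JSON booleans, so the sketch passes values through as given and uses booleans for the flags as an assumption.

```python
import json


def any_match_filter(key, values, channel_number=None):
    """Build an anyMatch entry for the `filters` list."""
    entry = {"key": key, "anyMatch": {"values": list(values)}}
    if channel_number is not None:
        entry["channelNumber"] = channel_number
    return entry


def range_filter(kind, key, from_value, to_value,
                 from_inclusive=True, to_inclusive=True, channel_number=None):
    """Build an intRange, doubleRange, dateRange, or durationRange entry."""
    allowed = {"intRange", "doubleRange", "dateRange", "durationRange"}
    if kind not in allowed:
        raise ValueError(f"unknown range filter: {kind}")
    entry = {
        "key": key,
        kind: {
            "fromValue": from_value,
            "toValue": to_value,
            # Assumption: the inclusive flags are sent as JSON booleans.
            "boundsInclusive": {
                "fromInclusive": from_inclusive,
                "toInclusive": to_inclusive,
            },
        },
    }
    if channel_number is not None:
        entry["channelNumber"] = channel_number
    return entry


def build_search_body(organization_id, space_id, connection_id, project_id,
                      filters, page_size=None, page_token=None):
    """Combine the four IDs, the filters, and optional paging fields."""
    body = {
        "organizationId": organization_id,
        "spaceId": space_id,
        "connectionId": connection_id,
        "projectId": project_id,
        "filters": list(filters),
    }
    if page_size is not None:
        body["pageSize"] = page_size
    if page_token:
        body["pageToken"] = page_token
    return body


# Sample values only: replace the placeholders with your own IDs.
body = build_search_body(
    "<organization_ID>", "<space_ID>", "<connection_ID>", "<project_ID>",
    filters=[
        any_match_filter("userMeta.ticket_id", ["123", "345"]),
        # 100 <= word count < 500
        range_filter("intRange", "talk.statistics.words.count", 100, 500,
                     to_inclusive=False),
    ],
    page_size=50,
)
print(json.dumps(body, indent=2))
```

Redirecting the printed output to search.json gives a file you can pass to the cURL command in the last step.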
- Specify the service account's API key:

  export API_KEY=<service_account_API_key>

  If using an IAM token, provide it instead of the API key:

  export IAM_TOKEN=<service_account_IAM_token>
- Send a search query to the SpeechSense API using cURL:

  curl -X POST https://rest-api.speechsense.yandexcloud.net/speechsense/v1/talks/search \
    -H "Content-Type: application/json" \
    -H "authorization: Api-Key ${API_KEY}" \
    -d @search.json

  Here, Api-Key ${API_KEY} passes the API key for authentication. If using an IAM token, specify Bearer ${IAM_TOKEN} instead of Api-Key ${API_KEY}.

  Dialog IDs that meet the filtering criteria will be output to the terminal in JSON format.
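The same request can be sent from a script. The sketch below mirrors the cURL call using only the Python standard library; the function names (auth_header, build_request, search) are hypothetical helpers, while the URL and the header values are the ones from the command above.

```python
import json
import urllib.request

SEARCH_URL = "https://rest-api.speechsense.yandexcloud.net/speechsense/v1/talks/search"


def auth_header(api_key=None, iam_token=None):
    """Return the Authorization value: Api-Key for a key, Bearer for an IAM token."""
    if api_key:
        return f"Api-Key {api_key}"
    if iam_token:
        return f"Bearer {iam_token}"
    raise ValueError("provide either an API key or an IAM token")


def build_request(body, api_key=None, iam_token=None):
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header(api_key, iam_token),
        },
        method="POST",
    )


def search(body, api_key=None, iam_token=None, timeout=30):
    """Send the request and return the parsed JSON response."""
    request = build_request(body, api_key, iam_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To use it, load the contents of search.json with json.load and call search with the value of the API_KEY or IAM_TOKEN environment variable. Keeping build_request separate from search lets you inspect the headers and body before anything goes over the network.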
Request body example for filtering by individual parameters
Suppose you need to find all dialogs with the provider's support service that took place between 11:00 and 12:00 on September 24, 2024. The JSON file with the request parameters will look as follows:
{
  "organizationId": "yc.organization****************",
  "spaceId": "f3fuclf1kufs********",
  "connectionId": "eag0u346n4hn********",
  "projectId": "eag9t3rm3o43********",
  "filters": [
    {
      "key": "userMeta.date",
      "date_range": {
        "from_value": "2024-09-24T11:00:00Z",
        "to_value": "2024-09-24T12:00:00Z"
      }
    }
  ]
}
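The timestamps in this filter are in UTC (note the trailing Z). If your time limits start out as local times, a short Python sketch like the one below can convert them. The helper names and the UTC+3 zone are assumptions made for illustration; the field spelling (date_range, from_value, to_value) follows the example above.

```python
from datetime import datetime, timedelta, timezone


def to_utc_timestamp(moment):
    """Format a timezone-aware datetime as a UTC string ending in 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_filter(key, start, end):
    """Build a date range filter entry in the spelling used by the example."""
    return {
        "key": key,
        "date_range": {
            "from_value": to_utc_timestamp(start),
            "to_value": to_utc_timestamp(end),
        },
    }


# Hypothetical local zone (UTC+3): 14:00-15:00 local time is 11:00-12:00 UTC.
local_zone = timezone(timedelta(hours=3))
start = datetime(2024, 9, 24, 14, 0, tzinfo=local_zone)
end = datetime(2024, 9, 24, 15, 0, tzinfo=local_zone)
print(date_range_filter("userMeta.date", start, end))
```

With these sample values, the helper produces the same from_value and to_value as the request body above.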
Query results:
{
  "talk_ids": [
    "aud95sn63lra********"
  ],
  "talks_count": "1",
  "next_page_token": ""
}
Where:
- talk_ids: IDs of dialogs that meet the filtering criteria.
- talks_count: Number of dialogs that meet the filtering criteria.
- next_page_token: Token of the next page with results. If the results are split into multiple pages, this token is used in the next query to get the next page. If this field is returned empty, the results end on the current page.
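When the results span several pages, the token handling described above can be wrapped in a loop. The following Python sketch is one way to do it, under the assumption that you supply a fetch_page function that sends one request body and returns the parsed response; both iter_talk_ids and the canned fake_fetch are hypothetical names used only for this illustration.

```python
def iter_talk_ids(fetch_page, body):
    """Yield every dialog ID, following next_page_token until it comes back empty."""
    request = dict(body)
    while True:
        response = fetch_page(request)
        yield from response.get("talk_ids", [])
        token = response.get("next_page_token", "")
        if not token:
            break  # an empty token means this was the last page
        request = {**request, "pageToken": token}


# Stand-in for a real HTTP call: two canned pages keyed by page token.
def fake_fetch(request):
    pages = {
        None: {"talk_ids": ["id-1", "id-2"], "next_page_token": "t1"},
        "t1": {"talk_ids": ["id-3"], "next_page_token": ""},
    }
    return pages[request.get("pageToken")]


print(list(iter_talk_ids(fake_fetch, {"pageSize": 2})))
# prints ['id-1', 'id-2', 'id-3']
```

Passing the fetch function as an argument keeps the paging logic independent of how the request is actually sent, so it can be tested without network access.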