Query language in Monitoring
This section describes the Yandex Monitoring query language. It is used to transform metrics when you configure dashboards and alerts, as well as in the MetricsData.read API method.
Selecting metrics
Select metrics using the metric name and selectors that filter label values (for more information, see Labels). You can use the resulting sets of metrics in alerts or pass them to a function as an argument.
Specify the name of a metric and the required labels, folderId and service. For example, the cpu_usage{folderId="zoeu2rgjpqak********", service="compute"} query returns the metrics named cpu_usage for all Yandex Compute Cloud VMs in the folder with the zoeu2rgjpqak******** ID.
Warning
Consider the specifics of the folderId label:
- The label value must always match the selected folder. You cannot query data from other folders. This applies to all query language use cases: when building charts in Metric Explorer or on dashboards, creating alerts, or calling API methods.
- When calling API methods, do not include the label value in the request body (in the query field). Instead, provide folderId as a query parameter in the HTTP request.
A selector consists of a label name, an operator, and an expression that describes a set of label values.
The Yandex Monitoring query language supports the following expressions for filtering label values:
- label="*": Returns all metrics that have the specified label. For example, the host="*" selector returns all metrics that have the host label.
- label="<glob_expression>": Returns all metrics whose label value matches the glob expression. The following special characters are supported:
  - *: Any number of characters (including none). For example, name="folder*" returns all metrics whose name label value starts with the folder prefix.
  - ?: Exactly one arbitrary character. For example, name="metric?" returns all metrics whose name label value is metric followed by exactly one character.
  - |: Any of the specified options. For example, name="metric1|metric2" returns the metrics whose name label value is metric1 or metric2.
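The glob rules above can be sketched in Python. This is an illustration only, not the service's implementation: the glob_to_regex and matches helper names are hypothetical, and the sketch assumes that a pattern must match the whole label value and that only *, ?, and | have a special meaning.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a label glob (*, ?, |) into a compiled regular expression."""
    alternatives = []
    for option in pattern.split("|"):       # "|" separates the allowed options
        parts = []
        for ch in option:
            if ch == "*":
                parts.append(".*")          # any number of characters, including none
            elif ch == "?":
                parts.append(".")           # exactly one arbitrary character
            else:
                parts.append(re.escape(ch)) # everything else is matched literally
        alternatives.append("".join(parts))
    return re.compile("(?:" + "|".join(alternatives) + ")", re.DOTALL)


def matches(pattern: str, value: str) -> bool:
    """Return True if the whole label value matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(value) is not None


print(matches("folder*", "folder-1"))         # prefix match
print(matches("metric?", "metric"))           # no trailing character, so no match
print(matches("metric1|metric2", "metric2"))  # one of the listed options
```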
Using query names as variables
The query language lets you reference the results of other queries by name, the same way you would use variables.
For example:
A: "temperature"{folderId="my_folder_id", service="custom", room="bedroom", building="home", sensor="sensor1" }
B: "temperature"{folderId="my_folder_id", service="custom", room="bedroom", building="home", sensor="sensor2" }
C: (A + B) / 2
You can reference queries by name only in text mode, and only queries defined above the current one in the same alert or chart. You can apply any supported arithmetic operations and query language functions to variables.
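To make the arithmetic concrete, here is a small Python sketch of what an expression like (A + B) / 2 means for two single-series results. The data values and the combine helper are made up for illustration, and the sketch assumes that both series have points at the same timestamps; how the service aligns points with different timestamps is not covered here.

```python
# Hypothetical results of queries A and B: timestamp (seconds) -> value.
A = {0: 20.0, 60: 21.0, 120: 22.0}
B = {0: 22.0, 60: 23.0, 120: 21.0}


def combine(a, b, op):
    """Apply a binary operation point by point at the timestamps both series share."""
    return {ts: op(a[ts], b[ts]) for ts in sorted(a.keys() & b.keys())}


# C: (A + B) / 2
C = combine(A, B, lambda x, y: (x + y) / 2)
print(C)
```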
Data types
The Yandex Monitoring query language supports the following data types:
- timeseries_vector: Set of timeseries (metrics).
- number: Real number.
- string: String in single or double quotes.
- duration: Time period in the 15s, 10m, 3h, 7d, 2w format (without quotation marks).
- bool: Boolean type, which is either true or false.
- scalar: Real double-precision floating-point number based on the IEEE 754 standard, including the special NaN value.
Note
The real number type supports scientific notation with a fraction and a power of ten, as well as the following suffixes:
- k: 10^3
- M: 10^6
- G: 10^9
- T: 10^12
- P: 10^15
- E: 10^18
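As an illustration of the suffixes above, the following Python sketch converts a numeric literal to a float. The parse_number helper is hypothetical, and the exact grammar the query language accepts (for example, whether a suffix can follow an exponent) is an assumption.

```python
# Multipliers for the suffixes listed in the note above.
SUFFIXES = {
    "k": 10**3,
    "M": 10**6,
    "G": 10**9,
    "T": 10**12,
    "P": 10**15,
    "E": 10**18,
}


def parse_number(text: str) -> float:
    """Parse a literal such as 1.5k, 2M, or 1e3 into a float."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        # A trailing suffix scales the mantissa by the matching power of ten.
        return float(text[:-1]) * SUFFIXES[text[-1]]
    # Otherwise treat the text as a plain or scientific-notation number.
    return float(text)


print(parse_number("1.5k"))  # 1500.0
print(parse_number("2M"))    # 2000000.0
print(parse_number("1e3"))   # 1000.0
```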
Functions
Aggregation
Aggregation functions aggregate values of a timeseries in the current time range.
Warning
As an input argument, aggregation functions accept a vector of metrics (timeseries_vector). It must only include a single timeseries. Otherwise, the function returns a runtime error.
When using aggregation functions, make sure that the selector returns a single timeseries. Use combining functions if needed.
avg
Returns the average value (for a timeseries, a weighted average) or NaN for an empty timeseries.
The avg function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- avg(arg0: scalar[]): scalar
- avg(arg0: timeseries_vector): scalar
count
Returns the number of points in a metric or the number of items in a vector of numbers.
The count function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- count(arg0: scalar[]): scalar
- count(arg0: timeseries_vector): scalar
integrate
Returns an integrated sum of values or 0 for an empty timeseries.
The integrate function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- integrate(arg0: scalar[]): scalar
- integrate(arg0: timeseries_vector): scalar
iqr
Returns the interquartile range of the values.
The iqr function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- iqr(arg0: scalar[]): scalar
- iqr(arg0: timeseries_vector): scalar
last
Returns the last value other than NaN, or NaN for an empty timeseries.
The last function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- last(arg0: scalar[]): scalar
- last(arg0: timeseries_vector): scalar
max
Returns the maximum value (or NaN for an empty timeseries).
The max function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- max(arg0: scalar[]): scalar
- max(arg0: timeseries_vector): scalar
median
Returns the median of values (or NaN for an empty timeseries).
The median function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- median(arg0: scalar[]): scalar
- median(arg0: timeseries_vector): scalar
min
Returns the minimum value (or NaN for an empty timeseries).
The min function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- min(arg0: scalar[]): scalar
- min(arg0: timeseries_vector): scalar
percentile
Returns the percentile value for a set of values. The percentile level is set in the required level parameter as a number between 0 and 100.
The percentile function has the following function overloading options depending on the type of the input values parameter (an array of numbers, a metric, or a vector of metrics):
- percentile(level: scalar, values: scalar[]): scalar
- percentile(level: scalar, values: timeseries_vector): scalar
random
Returns a random item from a set of values.
The random function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- random(arg0: scalar[]): scalar
- random(arg0: timeseries_vector): scalar
std
Returns an unbiased estimate of the standard deviation for a set of values (or NaN for an empty timeseries). The calculation uses the following formula:
$$s = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{N} \left(x_i - \overline{x}\right)^2}$$
Where:
- $x_i$: A value from the vector of values (or points in a timeseries).
- $\overline{x}$: Average value.
- $N$: Number of values.
The std function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- std(arg0: scalar[]): scalar
- std(arg0: timeseries_vector): scalar
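The calculation described for std can be checked with a few lines of Python. This is a sketch of the sample standard deviation with the N - 1 divisor, not the service's code. Returning NaN for a single value is an assumption, since the description only covers the empty case.

```python
import math
import statistics


def std(values):
    """Sample standard deviation: squared deviations from the mean divided by N - 1."""
    n = len(values)
    if n < 2:
        # NaN for an empty input follows the description above;
        # NaN for a single value is an assumption of this sketch.
        return math.nan
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


sample = [2, 4, 4, 4, 5, 5, 7, 9]
# The result agrees with the standard library's sample standard deviation.
print(math.isclose(std(sample), statistics.stdev(sample)))
```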
sum
Returns a sum of all values of a set (or 0 for an empty timeseries).
The sum function has the following function overloading options depending on the type of the input arg0 parameter (an array of numbers, a metric, or a vector of metrics):
- sum(arg0: scalar[]): scalar
- sum(arg0: timeseries_vector): scalar
Combine
The combine functions aggregate a metric vector into a single metric or a metric vector.
histogram_avg
histogram_avg([bucketLabel: string], source: timeseries_vector): timeseries_vector
Calculates the average value of the distribution set by the histogram. The optional bucketLabel parameter specifies which label contains the values of histogram intervals.
histogram_cdfp
The histogram_cdfp function has the following use cases (function overloading) depending on the type of from and to input parameters (a number or an array of numbers):
- histogram_cdfp([from: number, to: number, bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_cdfp([from: number, to: number[], bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_cdfp([from: number[], to: number, bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_cdfp([from: number[], to: number[], bucketLabel: string], source: timeseries_vector): timeseries_vector
Calculates the percentage of values in the histogram between the intervals specified in the from and to optional parameters. If no parameters are specified, the first and last intervals are used, respectively. The optional bucketLabel parameter specifies which label contains the values of histogram intervals.
histogram_count
The histogram_count function has the following use cases (function overloading) depending on the type of from and to input parameters (a number or an array of numbers):
- histogram_count([from: number, to: number, bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_count([from: number, to: number[], bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_count([from: number[], to: number, bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_count([from: number[], to: number[], bucketLabel: string], source: timeseries_vector): timeseries_vector
Counts the number of values in the histogram between the intervals specified in the from and to optional parameters. If no parameters are specified, the first and last intervals are used, respectively. The optional bucketLabel parameter specifies which label contains the values of histogram intervals.
histogram_percentile
The histogram_percentile function has the following use cases (function overloading) depending on the type of the percentileLevel input parameter (a number or an array of numbers):
- histogram_percentile(percentileLevel: number, [bucketLabel: string], source: timeseries_vector): timeseries_vector
- histogram_percentile(percentileLevel: number[], [bucketLabel: string], source: timeseries_vector): timeseries_vector
Calculates the percentile values of the distribution set by the histogram. The percentile level is set in the required percentileLevel parameter as a single number or an array of numbers from 0 to 100. The optional bucketLabel parameter specifies which label contains the values of histogram intervals.
histogram_sum
histogram_sum([bucketLabel: string], source: timeseries_vector): timeseries_vector
Calculates the sum of histogram values. The optional bucketLabel parameter specifies which label contains the values of histogram intervals.
series_avg
The series_avg function has the following use cases (function overloading) depending on the type of key input parameter (a string or an array of strings):
- series_avg([key: string], source: timeseries_vector): timeseries_vector
- series_avg([key: string[]], source: timeseries_vector): timeseries_vector
Aggregates timeseries into one (or multiple ones) by applying the avg (average) aggregation function for each time point. The optional key parameter contains a string or an array of strings with a list of labels to group by.
For example, the series_avg({...}) query calculates the average value among all selected metrics at each point.
The series_avg("host", {...}) query calculates the average value among all selected metrics for each value of the host label.
The series_avg(["host", "disk"], {...}) query calculates the average value among all selected metrics for each combination of host and disk label values.
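The grouping behavior of the key parameter can be sketched in Python as follows. This is a simplified model, not the service's implementation: each timeseries is represented as a pair of a label dictionary and a timestamp-to-value dictionary, and the sample data is made up. Averaging only the series that have a point at a given timestamp is an assumption.

```python
from collections import defaultdict


def series_avg(series, key=None):
    """Average timeseries point by point, optionally grouped by one or more labels."""
    keys = [key] if isinstance(key, str) else list(key or [])
    groups = defaultdict(list)
    for labels, points in series:
        # Series with the same values of the grouping labels fall into one group.
        group_id = tuple((k, labels.get(k)) for k in keys)
        groups[group_id].append(points)

    result = []
    for group_id, members in groups.items():
        averaged = {}
        for ts in sorted(set().union(*members)):
            present = [m[ts] for m in members if ts in m]
            averaged[ts] = sum(present) / len(present)
        result.append((dict(group_id), averaged))
    return result


sample = [
    ({"host": "a", "disk": "sda"}, {0: 1.0, 60: 3.0}),
    ({"host": "a", "disk": "sdb"}, {0: 3.0, 60: 5.0}),
    ({"host": "b", "disk": "sda"}, {0: 10.0, 60: 20.0}),
]
print(series_avg(sample))                    # one series for all metrics
print(series_avg(sample, "host"))            # one series per host
print(series_avg(sample, ["host", "disk"]))  # one series per host and disk pair
```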
series_max
The series_max function has the following use cases (function overloading) depending on the type of key input parameter (a string or an array of strings):
- series_max([key: string], source: timeseries_vector): timeseries_vector
- series_max([key: string[]], source: timeseries_vector): timeseries_vector
Aggregates timeseries into one (or multiple ones) by applying the max aggregation function for each time point. The optional key parameter contains a string or an array of strings with a list of labels to group by. See examples of queries using the key parameter in series_avg.
series_min
The series_min function has the following use cases (function overloading) depending on the type of key input parameter (a string or an array of strings):
- series_min([key: string], source: timeseries_vector): timeseries_vector
- series_min([key: string[]], source: timeseries_vector): timeseries_vector
Aggregates timeseries into one (or multiple ones) by applying the min aggregation function for each time point. The optional key parameter contains a string or an array of strings with a list of labels to group by. See examples of queries using the key parameter in series_avg.
series_percentile
The series_percentile function has the following use cases (function overloading) depending on the type of rank input parameter (a number or an array of numbers):
- series_percentile(rank: number, source: timeseries_vector): timeseries_vector
- series_percentile(rank: number[], source: timeseries_vector): timeseries_vector
Aggregates timeseries into one (or multiple ones) by applying the percentile aggregation function for each time point.
series_sum
The series_sum function has the following use cases (function overloading) depending on the type of key input parameter (a string or an array of strings):
- series_sum([key: string], source: timeseries_vector): timeseries_vector
- series_sum([key: string[]], source: timeseries_vector): timeseries_vector
Aggregates timeseries into one (or multiple ones) by applying the sum aggregation function for each time point. The optional key parameter contains a string or an array of strings with a list of labels to group by. See examples of queries using the key parameter in series_avg.
Rank
The rank functions order a metric vector by the aggregation function value in the current time window and return some of the first (upper) or last (lower) timeseries from it. The limit parameter specifies how many metrics a function returns.
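As a rough model of how the rank functions work, the Python sketch below orders series by an aggregate value and keeps the first limit of them. The data layout (name and list of values) is made up for illustration, and a plain arithmetic mean stands in for the service's average.

```python
def average(values):
    """Plain arithmetic mean; the sketch assumes a non-empty list."""
    return sum(values) / len(values)


def top_avg(limit, series):
    """Keep the `limit` series with the highest average value."""
    return sorted(series, key=lambda s: average(s[1]), reverse=True)[:limit]


def bottom_avg(limit, series):
    """Keep the `limit` series with the lowest average value."""
    return sorted(series, key=lambda s: average(s[1]))[:limit]


sample = [("a", [1, 2, 3]), ("b", [10, 20]), ("c", [5, 5])]
print([name for name, _ in top_avg(2, sample)])     # ['b', 'c']
print([name for name, _ in bottom_avg(1, sample)])  # ['a']
```

The other rank functions follow the same pattern with a different aggregate (count, last, max, min, or sum) in place of the average.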
bottom_avg
bottom_avg(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the lowest average value.
bottom_count
bottom_count(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the lowest number of values.
bottom_last
bottom_last(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the lowest last value.
bottom_max
bottom_max(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the lowest maximum value.
bottom_min
bottom_min(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the lowest minimum value.
bottom_sum
bottom_sum(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the lowest sum of values.
top_avg
top_avg(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the highest average value.
top_count
top_count(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the highest number of values.
top_last
top_last(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the highest last value.
top_max
top_max(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the highest maximum value.
top_min
top_min(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the highest minimum value.
top_sum
top_sum(limit: number, source: timeseries_vector): timeseries_vector
Returns the specified number (limit) of metrics with the highest sum of values.
Transform
The transform metric functions calculate a new value in each point for each timeseries from a set of metrics.
abs
abs(source: timeseries_vector): timeseries_vector
Calculates the absolute value.
asap
asap(source: timeseries_vector): timeseries_vector
Smooths timeseries using the ASAP algorithm. Timeseries points are averaged using a moving average with a dynamic window. The window width is selected automatically to remove as much noise as possible while retaining important information.
ceil
ceil(source: timeseries_vector): timeseries_vector
Rounds the point values up to the nearest integer.
derivative
derivative(source: timeseries_vector): timeseries_vector
Calculates the derivative: the difference between the values of neighboring points divided by the interval between them.
diff
diff(source: timeseries_vector): timeseries_vector
Calculates the difference between the values of each pair of neighboring points.
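The difference between derivative and diff can be illustrated with a short Python sketch. A timeseries is modeled as a sorted list of (timestamp in seconds, value) pairs. Assigning each result to the timestamp of the later point, and measuring the interval in seconds, are assumptions of this sketch.

```python
def diff(points):
    """Difference between the values of each pair of neighboring points."""
    return [(t1, v1 - v0) for (t0, v0), (t1, v1) in zip(points, points[1:])]


def derivative(points):
    """Difference between neighboring values divided by the interval between them."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    ]


sample = [(0, 0.0), (10, 5.0), (30, 15.0)]
print(diff(sample))        # [(10, 5.0), (30, 10.0)]
print(derivative(sample))  # [(10, 0.5), (30, 0.5)]
```

With unevenly spaced points, diff reports different steps (5 and 10), while derivative shows that the rate of change is the same in both intervals.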
drop_above
drop_above(source: timeseries_vector, threshold: number): timeseries_vector
Drops points with a value above the threshold (points equal to the threshold are kept). In dropped points, the metric value is set to NaN.
drop_below
drop_below(source: timeseries_vector, threshold: number): timeseries_vector
Drops points with a value below the threshold (points equal to the threshold are kept). In dropped points, the metric value is set to NaN.
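A minimal Python sketch of the two threshold functions, working on a plain list of values. It assumes that drop_below mirrors drop_above and drops values strictly below the threshold, and that values equal to the threshold are kept in both cases.

```python
import math


def drop_above(values, threshold):
    """Replace values strictly above the threshold with NaN."""
    return [math.nan if v > threshold else v for v in values]


def drop_below(values, threshold):
    """Replace values strictly below the threshold with NaN."""
    return [math.nan if v < threshold else v for v in values]


print(drop_above([1.0, 5.0, 10.0], 5))  # [1.0, 5.0, nan]
print(drop_below([1.0, 5.0, 10.0], 5))  # [nan, 5.0, 10.0]
```

The points stay in the series with a NaN value, so the number of points does not change.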
drop_nan
drop_nan(source: timeseries_vector): timeseries_vector
Drops points with the NaN value.
exp
exp(source: timeseries_vector): timeseries_vector
Calculates the exponential function: raises e to the power of each point value, where e = 2.718281... is the base of the natural logarithm.
floor
floor(source: timeseries_vector): timeseries_vector
Rounds point values down to the nearest integer.
fract
fract(source: timeseries_vector): timeseries_vector
Returns the fractional part of point values.
heaviside
heaviside(source: timeseries_vector): timeseries_vector
Calculates the Heaviside step function of point values.
integral
integral(source: timeseries_vector): timeseries_vector
Calculates an indefinite integral using the trapezoidal rule.
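The trapezoidal rule mentioned above can be sketched in Python as a running sum of trapezoid areas. The timeseries is modeled as a sorted list of (timestamp in seconds, value) pairs. Starting the integral at 0 at the first point and measuring time in seconds are assumptions of this sketch.

```python
def integral(points):
    """Cumulative integral of a timeseries using the trapezoidal rule."""
    if not points:
        return []
    total = 0.0
    result = [(points[0][0], 0.0)]  # assumption: the integral starts at zero
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        # Area of the trapezoid between two neighboring points.
        total += (v0 + v1) / 2 * (t1 - t0)
        result.append((t1, total))
    return result


print(integral([(0, 0.0), (10, 10.0), (20, 10.0)]))
# [(0, 0.0), (10, 50.0), (20, 150.0)]
```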
log
log(source: timeseries_vector): timeseries_vector
Calculates the natural logarithm.
moving_avg
moving_avg(source: timeseries_vector, window: duration): timeseries_vector
Calculates the moving average over a window whose width is set in the window parameter.
For example, the moving_avg({...}, 1d) query returns the moving average with a window of 1 day.
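The following Python sketch shows one way to compute a moving average over a time window. It assumes a trailing window: each point is averaged with the earlier points that are less than window seconds older. Whether the service uses a trailing or a centered window is not stated here, so treat this as an illustration of the idea only.

```python
def moving_avg(points, window):
    """Average each point with the preceding points inside a trailing time window."""
    result = []
    for i, (t, _) in enumerate(points):
        # Keep the points whose timestamp falls in the interval (t - window, t].
        in_window = [v for ts, v in points[: i + 1] if ts > t - window]
        result.append((t, sum(in_window) / len(in_window)))
    return result


sample = [(0, 1.0), (60, 3.0), (120, 5.0), (180, 7.0)]
print(moving_avg(sample, 120))
# [(0, 1.0), (60, 2.0), (120, 4.0), (180, 6.0)]
```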
moving_percentile
moving_percentile(source: timeseries_vector, window: duration, rank: number): timeseries_vector
Calculates the moving percentile: the percentile of the rank level (from 0 to 100) among the points in a window whose width is set in the window parameter.
For example, the moving_percentile({...}, 1h, 99.9) query returns the moving 99.9 percentile with a window of 1 hour.
moving_sum
moving_sum(source: timeseries_vector, window: duration): timeseries_vector
Calculates the moving sum over a window whose width is set in the window parameter.
For example, the moving_sum({...}, 1d) query returns the moving sum with a window of 1 day.
non_negative_derivative
non_negative_derivative(source: timeseries_vector): timeseries_vector
Calculates the derivative: the difference between the values of neighboring points divided by the interval between them. If the derivative value is negative, it is replaced with NaN.
pow
pow(source: timeseries_vector, power: number): timeseries_vector
Calculates the power function: raises each point value to the power set in the power parameter.
ramp
ramp(source: timeseries_vector): timeseries_vector
Resets points with a negative value to 0.
replace_nan
replace_nan(source: timeseries_vector, replace: number): timeseries_vector
Replaces NaN point values with the replace value.
round
round(source: timeseries_vector): timeseries_vector
Rounds values to the nearest integer.
shift
shift(source: timeseries_vector, window: duration): timeseries_vector
Adds the window value to point timestamps. This function lets you compare current metric values with the values for a different time interval.
For example, shift({...}, 1w) returns metrics shifted a week ahead, meaning that the chosen time window will contain values that are a week old.
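In terms of data, shifting only changes timestamps, as this small Python sketch shows. The list-of-pairs representation and the WEEK constant are made up for illustration.

```python
WEEK = 7 * 24 * 3600  # one week in seconds


def shift(points, window):
    """Add the window value to the timestamp of every point; values stay unchanged."""
    return [(t + window, v) for t, v in points]


last_week = [(0, 1.0), (3600, 2.0)]
print(shift(last_week, WEEK))  # [(604800, 1.0), (608400, 2.0)]
```

After the shift, points recorded a week ago carry current timestamps, so they can be plotted on the same chart as the current values.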
sign
sign(source: timeseries_vector): timeseries_vector
Calculates the sgn(x) function. The function is 1 for positive point values, 0 for zero values, and -1 for negative values.
sqrt
sqrt(source: timeseries_vector): timeseries_vector
Calculates the square root of point values.
trunc
trunc(source: timeseries_vector): timeseries_vector
Discards the fractional part of point values.
Other
alias
alias(source: timeseries_vector, arg1: string): timeseries_vector
Renames metrics. In the arg1 argument, you can use a mustache template in the {{label}} format to substitute a label value into the new metric name.
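The template substitution can be modeled with a short Python sketch. The render_alias helper is hypothetical. Replacing a missing label with an empty string and limiting label names to word characters are assumptions of this sketch, not documented behavior.

```python
import re


def render_alias(labels, template):
    """Substitute {{label}} placeholders in the template with label values."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: labels.get(m.group(1), ""),  # assumption: missing label -> empty string
        template,
    )


labels = {"host": "web-1", "disk": "sda"}
print(render_alias(labels, "{{host}}: {{disk}}"))  # web-1: sda
```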
constant_line
constant_line(value: scalar): timeseries_vector
Returns a constant line that consists of two points equal to value: one at the beginning and one at the end of the interval.
constant_line(value: scalar, grid: duration): timeseries_vector
If you specify the optional grid parameter, the function fills the current time interval with points equal to value, spaced at the grid step.
Warning
Use the constant_line function only to show lines on charts. The use of this function in calculations will produce an incorrect result, because the function returns a timeseries of only two points: at the beginning and end of the definition interval.
drop_empty_series
drop_empty_series(source: timeseries_vector): timeseries_vector
Drops timeseries that either have no points in the specified time range or have only NaN points.
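The filtering rule can be expressed in a few lines of Python. Each timeseries is modeled as a (name, list of values) pair covering the requested time range. This layout is made up for illustration.

```python
import math


def drop_empty_series(series):
    """Keep only the series that have at least one point with a non-NaN value."""
    return [
        (name, values)
        for name, values in series
        if any(not math.isnan(v) for v in values)
    ]


sample = [
    ("no_points", []),
    ("only_nan", [math.nan, math.nan]),
    ("has_data", [math.nan, 1.5]),
]
print([name for name, _ in drop_empty_series(sample)])  # ['has_data']
```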