Data model in Monitoring
Yandex Monitoring stores data as time series.
Metrics
A metric is a time series that shows how a value changes over time, for example, the state of a Yandex Cloud service resource: the amount of used disk space, the network data transfer rate, and so on.
Metrics are identified using text labels.
Query aggregation
Some metrics, e.g., `disk.write_latency` in Yandex Compute Cloud, track very large numbers of queries, at times tens of thousands per second. In such metrics, queries are first aggregated into buckets according to their values.
Such metrics have multiple buckets, e.g., `1`, `2`, `5`, `10`, etc. Bucket `1` stores queries that took up to 1 ms to complete, bucket `2` those that took from 1 to 2 ms, bucket `5` those that took from 2 to 5 ms, etc.
When executing a query, the service measures its completion time and decides which bucket to put it into. For example, a query completed in 7 ms goes into bucket `10`, as do all other queries that took from 5 to 10 ms to complete.
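The bucket selection described above can be sketched in a few lines. This is only an illustration, not the service's actual implementation: the bounds `1`, `2`, `5`, and `10` come from the text, while the `pick_bucket` name and the catch-all infinity bucket for slower queries are assumptions.

```python
from bisect import bisect_left
import math

# Upper bounds in ms. 1, 2, 5, and 10 come from the text; the catch-all
# infinity bucket for anything slower is an assumption of this sketch.
BUCKET_BOUNDS_MS = [1, 2, 5, 10, math.inf]


def pick_bucket(duration_ms: float) -> float:
    """Return the upper bound of the first bucket that can hold the duration."""
    return BUCKET_BOUNDS_MS[bisect_left(BUCKET_BOUNDS_MS, duration_ms)]


print(pick_bucket(7))    # 10
print(pick_bucket(0.5))  # 1
```

`bisect_left` treats each bound as inclusive, so a query that took exactly 5 ms lands in bucket `5`, not in bucket `10`.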
The value of such a metric is a fractional number: the average number of queries per unit of time, e.g., 5 seconds.
Such metrics are usually analyzed using the `histogram_percentile` filter. It takes a percentage of queries as a parameter and returns the minimum time within which that share of queries was completed.
Here is an example:
There were 1,000 queries, of which:
- 500 queries were completed in 0.5 ms.
- 499 queries were completed in 1.5 ms.
- One query was completed in 1,000 ms.
The arithmetic mean per query is around 2 ms. However, this value is of little use because it is skewed by the single large peak value. It is much more useful to know that the maximum query execution time was 1,000 ms, but 99% of queries were completed within 2 ms, i.e., the 99th percentile of the query completion time was 2 ms. You can get this percentile by passing `99` to the `histogram_percentile` filter.
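The arithmetic in this example can be reproduced with a simplified model of a bucket-based percentile. The `bucket_percentile` function below is hypothetical and is not the Monitoring implementation of `histogram_percentile`: it returns the upper bound of the first bucket that covers the requested share, with no interpolation. The bucket with the `1000` bound is also assumed for the sake of the example.

```python
def bucket_percentile(buckets: dict[float, int], percent: float) -> float:
    """Smallest bucket bound that covers at least `percent`% of all queries."""
    total = sum(buckets.values())
    needed = percent * total / 100
    covered = 0
    for bound in sorted(buckets):
        covered += buckets[bound]
        if covered >= needed:
            return bound
    raise ValueError("no bucket covers the requested share")


# Per-bucket counts for the example; the 1000 ms bucket is assumed to exist.
buckets = {1: 500, 2: 499, 1000: 1}
durations = [0.5] * 500 + [1.5] * 499 + [1000.0]

print(sum(durations) / len(durations))  # arithmetic mean, about 2 ms
print(bucket_percentile(buckets, 99))   # 2
print(bucket_percentile(buckets, 100))  # 1000
```

The single 1,000 ms query dominates the mean but only shows up in the percentile once the requested share reaches 100%.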
Labels
A label is a metric characteristic in `key: "value"` format. Each metric is identified by an unordered set of labels. Use parameters that take a limited set of values as labels, for example, the HTTP status code or the type of operation performed in a database.
There are required and optional labels. Required labels:
- `cloudId`: ID of the cloud the resource resides in.
- `folderId`: ID of the folder the resource resides in.
- `service`: Yandex Cloud service the resource belongs to, e.g., `compute` or `managed-postgresql`.
Warning
When uploading custom metrics, write the `custom` value to the `service` label.
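Because a metric is identified by an unordered set of labels, two label sets listed in a different order refer to the same metric. A minimal sketch of that idea follows. The `metric_key` helper, the placeholder IDs, and the `queue_size` value are all hypothetical.

```python
def metric_key(labels: dict[str, str]) -> frozenset:
    """Order-independent, hashable identity built from a metric's labels."""
    return frozenset(labels.items())


# Placeholder IDs and the metric name are made up for illustration.
first = {
    "cloudId": "<cloud-id>",
    "folderId": "<folder-id>",
    "service": "custom",  # required value for custom metrics
    "name": "queue_size",
}
second = {
    "name": "queue_size",
    "service": "custom",
    "folderId": "<folder-id>",
    "cloudId": "<cloud-id>",
}

print(metric_key(first) == metric_key(second))  # True
```

A `frozenset` is used here because it is hashable, so the key can index a dictionary of time series.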
Constraints on labels and their values
The following constraints apply to labels and their values:
- A metric can have a maximum of 16 labels, including the required `cloudId`, `folderId`, and `service`.
- Label name must not be empty.
- Label name must not consist of the `-` character.
- Name length: 32 characters or less.
- Label name must start with an uppercase or lowercase letter and may contain letters, digits, `.`, and `_`.
- Label names and metric values must not contain any non-Latin letters.
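The name-related constraints above can be checked on the client side before uploading metrics. The sketch below is one possible reading of those rules, not an official validator. The function and constant names are hypothetical, and the rules that concern values are deliberately left out.

```python
import re

REQUIRED_LABELS = {"cloudId", "folderId", "service"}
MAX_LABELS = 16
# Starts with a Latin letter, then letters, digits, "." or "_", 32 chars max.
LABEL_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9._]{0,31}")


def label_name_problems(labels: dict[str, str]) -> list[str]:
    """Return human-readable problems found in a metric's label names."""
    problems = []
    missing = REQUIRED_LABELS - set(labels)
    if missing:
        problems.append(f"missing required labels: {sorted(missing)}")
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for name in labels:
        if not LABEL_NAME_RE.fullmatch(name):
            problems.append(f"invalid label name: {name!r}")
    return problems


# Placeholder IDs; "bad-name" violates the allowed character set.
print(label_name_problems(
    {"cloudId": "<cloud-id>", "folderId": "<folder-id>", "bad-name": "x"}
))
```

An empty list means that no name-related problem was found.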
Metric types
Yandex Monitoring offers the following metric types:
Type | Description |
---|---|
`DGAUGE` | Numeric value (decimal). It shows the metric value at a certain point in time. For example, the amount of used RAM. |
`IGAUGE` | Numeric value (integer). It shows the metric value at a certain point in time. |
`COUNTER` | Counter. It shows a metric value that increases over time. For example, the number of days a service has been running continuously. |
`RATE` | Derivative value. It shows the change in the metric value over time. For example, the number of requests per second. |
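To illustrate how a derivative value relates to an ever-increasing counter, the sketch below turns cumulative samples into per-second rates. It only demonstrates the idea of a derivative and does not describe how Monitoring stores or computes `RATE` metrics. The function name and the sample data are made up.

```python
def per_second_rates(samples: list[tuple[float, float]]) -> list[float]:
    """samples: (timestamp_seconds, counter_value) pairs in time order."""
    return [
        (v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


# A request counter read every 15 seconds (made-up values).
samples = [(0, 100), (15, 130), (30, 190)]
print(per_second_rates(samples))  # [2.0, 4.0]
```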
Queries
A query is an arbitrary expression in the query language that results in a line or a set of lines. Query text may refer to the results of higher-level queries as variables.
Monitoring allows you to create queries to select a set of metrics and display them on a chart. You can also use templates as label values.
The following templates are available in Monitoring:
Syntax | Description |
---|---|
`label="*"` | Returns all metrics that have the specified label. For example, the `host="*"` query returns all metrics with the `host` label. |
`label="glob"` | Returns all metrics whose label value matches a glob expression. `*` matches any number of characters (including none): for example, `name="folder*"` returns all metrics whose `name` label value begins with the `folder` prefix. `?` matches any single character: for example, `name="metric?"` returns all metrics whose `name` label value is `metric` followed by one character. `\|` matches any of the listed options: for example, `name="metric1\|metric2"` returns the two metrics whose `name` label value is `metric1` or `metric2`. |
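The three glob constructs from the table can be approximated locally with Python's standard `fnmatch` module. This is a rough sketch for experimenting with patterns, not the matcher Monitoring uses. The `label_value_matches` helper is hypothetical, and `fnmatch` additionally treats `[...]` as a character class, which the table does not mention.

```python
from fnmatch import fnmatchcase


def label_value_matches(pattern: str, value: str) -> bool:
    """True if the value matches any "|"-separated glob alternative."""
    return any(fnmatchcase(value, alt) for alt in pattern.split("|"))


print(label_value_matches("folder*", "folder"))           # True
print(label_value_matches("metric?", "metric1"))          # True
print(label_value_matches("metric?", "metric"))           # False
print(label_value_matches("metric1|metric2", "metric2"))  # True
```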