Yandex Monitoring metric reference
This section describes Managed Service for YDB metrics delivered to Monitoring.
The `name` label contains the metric name.
Database metrics

| Metric name | Type, units | Description, labels |
|---|---|---|
| `database_size` | `DGAUGE`, bytes | Database size |

Resource usage metrics

| Metric name | Type, units | Description, labels |
|---|---|---|
| `resources.storage.limit_bytes` | `IGAUGE`, bytes | Limit on the size of user and system data a database can store in the distributed network storage. |
| `resources.storage.topic.used_bytes` | `DGAUGE`, bytes | Topic storage size in use |
| `resources.storage.used_bytes` | `IGAUGE`, bytes | Size of user and system data stored in the distributed network storage. System data includes primary and secondary indexes. |
| `resources.stream.limit_shards` | `DGAUGE`, count | Limit on the number of shards per stream |
| `resources.stream.storage.limit_bytes` | `DGAUGE`, bytes | Stream storage size limit |
| `resources.stream.storage.reserved_bytes` | `DGAUGE`, bytes | Reserved stream storage size |
| `resources.stream.throughput.limit_bytes_per_second` | `DGAUGE`, bytes per second | Stream throughput limit |
| `resources.stream.used_shards` | `DGAUGE`, count | Number of shards used by the stream |

API metrics

| Metric name | Type, units | Description, labels |
|---|---|---|
| `api.grpc.request.bytes` | `RATE`, bytes | Size of requests received by the database over a certain period of time. Labels: |
| `api.grpc.request.count` | `RATE`, count | Total DB requests. Labels: |
| `api.grpc.request.dropped_count` | `RATE`, count | Number of requests dropped at the transport (gRPC) layer due to an error. Labels: |
| `api.grpc.request.inflight_bytes` | `IGAUGE`, bytes | Size of requests concurrently handled by the database over a certain period of time. Labels: |
| `api.grpc.request.inflight_count` | `IGAUGE`, count | Number of requests concurrently handled by the database over a certain period of time. Labels: |
| `api.grpc.response.bytes` | `RATE`, bytes | Size of responses sent by the database over a certain period of time. Labels: |
| `api.grpc.response.count` | `RATE`, count | Number of responses sent by the database over a certain period of time. Labels: |
| `api.grpc.response.dropped_count` | `RATE`, count | Number of responses dropped at the transport (gRPC) layer due to an error. Labels: |
| `api.grpc.response.issues` | `RATE`, count | Number of specific error types encountered in gRPC API responses over a specified period of time. Labels: |
| `api.request.completed_per_second` | `DGAUGE`, requests per second | API request completion rate |
| `api.request.latency_milliseconds` | `IGAUGE`, milliseconds | API latency |
| `api.request.latency_milliseconds_count` | `COUNTER`, request count | Total API requests with measured latency |
| `api.request.latency_milliseconds_sum` | `COUNTER`, milliseconds | Overall API latency |
| `api.request.size_bytes_per_second` | `DGAUGE`, bytes per second | API request processing rate |
| `api.response.size_bytes_per_second` | `DGAUGE`, bytes per second | API response processing rate |
| `api.units.consumed_by_method_per_second` | `DGAUGE`, units per second | Rate of resource consumption by a specific API method |
| `api.units.consumed_per_second` | `DGAUGE`, units per second | Overall rate of resource consumption by API methods |

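
The `api.request.latency_milliseconds_sum` and `api.request.latency_milliseconds_count` counters can be combined into a mean latency for a time window. The sketch below assumes both are cumulative counters sampled at the start and at the end of the window. The function name and the sample values are hypothetical and are not part of Monitoring.

```python
from typing import Optional


def mean_latency_ms(
    sum_start: float,
    sum_end: float,
    count_start: int,
    count_end: int,
) -> Optional[float]:
    """Mean request latency (ms) between two samples of the sum/count counters.

    Assumption: both counters only grow over time. Returns None when no
    requests were measured in the window or when a counter appears to have
    been reset (negative delta).
    """
    delta_count = count_end - count_start
    delta_sum = sum_end - sum_start
    if delta_count <= 0 or delta_sum < 0:
        return None
    return delta_sum / delta_count


# Hypothetical samples: 30 requests took 600 ms in total during the window.
print(mean_latency_ms(1000.0, 1600.0, 100, 130))
```

Working with deltas rather than raw values keeps the result tied to the chosen window instead of the whole lifetime of the counters.
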
Session metrics

| Metric name | Type, units | Description, labels |
|---|---|---|
| `table.session.active_count` | `IGAUGE`, count | Number of active client sessions |
| `table.session.closed_by_idle_count` | `RATE`, count | Number of sessions closed by the DB server within a specific time period due to exceeding the idle session timeout |

Transaction processing metrics
You can analyze transaction duration using a histogram counter. The intervals are set in milliseconds. The chart shows the number of transactions binned by duration ranges.

| Metric name | Type, units | Description, labels |
|---|---|---|
| `table.transaction.client_duration_milliseconds` | `HIST_RATE`, count | Number of client-side transactions of a certain duration. The duration is the client-side wait time between sending individual requests within a single transaction. It does not include the time it takes for the server to process those requests. Labels: |
| `table.transaction.server_duration_milliseconds` | `HIST_RATE`, count | Number of server-side transactions of a certain duration. The duration is the time it takes for the server to process requests within a transaction. It does not include the client-side wait time between sending individual requests within a single transaction. Labels: |
| `table.transaction.total_duration_milliseconds` | `HIST_RATE`, count | Number of transactions of a certain duration on both the server and the client. The duration of a transaction is the time interval from the transaction start (explicit or implicit) to either commit or rollback. It includes the server-side transaction processing time and the client-side wait time between sending different requests within a single transaction. Labels: |

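
A histogram of transaction durations can be turned into a rough percentile estimate. The sketch below assumes the histogram has been exported as a list of `(upper_bound_ms, count)` pairs with non-cumulative counts, sorted by upper bound. The bin boundaries and counts are hypothetical, and the function returns the upper bound of the bin that contains the requested quantile, so the estimate is conservative.

```python
import math
from typing import List, Optional, Tuple


def quantile_upper_bound(
    bins: List[Tuple[float, int]], q: float
) -> Optional[float]:
    """Upper bound (ms) of the histogram bin containing quantile q.

    bins: (upper_bound_ms, count) pairs sorted by upper bound; counts are
    per bin, not cumulative. The last bound may be math.inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = sum(count for _, count in bins)
    if total == 0:
        return None  # no transactions were observed
    threshold = q * total
    running = 0
    for upper_bound, count in bins:
        running += count
        if running >= threshold:
            return upper_bound
    return bins[-1][0]


# Hypothetical bins: 50 transactions up to 10 ms, 30 up to 50 ms, and so on.
sample = [(10, 50), (50, 30), (100, 15), (math.inf, 5)]
print(quantile_upper_bound(sample, 0.5))
print(quantile_upper_bound(sample, 0.9))
```

Because only bin boundaries are known, the result can never be more precise than the bin width; narrower bins give tighter estimates.
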
Query processing metrics

| Metric name | Type, units | Description, labels |
|---|---|---|
| `table.query.compilation.latency_milliseconds` | `HIST_RATE`, count | Histogram counter. The intervals are set in milliseconds. It shows the number of successful table query compilations binned by latency ranges. |
| `table.query.compilation.active_count` | `IGAUGE`, count | Number of active compilations. |
| `table.query.compilation.count` | `RATE`, count | Number of compilations completed successfully over a certain time period. |
| `table.query.compilation.error_count` | `RATE`, count | Number of compilations that failed over a certain time period. |
| `table.query.compilation.cache_hits` | `RATE`, count | Number of queries over a certain time period that required no compilation because a plan already existed in the compilation cache. |
| `table.query.compilation.cache_misses` | `RATE`, count | Number of queries over a certain time period that required a compilation. |
| `table.query.execution.latency_milliseconds` | `HIST_RATE`, count | Histogram counter. The intervals are set in milliseconds. It shows the number of queries binned by execution time ranges. |
| `table.query.request.bytes` | `RATE`, bytes | Size of YQL query strings and parameter values for queries that entered the database over a certain period of time. |
| `table.query.request.parameters_bytes` | `RATE`, bytes | Size of parameters for queries that entered the database over a certain period of time. |
| `table.query.response.bytes` | `RATE`, bytes | Size of responses sent by the database over a certain period of time. |

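
The `table.query.compilation.cache_hits` and `table.query.compilation.cache_misses` values for the same time period can be combined into a hit ratio. This is a minimal sketch: the function name and the sample numbers are hypothetical, and it assumes every query in the period is counted as either a hit or a miss.

```python
from typing import Optional


def compilation_cache_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Share of queries served from the compilation cache (0.0 to 1.0).

    Returns None when there were no queries in the period, because the
    ratio is undefined in that case.
    """
    total = hits + misses
    if total == 0:
        return None
    return hits / total


# Hypothetical period: 75 queries reused a cached plan, 25 were compiled.
print(compilation_cache_hit_ratio(75, 25))
```

A ratio that stays low under steady load may indicate that queries are not parameterized and therefore compile to distinct plans.
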
Table partition metrics (DataShards)

| Metric name | Type, units | Description, labels |
|---|---|---|
| `table.datashard.bulk_upsert.bytes` | `RATE`, bytes | Size of data added through the BulkUpsert gRPC API call to all partitions of all DB tables over a certain period of time |
| `table.datashard.bulk_upsert.rows` | `RATE`, count | Number of rows added through the BulkUpsert gRPC API call to all partitions of all DB tables over a certain period of time |
| `table.datashard.erase.bytes` | `RATE`, bytes | Size of data deleted from the database over a certain period of time |
| `table.datashard.erase.rows` | `RATE`, count | Number of rows deleted from the database over a certain period of time |
| `table.datashard.read.bytes` | `RATE`, bytes | Size of data read by all partitions of all DB tables over a certain period of time |
| `table.datashard.read.rows` | `RATE`, count | Number of rows read by all partitions of all DB tables over a certain period of time |
| `table.datashard.row_count` | `GAUGE`, count | Number of rows in DB tables |
| `table.datashard.scan.bytes` | `RATE`, bytes | Size of data read through the StreamExecuteScanQuery or StreamReadTable gRPC API calls by all partitions of all DB tables over a certain period of time |
| `table.datashard.scan.rows` | `RATE`, count | Number of rows read through the StreamExecuteScanQuery or StreamReadTable gRPC API calls by all partitions of all DB tables over a certain period of time |
| `table.datashard.size_bytes` | `GAUGE`, bytes | Size of data in DB tables |
| `table.datashard.used_core_percents` | `HIST_GAUGE`, % | Histogram counter. The intervals are set as a percentage. It shows the number of table partitions binned by computing resource usage percentage. |
| `table.datashard.write.rows` | `RATE`, count | Number of rows written by all partitions of all DB tables over a certain period of time |
| `table.datashard.write.bytes` | `RATE`, bytes | Size of data written by all partitions of all DB tables over a certain period of time |

Table partition metrics (ColumnShards)

| Metric name | Type, units | Description, labels |
|---|---|---|
| `table.columnshard.bulk_upsert.bytes` | `RATE`, bytes per second | Rate of adding data to all partitions of all DB tables through the BulkUpsert gRPC API call |
| `table.columnshard.bulk_upsert.rows` | `RATE`, rows per second | Rate of adding rows to all partitions of all DB tables through the BulkUpsert gRPC API call |
| `table.columnshard.scan.bytes` | `RATE`, bytes per second | Rate of data reads by all partitions of all DB tables through the StreamExecuteScanQuery or StreamReadTable gRPC API call |
| `table.columnshard.scan.rows` | `RATE`, rows per second | Rate of row reads by all partitions of all DB tables through the StreamExecuteScanQuery or StreamReadTable gRPC API call |
| `table.columnshard.write.bytes` | `RATE`, bytes per second | Rate of data writes to all partitions of all DB tables |
| `table.columnshard.write.rows` | `RATE`, rows per second | Rate of row writes to all partitions of all DB tables |

Resource usage metrics (for dedicated mode only)

| Metric name | Type, units | Description, labels |
|---|---|---|
| `resources.cpu.limit_core_percents` | `IGAUGE`, % | Percentage of CPU available to a database. For example, for a database that has three nodes with four cores in `pool=user` per node, the value of this metric will be 1200. Labels: |
| `resources.cpu.used_core_percents` | `RATE`, % | CPU usage. A value of 100 means that one core is fully used. The value may be greater than 100 for multi-core configurations. Labels: |
| `resources.memory.limit_bytes` | `IGAUGE`, bytes | RAM available to database nodes |
| `resources.memory.used_bytes` | `IGAUGE`, bytes | RAM used by database nodes |

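
The arithmetic behind the CPU metrics can be sketched as follows. The first helper reproduces the example above (three nodes with four cores each give 1200); treating the limit as nodes times cores times 100 in general is an assumption drawn from that single example. The second helper divides `resources.cpu.used_core_percents` by `resources.cpu.limit_core_percents` to get overall utilization. Both function names are hypothetical.

```python
def cpu_limit_core_percents(nodes: int, cores_per_node: int) -> int:
    """Expected CPU limit in core percents: each core contributes 100."""
    return nodes * cores_per_node * 100


def cpu_utilization(used_core_percents: float, limit_core_percents: float) -> float:
    """Fraction of the available CPU in use (0.0 to 1.0)."""
    if limit_core_percents <= 0:
        raise ValueError("limit_core_percents must be positive")
    return used_core_percents / limit_core_percents


# Three nodes with four cores each, as in the example above.
limit = cpu_limit_core_percents(nodes=3, cores_per_node=4)
print(limit)

# Hypothetical usage: three fully loaded cores out of twelve.
print(cpu_utilization(300, limit))
```

Normalizing by the limit makes the value comparable across databases with different node counts, which the raw core percents are not.
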
Query processing metrics (for dedicated mode only)

| Metric name | Type, units | Description, labels |
|---|---|---|
| `table.query.compilation.cache_evictions` | `RATE`, count | Number of queries evicted from the compilation cache. |
| `table.query.compilation.cache_size_bytes` | `IGAUGE`, bytes | Compilation cache size. |
| `table.query.compilation.cached_query_count` | `IGAUGE`, count | Number of queries in the compilation cache. |