Yandex Monitoring metric reference
Updated at November 21, 2024
This section describes Data Transfer metrics delivered to Monitoring.
The metric name is written to the name label.
All Data Transfer metrics have common labels:
Label | Value |
---|---|
service | Service ID: data-transfer |
job_index | Worker index to distinguish workers used for parallel data copying. |
src_id | Source ID |
target_type | Target type, e.g., mongo |
resource_id | Data Transfer transfer ID |
dst_id | Target ID |
source_type | Source type, e.g., mongo |
operation_type (except for the replication.* metrics) | Operation type, e.g., Activate |
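Because the metric name is itself a label, a metric can be selected the same way as any other label. The snippet below is a minimal client-side sketch of that idea, not a Monitoring API call: the POINTS list, its label values (transfer-a, transfer-b), and the helper names select and total_by_name are all hypothetical and exist only for illustration.

```python
# Hypothetical data points: every label value here is made up for illustration.
# Each point carries its metric name in the "name" label, next to the common labels.
POINTS = [
    {"name": "publisher.data.bytes", "service": "data-transfer",
     "resource_id": "transfer-a", "job_index": "0", "value": 1024},
    {"name": "publisher.data.bytes", "service": "data-transfer",
     "resource_id": "transfer-a", "job_index": "1", "value": 2048},
    {"name": "replication.running", "service": "data-transfer",
     "resource_id": "transfer-b", "job_index": "0", "value": 1},
]


def select(points, **labels):
    """Return the points whose labels match every given key/value pair."""
    return [p for p in points
            if all(p.get(key) == value for key, value in labels.items())]


def total_by_name(points, name, **labels):
    """Sum the values of one metric across all matching workers (job_index)."""
    return sum(p["value"] for p in select(points, name=name, **labels))


if __name__ == "__main__":
    # Bytes read by all workers of the hypothetical transfer "transfer-a".
    print(total_by_name(POINTS, "publisher.data.bytes", resource_id="transfer-a"))
```

Summing over job_index is one possible way to get a per-transfer total when data is copied in parallel; whether that aggregation is appropriate depends on the metric.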
CPU metrics
Processor core workload.
Common labels for all CPU metrics:
Label | Value |
---|---|
component | System component, e.g., psutil |
Name | Type, units | Description |
---|---|---|
cpu.counts | COUNTER, number | Number of CPUs allocated for data transfer in the worker |
proc.cpu | DGAUGE, % | Processor core workload |
proc.descriptors | DGAUGE, number | Number of open file descriptors |
proc.ram | DGAUGE, bytes | RAM usage |
RAM metrics
Common labels for all RAM metrics:
Label | Value |
---|---|
component | System component, e.g., psutil |
Name | Type, units | Description |
---|---|---|
mem.available | COUNTER, bytes | Amount of available RAM |
mem.percentage | DGAUGE, % | Percentage of RAM in use |
mem.used | COUNTER, bytes | Amount of RAM in use |
Service metrics
Name | Type, units | Description, labels |
---|---|---|
fallbacks.source.deepness | DGAUGE, number | Current number of elements in the fallback queue on the source side |
fallbacks.source.errors | COUNTER, number | Number of errors on the source side during the fallback process |
fallbacks.source.items | COUNTER, number | Total number of elements added to the fallback queue on the source side |
fallbacks.target.deepness | DGAUGE, number | Current number of elements in the fallback queue on the target side |
fallbacks.target.errors | COUNTER, number | Number of errors on the target side during the fallback process |
fallbacks.target.items | COUNTER, number | Total number of elements added to the fallback queue on the target side |
logger.bytes_written | COUNTER, bytes | Total volume of recorded logs |
logger.field_truncated_size_hist | IGAUGE, bytes | Histogram of truncated fields in logs. bin label: Histogram buckets. |
logger.leaked_count | COUNTER, number | Number of leaks in logs |
logger.leaked_size | COUNTER, bytes | Volume of leaks in logs |
logger.leaked_size_hist | IGAUGE, bytes | Histogram of leaks in logs. bin label: Histogram buckets. |
logger.success_size_hist | IGAUGE, bytes | Histogram of recorded logs. bin label: Histogram buckets. |
middleware.error_tracker.failures | COUNTER, number | Number of errors in the error tracker. component label: System component, e.g., middleware_filter. |
middleware.error_tracker.success | COUNTER, number | Number of successful operations in the error tracker. component label: System component, e.g., middleware_filter. |
middleware.filter.dropped | COUNTER, number | Number of discarded elements. component label: System component, e.g., middleware_filter. |
publisher.consumer.active | DGAUGE, number | Number of active consumers (data processing threads) |
publisher.consumer.compress_ratio | DGAUGE, % | Data compression ratio during data transfer |
publisher.consumer.ddl_error | COUNTER, number | Number of errors related to DDL (Data Definition Language) operations |
publisher.consumer.error | COUNTER, number | Total number of errors encountered by consumers |
publisher.consumer.extracted_bytes | DGAUGE, bytes | Amount of data extracted from the source |
publisher.consumer.fatal | COUNTER, number | Number of fatal errors requiring agent involvement |
publisher.consumer.log_usage_bytes | DGAUGE, bytes | Size of the buffer or write-ahead log (when supported) in the source |
publisher.consumer.read_bytes | DGAUGE, bytes | Amount of data read |
publisher.data.bytes | COUNTER, bytes | Amount of data read from the source |
publisher.data.changeitems | COUNTER, number | Number of source events generated for a transfer (apart from the data to transfer, these events may include housekeeping operations) |
publisher.data.parsed_rows | COUNTER, number | Number of rows successfully processed after they were parsed |
publisher.data.transactions | COUNTER, number | Number of data transactions processed during transfer |
publisher.data.unparsed_rows | COUNTER, number | Number of data rows that could not be parsed |
publisher.time.delay_ms | DGAUGE, ms | Delay during data transfer |
publisher.time.parse_ms | DGAUGE, ms | Time spent on data parsing |
publisher.time.push_ms | DGAUGE, ms | Time spent sending data to the target |
publisher.time.transform_ms | DGAUGE, ms | Time spent on data transformation |
replication.running | DGAUGE, 0/1 | Current replication state (0 or 1) |
replication.start.unix | DGAUGE, number | Replication start time in Unix epoch format |
runtime.alloc | COUNTER, bytes | Total amount of memory allocated but not yet released. component label: System component, e.g., psutil. |
runtime.heapIdle | COUNTER, bytes | Amount of memory allocated for dynamic memory but not currently in use. component label: System component, e.g., psutil. |
runtime.heapInuse | COUNTER, bytes | Amount of memory actively used as dynamic memory. component label: System component, e.g., psutil. |
runtime.numGC | COUNTER, number | Number of garbage collection (GC) cycles performed since the start of measurement. component label: System component, e.g., psutil. |
runtime.sys | COUNTER, bytes | Total amount of system memory in use. component label: System component, e.g., psutil. |
runtime.totalAlloc | COUNTER, bytes | Total amount of memory allocated throughout the run time. component label: System component, e.g., psutil. |
sinker.pusher.data.changeitems | COUNTER, number | Number of events written to the target (apart from the data to transfer, these events may include housekeeping operations) |
sinker.pusher.data.row_events_pushed | COUNTER, number | Number of rows sent to the target |
sinker.pusher.time.batch_push_distribution_sec | IGAUGE, seconds | Full time it takes to write a batch to the target, including data preprocessing. bin label: Histogram buckets. |
sinker.pusher.time.row_lag_sec | IGAUGE, seconds | Time lag between a record appearing at the source and at the target. bin label: Histogram buckets. |
sinker.pusher.time.row_max_lag_sec | DGAUGE, seconds | Maximum data lag |
sinker.pusher.time.row_max_read_lag_sec | DGAUGE, seconds | Maximum lag between data appearing at the source and being read |
sinker.table.deleted_rows | COUNTER, number | Number of deleted table rows. table label: DB table or collection. |
sinker.table.error | COUNTER, number | Number of errors that occurred while processing the table. table label: DB table or collection. |
sinker.table.rows | COUNTER, number | Number of rows written to the target, reported for the 50 tables with the most rows. table label: DB table or collection. |
sinker.table.updated_rows | COUNTER, number | Number of updated table rows. table label: DB table or collection. |
sinker.table.upserted_rows | COUNTER, number | Number of table rows inserted or updated (upsert). table label: DB table or collection. |
sinker.time.bulkPrepare | DGAUGE, seconds | Time to prepare a data batch for writing |
sinker.time.bulkWrite | DGAUGE, seconds | Time to write a data batch |
sinker.time.push | DGAUGE, seconds | Total data write operation time |
sinker.transactions.inflight | COUNTER, number | Number of active transactions |
sinker.transactions.total | COUNTER, number | Total number of completed transactions |
storage.diff_perc | DGAUGE, % | Percentage difference between the number of source and target records. table label: DB table or collection. |
storage.source_rows | DGAUGE, number | Number of data source rows. table label: DB table or collection. |
storage.target_rows | DGAUGE, number | Number of data target rows. table label: DB table or collection. |
task.snapshot.remainder.table | DGAUGE, number | Number of rows awaiting transfer. table label: DB table or collection. |
task.snapshot.reminder.total | DGAUGE, number | Total number of remaining rows to transfer. table label: DB table or collection. |
task.status | DGAUGE, 0/1 | Status of the operation in progress (0 or 1) |
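Many of the metrics above are of the COUNTER type, which accumulates over time, so a dashboard or alert usually needs the rate of change rather than the raw value. The following is a minimal sketch of how such a rate could be derived from exported samples of, for example, publisher.data.bytes. The function name counter_rate and the sample format are hypothetical, and treating a decrease as a counter reset (for instance after a worker restart) is an assumption that this reference does not confirm.

```python
def counter_rate(samples):
    """Convert cumulative counter samples into per-second rates.

    samples: list of (unix_time_seconds, counter_value) pairs sorted by time.
    Returns a list of (unix_time_seconds, rate_per_second) pairs, one for each
    consecutive pair of samples.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            # Skip duplicate or out-of-order timestamps.
            continue
        delta = v1 - v0
        if delta < 0:
            # Assumption: a decrease means the counter restarted from zero,
            # so the new value is the growth since the reset.
            delta = v1
        rates.append((t1, delta / dt))
    return rates


if __name__ == "__main__":
    # Hypothetical samples of publisher.data.bytes taken every 10 seconds.
    samples = [(0, 0), (10, 1000), (20, 3000), (30, 500)]
    for ts, rate in counter_rate(samples):
        print(ts, rate)
```

DGAUGE metrics such as publisher.time.delay_ms are point-in-time values and can be read directly without this conversion.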