Yandex Monium

An observability platform for quickly getting answers about the status of your systems at any time and in any environment — in Yandex Cloud, a third-party cloud provider, or local infrastructure.

A single platform for observability

An all-in-one, intuitive, and flexible interface with the tools you need to monitor your systems. Connect easily and access all your telemetry in one window: move from a metric to a trace, from a trace to a log, and back again.

Scalability

Used by 16,000 Yandex employees, the Monium platform has proven its effectiveness under high loads, processing 2.6 billion samples, 60 GB of logs, and 25 million spans per second.

Reliability

Monium is deployed on Yandex Cloud infrastructure, with multiple availability zones ensuring a fault-tolerant configuration for the platform.

Yandex Monium: A secure observability platform

Multi-level protection

Modern methods of authentication, authorization, and audit logging: TLS encryption of data delivery channels, access management and control through the Yandex IAM service, and support for Workload Identity Federation for exchanging tokens from any system.

Compliance with standards

Yandex Cloud services take into account the requirements of international and national standards and regulations, including ISO, GDPR, PCI DSS, and GOST R 57580. The platform complies with all the requirements of Russian Federal Law No. 152-FZ and provides Level 1 personal data protection (UZ-1).

Platform architecture

Advantages of the Monium observability platform

Early problem detection (MTTD) and reduced incident resolution time (MTTR)

Monium provides a single window for working with all types of telemetry: logs, traces, and metrics. This makes it possible to find the root causes of incidents significantly faster.

Capabilities:

  • Real-time access to data: events and metrics become available for querying within seconds of reaching the system

  • Maximum volume of recorded data and an unlimited retention period for metrics

  • Alerts via Telegram, SMS, phone calls, in-app push notifications, email, and any other system through integration with Cloud Functions

  • Escalations (notification chains): if an alert is not acknowledged, Monium automatically notifies the next person on the list, with repeats and phone calls, taking on-call schedules, time zones, and business hours into account.
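
The escalation behavior described above can be illustrated with a small sketch. This is not Monium's implementation or API: the Responder class, the escalate function, the fixed UTC offsets, and the acknowledged callback are all hypothetical names chosen for illustration, and real delays between notifications are left out.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone


@dataclass
class Responder:
    """Hypothetical on-call contact; not a Monium data type."""
    name: str
    utc_offset_hours: int  # time zone modeled as a fixed UTC offset (assumption)
    work_start: time       # start of business hours, local time
    work_end: time         # end of business hours, local time

    def is_available(self, now_utc: datetime) -> bool:
        # Shift the UTC timestamp into the responder's local time.
        local = now_utc + timedelta(hours=self.utc_offset_hours)
        return self.work_start <= local.time() < self.work_end


def escalate(chain, now_utc, acknowledged, repeats=2):
    """Walk the chain in order and return the (name, attempt) notifications sent.

    `acknowledged(name, attempt)` is a caller-supplied callback that reports
    whether the responder reacted to that attempt.
    """
    sent = []
    for responder in chain:
        if not responder.is_available(now_utc):
            continue  # outside business hours: move on to the next person
        for attempt in range(1, repeats + 1):
            sent.append((responder.name, attempt))
            if acknowledged(responder.name, attempt):
                return sent  # someone responded, stop escalating
    return sent


chain = [
    Responder("alice", 3, time(9), time(18)),
    Responder("bob", 0, time(9), time(18)),
]
now = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
# Alice never answers; Bob acknowledges the first notification.
print(escalate(chain, now, lambda name, attempt: name == "bob"))
```

In this run the first responder is notified twice without a reply before the alert moves to the second one. A production chain would also wait between attempts and choose a channel (message, then call) per attempt.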

Convenient monitoring of system availability and performance

Monium provides a single interface with the tools you need to assess and monitor your systems and find bottlenecks.

Capabilities:

  • Support for the OpenTelemetry and Prometheus® formats, and for SEL (the Yandex Monium query language)

  • Query languages: PromQL and SEL (the Yandex Monium query language)

  • Infrastructure monitoring: Kubernetes (Monium is compatible with OpenTelemetry), virtual machines, containers, networks, and disks

  • Visualization tools: Grafana, DataLens, and custom dashboards

  • Service map: an at-a-glance view of your systems' architecture, connections, and states

  • A preconfigured dashboard for analyzing reliability and performance against service level objectives (SLOs), with automatic error budget calculation.
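
For readers new to error budgets, the usual arithmetic can be sketched as follows. This is a minimal illustration that assumes a request-based SLO; the error_budget function and its field names are hypothetical and do not describe how Monium performs the calculation.

```python
def error_budget(slo_target: float, total_events: int, failed_events: int) -> dict:
    """Summarize error budget use for one SLO window (request-based SLO assumed)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events < 0 or not 0 <= failed_events <= total_events:
        raise ValueError("expected 0 <= failed_events <= total_events")

    # The budget is the share of events allowed to fail: 1 - SLO target.
    allowed_failures = (1.0 - slo_target) * total_events
    if allowed_failures == 0:
        # No traffic in the window, so nothing has been spent.
        consumed = 0.0
    else:
        consumed = failed_events / allowed_failures

    return {
        "allowed_failures": allowed_failures,
        "consumed_fraction": consumed,         # 1.0 means the budget is fully spent
        "remaining_fraction": 1.0 - consumed,  # negative when the SLO is breached
    }


# Hypothetical numbers: 99.9% SLO, 1,000,000 requests, 250 failures.
report = error_budget(0.999, 1_000_000, 250)
print(f"budget used: {report['consumed_fraction']:.0%}")
```

With these numbers roughly 1,000 failures are allowed in the window, so 250 failures use about a quarter of the budget.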

Optimized engineering team resources

Free up your team for focused development work by reducing the time spent investigating incidents and finding their causes.

Capabilities:

  • An intuitive and flexible UI: quick search, filtering options, custom labels

  • Tools for working with IaC (coming soon)

  • Automation of routine work and greater process maturity: integrations with CI/CD, corporate messengers, Jira, and other DevOps ecosystem tools.

Reduced TCO for observability solutions

Consolidate your tools and licenses. Set up unified monitoring of your systems in any environment: Yandex Cloud, a third-party cloud provider, or local infrastructure.

Capabilities:

  • No infrastructure costs (servers, data storage, cluster deployment and administration)

  • Reduced operating costs (OpEx)

  • Data volume optimization using technologies and algorithms that can significantly reduce data storage costs.

If you don’t know where to start, contact a Yandex Cloud expert

We will help you choose the right project architecture, estimate implementation costs, and explain how to deploy the solutions you need. Our expert advice is free of charge.

Quick start

Support

Our specialists will be glad to answer any questions about using the platform. In addition, you can ask questions in the Yandex Monium community, where other users share their expertise.

Free education

We offer overview courses on working with our services, as well as programs that help deepen your knowledge and skills across the tech stack. A Yandex Monitoring course is currently available; it explains how to optimize system performance and detect problems with web servers and databases in a timely manner.

A broad partner network

Our partners help implement turnkey solutions of varying complexity.

FAQ

Root Cause Analysis (RCA) is a method of identifying the underlying cause of an incident so that it can be prevented from recurring, rather than only eliminating its symptoms.