© 2026 Direct Cursus Technology L.L.C.

Diagnosing Managed Service for PostgreSQL cluster performance issues

Written by
Yandex Cloud
Updated at March 25, 2026
  • Check the cluster’s health using Yandex Monitoring
  • Perform cluster performance diagnostics
  • Review the cluster logs

If you experience Managed Service for PostgreSQL cluster performance issues, run diagnostics. This will help you identify the root cause and prevent future problems.

To identify issues and their causes, you can use several cluster health monitoring tools. Use these tools together, as issues are frequently caused by multiple anomalies occurring at once.

Suppose, for example, your Managed Service for PostgreSQL cluster experiences high CPU usage. To diagnose why this is happening:

  1. Check the cluster’s health using Yandex Monitoring.
  2. Perform cluster performance diagnostics.
  3. Review the cluster logs.

This will allow you to optimize the cluster performance and reduce the risk of the issue recurring.

Check the cluster’s health using Yandex Monitoring

  1. In the management console, select a folder.

  2. Navigate to the Managed Service for PostgreSQL service.

  3. Click the name of your cluster and select the Monitoring tab.

  4. To get started with Yandex Monitoring metrics, dashboards, or alerts, click Open in Monitoring in the top panel.

  5. On the Average CPU usage chart, identify the period where the line shows a steady increase followed by a plateau.

    This chart shows average CPU core utilization. The plateau indicates higher than normal CPU usage. To find the plateau, adjust the time window of the chart.

  6. Review other charts for corresponding spikes or plateaus during the same time period of high CPU usage.

    Check the following charts:

    • Session CPU usage cores: Shows CPU utilization broken down by session type. This allows you to identify the user session and database that caused high CPU usage.

      High CPU utilization is often caused by users executing long-running queries, such as analytical queries on unindexed tables. You can track such queries on the Age of oldest transaction/statement chart.

    • Age of oldest transaction/statement: Shows the duration of transactions and queries. A spike in this chart indicates that some queries were running longer than usual.

      If a spike in the Age of oldest transaction/statement chart coincides with spikes or plateaus in the Session CPU usage cores and Average CPU usage charts, this indicates that the high CPU consumption was caused by long-running queries.

    Optionally, review other charts:

      • Average transaction/statement time: Shows the average execution time for transactions and queries. A rising trend in this chart may indicate that some queries are now running longer.

      • Log errors: Shows the frequency of errors generated across the cluster. If you see a spike in this chart, check the cluster logs.

      • OOM Count: Shows the number of Out-Of-Memory Killer events. The OOM Killer mechanism terminates memory-exhausting processes to prevent the OS from crashing. If the OOM Count chart shows the Out-Of-Memory Killer being invoked, check the cluster’s memory usage on the Maximum memory usage chart and take steps to free up memory.

      • PostgreSQL Alive [boolean]: Shows whether PostgreSQL is available on each host in the cluster. If you cannot connect to a cluster host, check its availability using this chart.

      • Sessions per wait event: Shows the number of waiting sessions, broken down by wait event type. Common wait event types include row and table locks, as well as waits for CPU or memory to become available. A spike in this chart indicates that your cluster is running low on critical resources.

      • Sessions read bytes and Sessions write bytes: Show the volume of data read from and written to the database, broken down by session. If I/O volume reaches its limit, cluster performance may degrade.

      • Transactions/statements per second: Shows the number of queries and transactions per second. This chart helps diagnose high cluster load by showing whether it is caused by high query volume. If your cluster is running out of resources, you will see a dip in this chart.

      For a full list of charts, see PostgreSQL cluster and host state monitoring.
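The "steady increase followed by a plateau" pattern described in the steps above can also be detected programmatically. Below is a minimal sketch, assuming you have exported Average CPU usage datapoints (for example, via the Yandex Monitoring API) as (timestamp, value) pairs; the 0.8 threshold and the 5-sample minimum window are illustrative assumptions, not Monitoring defaults:

```python
def find_plateau(samples, threshold=0.8, min_len=5):
    """Return (start_ts, end_ts) of the first run of at least min_len
    consecutive samples at or above threshold, or None if there is none.

    samples: iterable of (timestamp, cpu_fraction) pairs, in time order.
    """
    run = []  # timestamps of the current above-threshold run
    for ts, value in samples:
        if value >= threshold:
            run.append(ts)
        else:
            if len(run) >= min_len:
                return (run[0], run[-1])
            run = []  # run broken before reaching min_len
    # handle a plateau that lasts until the end of the series
    return (run[0], run[-1]) if len(run) >= min_len else None
```

Once a plateau is found, its start and end timestamps give you the time window to use in the performance diagnostics and log review steps that follow.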

Perform cluster performance diagnostics

Once you have identified the time period of high CPU usage, locate the specific queries that caused it. Use the Managed Service for PostgreSQL cluster’s performance diagnostics tool and complete the following steps:

  1. Make sure the Statistics sampling option is enabled.

  2. On the cluster page, select Performance diagnostics in the left-hand panel.

  3. On the tab that opens, locate the segment where the chart shows a steady increase followed by a plateau. Its timeframe should align with the segments you identified in the Yandex Monitoring charts.

    To find the plateau, adjust the time window of the chart.

  4. On the Sessions tab, select WAIT_EVENT in the Slice field.

  5. To identify the wait events that caused the high CPU usage, hover over the plateau. A pop-up window will appear, showing all relevant wait events. The wait events with the highest values are the primary contributors to the performance issue.

  6. In the list of queries below the chart, identify the queries associated with a high count of wait events.

    Suppose the chart shows a high number of CPU wait events. In this case, the offending queries will also show a high count of these events, as evident from the mini-charts next to each query in the list.


You have now pinpointed the queries responsible for the high CPU usage.
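The per-query tally described above can be sketched in a few lines. This is a hypothetical illustration, assuming each sampled session row pairs a query text with the wait event it was in at sampling time (the row shape and event names below are assumptions, not the exact output of the Statistics sampling option):

```python
from collections import Counter

def top_queries_by_wait_event(session_samples, wait_event):
    """Count, per query, how many session samples were waiting on
    wait_event, and return queries sorted by that count, descending."""
    counts = Counter(
        query for query, event in session_samples if event == wait_event
    )
    return counts.most_common()

# Hypothetical sampled sessions: (query_text, wait_event) pairs.
samples = [
    ("SELECT * FROM orders", "CPU"),
    ("SELECT * FROM orders", "CPU"),
    ("UPDATE users SET active = false", "Lock:tuple"),
    ("SELECT * FROM orders", "CPU"),
    ("SELECT count(*) FROM logs", "CPU"),
]
```

Calling `top_queries_by_wait_event(samples, "CPU")` puts the unindexed full-table query first, mirroring how the mini-charts in the query list rank offenders.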

Review the cluster logs

Suppose you see spikes in the Log errors chart in Yandex Monitoring. To find out what caused them, examine the cluster logs. A good practice is to review the cluster logs daily, not merely when problems arise. Doing this will keep you informed of the cluster’s current health.

To view the logs:

  1. On the cluster page, select Logs in the left-hand panel.

  2. Set the time range to match the period of elevated CPU consumption.

  3. In the Severity field, check ERROR, PANIC, and FATAL.

  4. Examine the list of errors that appears. These errors reveal system events that coincided with the CPU usage spike.

    The logs may contain multiple instances of the same error. For example, if a user query runs too long, you might see repeated log entries of the canceling statement due to user request error. This indicates that the database was unable to process the query within the expected timeframe, and therefore it was canceled.

  5. Set the log level to show LOG and WARNING messages.

  6. Review the event list for the period you selected.

    You will see all queries run against the database. With the log_min_duration_statement DBMS setting enabled, the logs will also include query execution time. This allows you to identify long-running queries.

    Warning

    If log_min_duration_statement is set to zero, the system logs all queries, regardless of their execution time. This can lead to the rapid exhaustion of the available storage space.

    The LOG and WARNING logs also include other information besides executed queries, such as records like: temporary file: path "<file_path>" size <file_size>. If this log entry appears multiple times, it means the queries are not optimized: their sorting and aggregation operations consume too much of the work_mem memory allocated for database operations, leaving too little memory for the cluster buffer.
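The manual review in the steps above can be partly automated by counting how often the notable messages recur in an exported log. A minimal sketch, assuming the log has been downloaded as plain text lines; the sample entries in the test are illustrative, not real cluster output:

```python
import re
from collections import Counter

# Messages discussed above: canceled statements and temporary-file spills.
CANCELED_MSG = "canceling statement due to user request"
TEMPFILE_RE = re.compile(r'temporary file: path "[^"]+" size \d+')

def summarize_log(lines):
    """Count recurring canceled-statement and temporary-file log entries."""
    counts = Counter()
    for line in lines:
        if CANCELED_MSG in line:
            counts["canceled_statements"] += 1
        elif TEMPFILE_RE.search(line):
            counts["temporary_files"] += 1
    return dict(counts)
```

A high canceled-statement count points to queries exceeding their expected runtime; a high temporary-file count points to unoptimized queries spilling past work_mem.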

This allows you to identify the timing and cause of the CPU usage spike, pinpoint the responsible users and queries, and correlate the relevant events. To prevent future recurrences of this problem in the Managed Service for PostgreSQL cluster, optimize its operation.
