Managed Service for PostgreSQL performance analysis and tuning
- Getting started
- Diagnosing inefficient queries
- Troubleshooting inefficient queries
- Diagnosing resource shortages
- Troubleshooting resource shortages
- Diagnosing database locks
- Troubleshooting locking issues
- Diagnosing connection errors
- Troubleshooting connection issues
- Diagnosing storage space shortages
- Troubleshooting storage space shortages
A decline in Managed Service for PostgreSQL cluster performance most often stems from one of the following causes:
- Inefficient PostgreSQL queries.
- High CPU, disk I/O, and network usage.
- Locks.
- Exhaustion of available connections.
- Insufficient storage space.
The following are tips for diagnosing and resolving these issues.
To detect potential issues in a cluster, use comprehensive health analysis tools.
Getting started
- Select target databases for diagnostic analysis.
- Enable the Access from the management console cluster setting.
- Enable the auto_explain module for extended query plan logging.
- To capture more queries in the performance log, lower the log_min_duration_statement threshold in the DBMS settings.

  Warning

  With log_min_duration_statement set to 0, the system will log all queries, regardless of their duration. This can lead to the rapid exhaustion of free storage space.
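In Managed Service for PostgreSQL, these parameters are normally changed through the cluster's DBMS settings. As a rough session-level sketch of what the settings do (assuming your role has sufficient privileges; log_min_duration_statement in particular is typically restricted):

```sql
-- Session-level sketch for illustration; in the managed service these are
-- usually set cluster-wide via the DBMS settings rather than per session.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';  -- log plans of queries slower than 500 ms
SET auto_explain.log_analyze = on;            -- include actual run times in logged plans
SET log_min_duration_statement = '1s';        -- log statements slower than 1 s (0 logs everything)
```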
Diagnosing inefficient queries
You can identify problematic queries using the following two methods:
- Query the PostgreSQL system view pg_stat_activity:

  SELECT NOW() - query_start AS duration, query, state
  FROM pg_stat_activity
  WHERE state != 'idle'
  ORDER BY 1 DESC;

  You will see a list of queries running on the server. Look for queries with a high duration value.

  To learn more about this output, see this PostgreSQL article.

- Retrieve and analyze query statistics using the Managed Service for PostgreSQL built-in diagnostics tool.
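If the pg_stat_statements extension is enabled in your cluster (an assumption; it is not on by default), you can also rank queries by their cumulative cost rather than catching them live:

```sql
-- Assumes the pg_stat_statements extension is enabled.
SELECT query,
       calls,
       total_exec_time,   -- named total_time in PostgreSQL 12 and earlier
       mean_exec_time,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```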
Troubleshooting inefficient queries
There are several ways to optimize problematic queries:
- Analyze query plans using EXPLAIN.

  Look for queries that do not use indexes, indicated by a high row count in Seq Scan nodes. These queries increase both I/O load, due to more disk reads, and CPU usage, as processing a large number of rows requires more processor time. Create or update the necessary indexes.

  Tip

  To see query plans, navigate to the SQL tab on the cluster management page. For more information, see SQL queries in Yandex WebSQL.

- Automatically log query plans using the auto_explain module.

- Update statistics using ANALYZE.

  The DBMS builds query plans using the collected statistics. Frequent data updates cause these statistics to become outdated rapidly. Use ANALYZE to force the DBMS to reanalyze a table or the entire database.

  ANALYZE <table_or_database_name>;

  If needed, increase default_statistics_target in the DBMS settings, then run ANALYZE again. To learn more about default_statistics_target, see PostgreSQL settings.

- Create extended statistics objects.

  PostgreSQL does not collect statistics on data correlation between columns in the same table. This is due to the potentially vast number of column combinations. If some columns are correlated, create extended statistics objects. This allows the planner to optimize queries using information about data correlation between columns.

- Analyze query plan statistics in PostgreSQL logs.

  The PostgreSQL auto_explain module logs query plan statistics, which you can later review using a text search through log entries. For more information, see this PostgreSQL guide.
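The steps above can be sketched end to end. The table and column names here (orders, customer_id, city, zip) are hypothetical, for illustration only:

```sql
-- 1. Inspect the plan; look for Seq Scan nodes with high row counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- 2. Create a covering index without blocking writes.
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);

-- 3. Refresh planner statistics for the table.
ANALYZE orders;

-- 4. Extended statistics for correlated columns (PostgreSQL 10+).
CREATE STATISTICS orders_city_zip (dependencies) ON city, zip FROM orders;
ANALYZE orders;
```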
If you can neither optimize troublesome queries nor eliminate them, you have to upgrade the host class.
Diagnosing resource shortages
A resource shortage, a likely cause of cluster performance degradation, can be identified by checking metrics like CPU, disk I/O, and network usage on cluster monitoring charts. A resource usage chart showing steady growth followed by a plateau indicates that the workload has reached its limit or is exceeding the guaranteed service level.
In most cases, high CPU and disk I/O usage are caused by inefficient indexes or excessive host workload. Review the session and query data collected by the performance diagnostics tool.
Troubleshooting resource shortages
Try to optimize resource-heavy queries. If the load remains high after optimization, you have to upgrade the host class.
Diagnosing database locks
Cluster performance degradation can be caused by locks that occur when multiple processes attempt to access the same database resource, e.g., table or row, concurrently.
To detect locks using the performance diagnostics tool, do the following:
- In the management console, select a folder.
- Navigate to the Managed Service for PostgreSQL service.
- Click the name of your cluster and select the Performance diagnostics tab.
- In the Sessions field of the Slice tab, select WAIT_EVENT_TYPE.
- Review the Lock chart showing the number of queries locked during the selected time frame.
- For detailed information on queries that ran during the selected time period, switch to the Queries tab.
To learn what statistics you can get, see the PostgreSQL guides.
To identify locks using PostgreSQL built-in tools, run the following query:
SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa
ON pl.pid = psa.pid;
Learn more about retrieving queries involved in locks in this PostgreSQL article.
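To go straight from a waiting session to the session holding the lock, you can use pg_blocking_pids (available since PostgreSQL 9.6):

```sql
-- Show waiting sessions together with the PIDs of the sessions blocking them.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```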
Troubleshooting locking issues
Try to optimize troublesome queries. If the load remains high after optimization, you have to upgrade the host class.
Diagnosing connection errors
The number of connections is limited by the max_connections setting, calculated as:

200 × <vCPU_share_on_host> − 15

Here, <vCPU_share_on_host> is the product of the number of vCPUs and their CPU guarantee, and 15 is the number of connections reserved for maintenance tasks. The resulting connection pool is distributed across the database roles.
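For example, under this formula a host with 4 vCPUs at a 100% CPU guarantee would allow 200 × 4 − 15 = 785 connections (an illustrative calculation, not a guaranteed figure for any specific host class). You can compare the configured limit with current usage directly:

```sql
SHOW max_connections;                   -- the configured limit
SELECT count(*) FROM pg_stat_activity;  -- connections currently open on the backend
```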
If the number of open connections reaches the limit, the system logs the following errors:
- Too many connections for role.
- Server conn crashed.
- Invalid server parameter.
- Query wait timeout.
To review available connection usage details via monitoring tools:
- In the management console, select a folder.
- Navigate to the Managed Service for PostgreSQL service.
- Click the name of your cluster and select the Monitoring tab.
- Review the Total pooler connections chart.

  Managed Service for PostgreSQL does not allow direct DBMS connections. All connections are made via the connection pooler.

  - The Clients metric shows the number of client connections to the connection pooler.
  - The Servers metric shows the number of connections between the DBMS and the connection pooler.

  Pay attention to the connection count. High values indicate that some queries are keeping connections open for too long.
Troubleshooting connection issues
To address connection limit issues:
- Increase the max_connections value in the DBMS settings.
- Optimize queries to eliminate long-running transactions.
If the load remains high after optimization, you have to upgrade the host class.
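To find the transactions that are holding connections open, you can look for sessions that have been idle inside an open transaction for a long time (the 5-minute threshold here is an arbitrary example):

```sql
-- Sessions idle in an open transaction for more than 5 minutes.
SELECT pid, usename, state, NOW() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND NOW() - xact_start > INTERVAL '5 minutes';

-- If it is safe to do so, terminate a stuck session by its pid:
-- SELECT pg_terminate_backend(<pid>);
```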
Diagnosing storage space shortages
Poor cluster performance accompanied by the ERROR: cannot execute INSERT in a read-only transaction error in logs may indicate that storage space has been depleted, causing the cluster to switch to read-only mode.
To check available cluster storage space:
- In the management console, select a folder.
- Navigate to the Managed Service for PostgreSQL service.
- Click the name of your cluster and select the Monitoring tab.
- Review the Disk capacity in primary, [bytes] chart.
Pay attention to the Used metric showing the cluster storage usage.
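To see what is consuming the space from inside the DBMS, you can rank databases and tables by size:

```sql
-- Database sizes, largest first.
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Largest tables in the current database, including indexes and TOAST data.
SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;
```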
Troubleshooting storage space shortages
For troubleshooting recommendations, see Managing disk space.