Using the Greenplum® command center
Greenplum® Command Center offers the following features:
- Viewing information about sessions and queries.
- Viewing the resource consumption history for completed queries.
- Aborting the current session.
- Terminating the current query.
Check out these use cases for how and when you can use the Command Center.
For more information about the statistics you can get using the Command Center, see Greenplum® Command Center.
Note
The Greenplum® command center only allows you to perform basic operational analysis of sessions and queries. If your task requires in-depth strategic research and advanced analysis tools, use log export to Yandex Cloud Logging. Yandex Cloud Logging allows you to visualize logs in Grafana and process them using Data Streams and Query.
Viewing information about sessions and queries
You can view a list of sessions and queries with details on them. For each session, you can view its history and queries made within it. For each query, you can view its execution plan and a JSON file with details.
To view information about sessions and queries:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to the Command center tab.
- Select what you want to view and navigate to the relevant tab:
- Current state for the current sessions and queries.
- State history for sessions or queries at a given time point in the past.
- Navigate to the Sessions or Queries section. In the State history tab, these are under the chart.
- To filter a session or query list, click Filters and select the relevant parameters.
- To view details for:
- Sessions: Click the session name.
- Queries: Click the key of the query you are running.
For session and query parameters, see Greenplum® Command Center parameters.
Viewing the resource consumption history for completed queries
The resource consumption history includes a variety of system metrics. These show how a Greenplum® cluster was consuming resources to process queries at different time points. You can also view a list of completed queries. Using this information, you can manage your cluster hosts' CPU and memory to process queries more efficiently.
To view the resource consumption history for completed queries:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → Usage history.
- Select the consumption metric you need:
- CPU time: CPU time spent processing the queries, in seconds.
- Peak memory: Maximum memory the cluster used to process a query during its lifetime.
- Disk R: Amount of data read from disk, in bytes.
- Disk W: Amount of data written to the DB, in bytes.
- Spill: Additional memory used for query execution.
- Total time: Total time spent processing the query.
Once you select the consumption metric, you will see a chart with details and a list of queries. The chart will show the metric value, the user who ran the query, and the query execution time.
- To filter the results, click Filters and select the relevant parameters.
Aborting the current session
To free up cluster resources, you can abort a session with the Idle status:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to the Command center tab.
- In Current state → Sessions, click the icon in the relevant line and select Terminate session. If you see Terminate query, select it to stop the query.
- Confirm stopping the session.
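If you work with the cluster from a SQL client rather than the UI, a similar result can be achieved with the standard pg_terminate_backend() function available in Greenplum®. Below is a minimal sketch; it assumes a connection with sufficient privileges, and 12345 is a placeholder process ID taken from pg_stat_activity.

```sql
-- List idle sessions to find the one to terminate.
SELECT pid, usename, application_name, state, state_change
FROM pg_stat_activity
WHERE state = 'idle';

-- Terminate the selected session; 12345 is a placeholder PID from the output above.
SELECT pg_terminate_backend(12345);
```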
Terminating the current query
To free up cluster resources, you can terminate a query with the Idle status within an idle session. To do this:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to the Command center tab.
- In Current state → Queries, click the icon in the relevant line and select Terminate query.
- Confirm terminating the query.
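From a SQL client, a comparable alternative is pg_cancel_backend(), which cancels the backend's current query but keeps the session open. The sketch below assumes sufficient privileges; 12345 is a placeholder process ID from pg_stat_activity.

```sql
-- Cancel the query running in the selected backend without closing the session.
SELECT pg_cancel_backend(12345);
```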
Current state analysis examples
The Greenplum® command center supports the following types of cluster current state analysis:
- Metric analysis, e.g., heavy session search or query execution structure analysis.
- Event analysis, e.g., idle session search or blocking session search.
Heavy session search
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → Current state.
- Sort sessions by one of the following columns: CPU time, Peak memory, Spill, Disk W, Disk R, Net recv, or Net sent.
- Find the sessions that consume the selected resource the most.
- For each selected session:
- Click the session number. The session info page will open.
- Compare the computing load and I/O load metrics (CPU time, Peak memory, Spill, Disk W, Disk R, Net recv, Net sent) against the overall load charts for the cluster or its individual hosts.
- Find which session has contributed the most to resource consumption.
- For details about the session's states at different points in time and to track the metrics' evolution over time, go to the Session history tab.
Note
Only the query initiator can reliably identify resource consumption issues within a given session, since only they know the expected runtime of particular queries.
Query execution structure analysis
You can identify queries that are inefficient due to their SQL statement logic or the sequence of operations.
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → Current state.
- Navigate to the Sessions tab.
- Enable displaying only active sessions by turning off all status buttons except Active.
- Filter the sessions by Start time.
- Find a long-lived session with high values in the CPU time and Peak memory columns. Click its number to open its info page.
- Analyze the CPU time, Peak memory, Spill, Skew, Net sent, Net recv, and Interconnect retransmits values.
- If you see high values for Net sent and Net recv, and a non-zero value for Interconnect retransmits:
- Navigate to the Queries tab.
- Apply the SSID: filter by specifying the transaction ID of the selected session.
- Sort queries by the Key of running query column in descending order.
- Open execution plans for multiple queries.
- If you see Gather or Gather Merge after Sort, Aggregate, or Distinct, move GROUP BY/DISTINCT/ORDER BY to subqueries (see the sketch below).
- If you get broad selections with a full set of columns, limit the results with LIMIT or pagination, select only the columns you need, and apply filters at the early stages of your query.
- If you see a non-zero value for Spill:
- Navigate to the Queries tab.
- Apply the SSID: filter by specifying the transaction ID of the selected session.
- Sort queries by the Key of running query column in descending order.
- Open execution plans for multiple queries.
- If you see subqueries with outer row dependency (EXISTS or IN with correlation) and the execution plan contains SubPlan or InitPlan nodes, decorrelate such subqueries.
- If you see sorting or materialization followed by WindowAgg over large selections, apply pre-aggregation or filtering and exclude unnecessary columns before applying the window functions.
- If you see Sort or Distinct at different nesting levels, reduce the number of such operations and their nesting depth.
- If you see high CPU time or Peak memory values with non-zero Skew values:
- Navigate to the Queries tab.
- Apply the SSID: filter by specifying the transaction ID of the selected session.
- Sort queries by the Key of running query column in descending order.
- Open the execution plans for several queries and check how the joins are executed:
- If joins are based on columns different from the actual table distribution key, rewrite your query.
- If you are joining large sets but filters are either missing or applied after JOIN, use filtering in subqueries before JOIN.
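To illustrate the recommendations above, here is a hedged sketch: the orders and customers tables and their columns are hypothetical, and the rewrite simply moves filtering and aggregation into a subquery so that less data is sorted, aggregated, and redistributed before the join.

```sql
-- Before: a broad join; aggregation runs over the full joined set.
SELECT c.customer_id, c.region, SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.order_date >= DATE '2024-01-01'
GROUP BY c.customer_id, c.region;

-- After: filter and pre-aggregate in a subquery, then join the much smaller result.
SELECT c.customer_id, c.region, o.total_amount
FROM (
    SELECT customer_id, SUM(amount) AS total_amount
    FROM orders
    WHERE order_date >= DATE '2024-01-01'
    GROUP BY customer_id
) AS o
JOIN customers c ON c.customer_id = o.customer_id;
```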
Idle session search
Let's assume the user is done working with the database but has left the session open. In this case, the session will idle while still consuming cluster resources, degrading performance. To identify and terminate such a session, do the following:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → Current state.
- Filter the sessions by Start time.
- Find the longest-lasting session with the Idle status. Click its number to open its info page.
- In the Session info section, check the Query start time field to see when the last query was submitted.
- If the session is not executing any queries and the client application's logic does not require the connection to be retained, terminate the session: click Terminate session in the top-right corner and confirm the action.
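If you also work with the cluster from a SQL client, a similar check can be run against the pg_stat_activity system view. This is a sketch only; the exact set of columns may differ slightly between Greenplum® versions.

```sql
-- List sessions that have been idle the longest; state_change shows when the
-- session last changed state (for example, when its last query finished).
SELECT pid, usename, application_name, state, state_change
FROM pg_stat_activity
WHERE state = 'idle'
ORDER BY state_change;
```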
Blocking session search
In some cases, a session holds locks on table rows or metadata for a long time. This can create a queue of blocked sessions waiting for the locked resource. To find which session is the blocking one:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → Current state.
- Navigate to the Sessions tab.
- To display the blocking tree, click the corresponding icon.
- Explore the blocking tree and identify the main blocking sessions.
- For each blocking session, check the following:
- Status; usually Active or Idle transaction.
- Start time and Status changed values.
- Amount of consumed resources.
- Number of blocked sessions.
- Query text.
- If a blocking session remains Active for a long time while consuming computing resources, a heavy query may be the cause. In this case, you may want to optimize your queries and business logic.
- If a session is blocking many other sessions while staying in Idle transaction for a long time, you can terminate it after additional checks:
- Make sure CPU time is not increasing and the Reason for wait field is empty.
- In the top-right corner, click Terminate session.
- Confirm stopping the session.
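The blocking tree in the UI is the most convenient way to find blockers. For reference, a rough SQL equivalent can be built on the pg_locks system view; the sketch below is deliberately simplified (it matches only relation-level locks), so treat its output as a hint rather than a complete picture.

```sql
-- Pair waiting lock requests with sessions already holding a lock on the same relation.
SELECT waiting.pid                AS blocked_pid,
       holder.pid                 AS blocking_pid,
       waiting.relation::regclass AS locked_relation
FROM pg_locks waiting
JOIN pg_locks holder
  ON holder.relation = waiting.relation
 AND holder.granted
 AND holder.pid <> waiting.pid
WHERE NOT waiting.granted;
```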
Tip
To prevent long-term blocking:
- Optimize your queries and reduce the amount of data processed at any given time.
- Separate interactive queries and heavy operations on the timeline.
- Set query execution and blocking timeouts.
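For example, query and lock timeouts can be set at the session or role level. The values below are purely illustrative, etl_user is a hypothetical role name, and parameter availability may depend on your Greenplum® version.

```sql
-- Session-level timeouts; adjust the values to your workload.
SET statement_timeout = '15min';  -- cancel statements running longer than this
SET lock_timeout = '30s';         -- stop waiting for a lock after 30 seconds

-- The same settings can be fixed per role.
ALTER ROLE etl_user SET statement_timeout = '30min';
```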
Examples of state history and consumption history analysis
The Greenplum® command center supports the following types of session and query history analysis:
- Metric analysis, e.g., search for heavy queries and search for high network load queries.
- Event analysis, e.g., search for canceled queries and execution errors.
Searching for CPU-intensive queries
Let's assume that higher than usual CPU consumption is reported during a certain period. To determine which queries caused the spikes, do the following:
- Find out when the spike occurred:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → State history.
- Set the CPU usage filter.
- Use the chart to find out when CPU consumption became abnormally high. Hover over the highest point on the chart curve. You will see a pop-up displaying the cluster state details for the selected time point, including the time when the spike occurred.
- Identify the CPU-intensive queries:
- Navigate to the Usage history tab.
- Set the time range based on the state history data.
- Group the queries by user, database, and query ID. This will group similar queries together.
- Filter the query groups by CPU time.
- Open the group with the highest CPU time value.
- Examine the SQL text of your queries and their execution plans to figure out the cause of high CPU consumption.
Tip
Only the query initiator can reliably pinpoint CPU consumption issues within a given session; however, the following signs indicate that optimization is needed:
- Complex calculations and expressions executed row by row.
- Sorting and aggregation without data filtering.
- Multiple scans of large tables without using indexes or data distribution.
- Re-calculations of subqueries or functions inside expressions.
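One practical way to check for these signs is to look at the actual execution plan of a suspect query. The sketch below uses hypothetical table and column names; note that EXPLAIN ANALYZE executes the query, so run it with a narrow filter or on a test copy first.

```sql
-- Inspect where the query actually spends its time.
EXPLAIN ANALYZE
SELECT region, COUNT(*) AS orders_count
FROM sales
WHERE sale_date >= DATE '2024-01-01'   -- filter before aggregating
GROUP BY region;
```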
Searching for high network load queries
- Find the approximate time period when network issues and errors were observed, for example:
- Cluster connection failure or slow response complaints.
- Network anomalies and errors based on cluster logs and monitoring data.
- Establish the cause of the errors:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → State history.
- Set the time range during which errors were observed.
- In the drop-down list above the chart, select Connections and then Net usage. Compare the charts.
- If unusually large Net usage values were observed, abnormal network activity is the most likely cause.
- If unusually large Connections values were observed, a spike in connections is the most likely cause.
- If the errors are caused by abnormal network activity:
- Navigate to the Usage history tab.
- Set the time range based on the state history data.
- Select Group by: → user.
- Filter the query groups by the Net sent and Net recv columns.
- Find the user with the highest values in these columns. Filter the query groups by this user.
- Select Group by: → Query ID and disable grouping by user.
- Find the queries that generated the most traffic. Save the text of these queries and their start time.
- Navigate to the Sessions tab.
- Use the query text-based search to find the queries of interest.
- Identify the source of traffic using the Application column.
- Based on your analysis, tweak the applications generating abnormal network activity:
- Limit the amount of uploaded data.
- Use materialized views or temporary tables.
- Optimize table distribution (DISTRIBUTED BY) and update table statistics prior to large inserts (see the sketch below).
- Review the ETL pipeline architecture.
- If the errors are caused by a spike in connections:
- Go to the Sessions tab below the chart.
- Filter the sessions by Start time.
- On the chart, select a time point with the highest Connections values. Use the At time: section and the < > arrows to set the exact time point.
- Use the User and Application name: filters to compare the number of new sessions per second for each source.
- If one source creates many more sessions than the others:
- Check whether the application is reusing its connections.
- Set the interval between retries and total number of attempts for reconnections.
- Optionally, edit the connection manager settings based on your analysis.
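For the table distribution and statistics recommendations above, here is a hedged sketch; the fact_events table and its columns are hypothetical.

```sql
-- Distribute the table on the column used in joins so that matching rows land
-- on the same segment and less data travels over the interconnect.
CREATE TABLE fact_events (
    event_id   bigint,
    user_id    bigint,
    event_time timestamp
)
DISTRIBUTED BY (user_id);

-- Refresh planner statistics after a large load so the optimizer sees
-- up-to-date row counts and value distributions.
ANALYZE fact_events;
```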
Searching for canceled queries and execution errors
If the user complains about long wait times and connection losses but no other users report the same issues, this may indicate execution errors or canceled queries.
To find which queries were canceled or caused execution errors, proceed as follows:
- Navigate to the folder dashboard and select Yandex MPP Analytics for PostgreSQL.
- Click the cluster name and navigate to Command center → State history.
- Navigate to the Queries tab.
- Select a time point when, according to the monitoring data, issues were reported. Use the At time: section and the < > arrows to set the exact time point.
- Filter the queries by the Query state column.
- Find queries with the Canceled or Error status.
- Establish the query sources based on the User and Application columns. Optionally, use the User: and Application name: filters.
- If one source generates significantly more canceled and failed queries than the others:
- Check and, optionally, optimize the business logic and the structure of your SQL queries. Pay special attention to frequent full selections, lack of data filtering, redundant table joins, or nested subqueries.
- Set the interval between reconnections and the total number of attempts.
- Optionally, use the connection manager to limit the number of concurrent active connections and balance the load between clients. The optimal parameters depend on the number of segments and cluster resources.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.