Maintenance in Managed Service for Greenplum®
There are two classes of maintenance operations in Managed Service for Greenplum®:
Non-routine maintenance operations
Non-routine maintenance operations involve cluster software updates and post-failure host recovery. They may change cluster settings and restart the cluster. During these operations, current queries are aborted and incomplete transactions are canceled.
Non-routine maintenance operations related to updates are performed in a specified order during a maintenance window. These operations include:
- Installing minor Greenplum® updates. This results in DBMS restart.
- Installing PXF updates. This results in PXF restart.
- Restarting cluster hosts required for cloud infrastructure scheduled maintenance (replacing failed components, installing system updates, performing scheduled hardware maintenance, etc.).
- Installing security updates on cluster hosts. This results in host restart.
Non-routine maintenance operations related to cluster recovery can be performed at any time as needed. These operations include:
- Recovering data after a physical host or non-replicated disk fails in the cloud infrastructure.
- Segment rebalancing: resetting preferred segment roles after a host or its segments are restored.
Maintenance window
You can set the preferred maintenance time when creating a cluster or updating its settings:
- The arbitrary option (default) allows performing maintenance at any time.
- The by schedule option allows setting the preferred maintenance start day and time (UTC). For example, you can choose a time when the cluster is least loaded.
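To illustrate the by schedule option, here is a minimal sketch that computes the next weekly window start in UTC from a preferred weekday and hour. The `next_window_start` helper and its parameters are hypothetical and not part of the service API; the way the service actually schedules maintenance may differ.

```python
from datetime import datetime, timedelta, timezone


def next_window_start(now: datetime, weekday: int, hour: int) -> datetime:
    """Return the next weekly window start in UTC.

    Hypothetical helper for illustration only.
    weekday: 0 = Monday ... 6 = Sunday; hour: 0..23, interpreted as UTC.
    `now` must be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    # Start from today at the preferred hour, then move to the preferred weekday.
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed, use the same slot next week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate
```

A helper like this can be useful for warning application owners ahead of a window in which queries may be aborted.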
Maintenance procedure
Maintenance related to software updates is performed as follows:
- Segment hosts undergo maintenance one by one. The hosts are queued randomly. If a segment host needs to be restarted during maintenance, it becomes unavailable while being restarted.
- Maintenance is performed on the STANDBY master host. If it needs to be restarted during maintenance, it becomes unavailable while being restarted.
- Maintenance is performed on the PRIMARY master host. If it is restarted during maintenance and becomes unavailable, the standby master host will take its role. If you access a cluster using the FQDN of the primary master host, the cluster may become unavailable. To make your application continuously available, access the cluster using a special FQDN always pointing to the primary master host.
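The order of the procedure above can be sketched as follows, assuming the three steps form a strict sequence. The `maintenance_order` function and the host names in the usage note are hypothetical; this models the ordering only, not the maintenance itself.

```python
import random


def maintenance_order(segment_hosts, standby_master, primary_master, rng=None):
    """Return hosts in the order they would undergo maintenance.

    Illustrative model only: segment hosts are queued randomly,
    then the standby master, then the primary master.
    """
    rng = rng or random.Random()
    queue = list(segment_hosts)  # copy so the caller's list is not modified
    rng.shuffle(queue)           # segment hosts are queued randomly
    return queue + [standby_master, primary_master]
```

For example, `maintenance_order(["seg-1", "seg-2", "seg-3"], "master-standby", "master-primary")` always ends with the standby master followed by the primary master, while the position of each segment host varies between runs.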
Routine maintenance operations
Routine maintenance operations are required to ensure proper database performance. They are run regularly on a certain schedule and do not abort current queries. These operations include:
- Vacuuming (VACUUM) system catalog tables. This operation is run three times a day.
- Custom table vacuuming.
- Statistics collection.
- Backup.
Data redistribution during cluster expansion is not a routine maintenance operation, but it can run as a background process. In that case, it starts after table vacuuming and before statistics collection.
Custom table vacuuming
Custom tables are vacuumed daily. Databases are handled concurrently in two threads. In each database, tables on which VACUUM has not been run yet are handled first. Then the remaining tables are handled, starting with the one on which VACUUM has not been run the longest.
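The per-database ordering described above can be sketched as a simple sort. This is an illustration under assumptions: the `vacuum_order` function and the input mapping of table names to last VACUUM timestamps are hypothetical, and the service may obtain and order this information differently.

```python
from datetime import datetime
from typing import Optional


def vacuum_order(last_vacuum: dict[str, Optional[datetime]]) -> list[str]:
    """Order tables for vacuuming within one database.

    last_vacuum maps a table name to the time VACUUM last ran on it,
    or None if VACUUM has never been run on that table.
    """
    # Tables that have never been vacuumed are handled first.
    never = [t for t, ts in last_vacuum.items() if ts is None]
    # The rest follow, starting with the longest-unvacuumed table.
    seen = sorted(
        (t for t, ts in last_vacuum.items() if ts is not None),
        key=lambda t: last_vacuum[t],
    )
    return never + seen
```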
Two vacuuming modes are supported:
- Sequential: Tables are handled one by one. The total execution time is limited by a soft timeout: when it is reached, the vacuuming of the current table is completed, and then the process is terminated.
- Concurrent: Tables are handled in two threads. This mode uses a hard timeout: when it is reached, all vacuuming processes are forced to terminate.
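The difference between the soft and hard timeouts can be shown with a small simulation. This is a sketch under assumptions, not the service's implementation: the function names are hypothetical, each table is given a made-up vacuuming duration, and the concurrent mode is modeled as each table going to whichever of the two workers becomes free first.

```python
import heapq


def sequential_soft(durations, timeout):
    """Sequential mode: return the tables that are fully vacuumed.

    A table already in progress when the soft timeout is reached
    is still completed; no further table is started after that.
    """
    done, clock = [], 0
    for table, cost in durations:
        if clock >= timeout:
            break
        clock += cost
        done.append(table)
    return done


def concurrent_hard(durations, timeout, workers=2):
    """Concurrent mode: return the tables that are fully vacuumed.

    At the hard timeout, every running vacuum is forced to terminate,
    so a table counts only if it finishes by the timeout.
    """
    free_at = [0] * workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    done = []
    for table, cost in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        if start >= timeout:
            break  # no worker becomes free before the timeout
        finish = start + cost
        if finish <= timeout:
            done.append(table)
        heapq.heappush(free_at, finish)
    return done
```

In this model, the sequential mode can overrun the timeout by the duration of one table, while the concurrent mode never runs past the timeout but may leave the tables that were in progress unvacuumed.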
The default mode is sequential. To switch to concurrent table vacuuming mode, contact technical support.
The start time and timeout of the VACUUM operation are set up when creating or updating a cluster.
Statistics collection
Statistics collection (the ANALYZE operation) is performed after the vacuuming of tables (if background data redistribution is not in progress). Databases are handled concurrently in two threads. In addition, two threads are run to collect table statistics in each database. As a result, statistics can be collected in four threads.
The analyzedb utility runs the ANALYZE command for all append-optimized (AO) tables modified since the last time the utility collected the statistics, as well as for all heap tables without exception.
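The selection rule above can be expressed as a simple filter. This is a minimal sketch: the `Table` record and the `tables_to_analyze` function are hypothetical names introduced for illustration and do not reflect how analyzedb tracks table state internally.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    storage: str                  # "heap" or "ao" (append-optimized)
    modified_since_analyze: bool  # changed since statistics were last collected


def tables_to_analyze(tables):
    """Return the names of tables that would be analyzed under the rule above."""
    selected = []
    for t in tables:
        if t.storage == "heap":
            # Heap tables are always analyzed.
            selected.append(t.name)
        elif t.storage == "ao" and t.modified_since_analyze:
            # AO tables are analyzed only if modified since the last run.
            selected.append(t.name)
    return selected
```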
Statistics collection for each database is limited by a timeout, which is specified in the settings when creating or updating a cluster. The total statistics collection time is not limited.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of VMware, Inc. in the United States and/or other countries.