Yandex project
© 2025 Yandex.Cloud LLC

Maintenance in Managed Service for Greenplum®

Written by
Yandex Cloud
Updated at October 1, 2024

There are two classes of maintenance operations in Managed Service for Greenplum®:

  • Non-routine cluster maintenance operations
  • Routine database maintenance operations

Non-routine maintenance operations

Non-routine maintenance operations include cluster software updates and post-failure host recovery. They may change cluster settings and restart the cluster. During these operations, running queries are aborted and incomplete transactions are canceled.

Non-routine maintenance operations related to updates are performed in a specified order during a maintenance window. These operations include:

  • Installing minor Greenplum® updates. This results in DBMS restart.
  • Installing PXF updates. This results in PXF restart.
  • Restarting cluster hosts required for cloud infrastructure scheduled maintenance (replacing failed components, installing system updates, performing scheduled hardware maintenance, etc.).
  • Installing security updates on cluster hosts. This results in host restart.

Non-routine maintenance operations related to cluster recovery can be performed at any time as needed. These operations include:

  • Recovering data after a physical host or non-replicated disk fails in the cloud infrastructure.
  • Segment rebalancing: Resetting preferred segment roles after a host or its segments are restored.

Maintenance window

You can set the preferred maintenance time when creating a cluster or updating its settings:

  • The arbitrary option (default) allows performing maintenance at any time.
  • The by schedule option allows setting the preferred maintenance start day and time (UTC). For example, you can choose a time when the cluster is least loaded.
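As a rough illustration of the by schedule option, the sketch below computes when the next weekly window would open for a given day and hour in UTC. The helper name, the day abbreviations, and the one-hour granularity are assumptions made for this example; they are not part of any Yandex Cloud API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical day names used only in this example.
WEEKDAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def next_window_start(now_utc, day, hour):
    """Return the next start of a weekly window (day name, hour in UTC)
    that is strictly later than now_utc."""
    target = WEEKDAYS.index(day)
    # Same calendar day as now_utc, at the requested hour.
    candidate = now_utc.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Move forward to the requested weekday (0 to 6 days ahead).
    candidate += timedelta(days=(target - now_utc.weekday()) % 7)
    # If that moment has already passed, use next week's window.
    if candidate <= now_utc:
        candidate += timedelta(days=7)
    return candidate


now = datetime(2024, 10, 1, 12, 0, tzinfo=timezone.utc)
print(next_window_start(now, "SAT", 3))
```

A helper like this can be handy for warning application owners ahead of a window in which queries may be aborted.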

Maintenance procedure

Maintenance related to software updates is performed as follows:

  1. Segment hosts undergo maintenance one by one. The hosts are queued randomly. If a segment host needs to be restarted during maintenance, it becomes unavailable while being restarted.
  2. Maintenance is performed on the STANDBY master host. If it needs to be restarted during maintenance, it becomes unavailable while being restarted.
  3. Maintenance is performed on the PRIMARY master host. If it is restarted during maintenance and becomes unavailable, the standby master host takes over its role. If you access the cluster using the host FQDN of the primary master, the cluster may become unavailable to your application. To keep your application continuously available, access the cluster using the special FQDN that always points to the primary master host.
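The order of the steps above can be sketched as a small function: segment hosts in random order first, then the standby master, then the primary master. The function and the host names are hypothetical and only model the sequence described here, not the service's actual implementation.

```python
import random


def maintenance_order(segment_hosts, standby_master, primary_master, rng=None):
    """Return hosts in the order they would undergo maintenance."""
    rng = rng or random.Random()
    segments = list(segment_hosts)
    rng.shuffle(segments)  # segment hosts are queued randomly
    # Masters always come last: standby first, then primary.
    return segments + [standby_master, primary_master]


order = maintenance_order(
    ["seg-1", "seg-2", "seg-3"],  # hypothetical host names
    standby_master="master-standby",
    primary_master="master-primary",
)
print(order)
```

Because the primary master is always last, a connection that follows the primary role is interrupted only once, at the end of the procedure.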

Routine maintenance operations

Routine maintenance operations are required to ensure proper database performance. They are run regularly on a certain schedule and do not abort current queries. These operations include:

  • Vacuuming (VACUUM) system catalog tables. This operation is run three times a day.
  • Custom table vacuuming.
  • Statistics collection.
  • Backup.

Data redistribution during cluster expansion is not a routine maintenance operation, but it can run as a background process. In that case, it starts after tables are vacuumed and before statistics are collected.

Custom table vacuuming

Custom tables are vacuumed daily. Databases are handled concurrently in two threads. In each database, tables on which VACUUM has not been run yet are handled first. Then the remaining tables are handled, starting with the one on which VACUUM has not been run the longest.
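The ordering rule within a database can be expressed as a sort key. This is a minimal sketch that assumes the time of the last VACUUM is known for each table and is None for tables that have never been vacuumed; the function name and the data layout are illustrative only.

```python
from datetime import datetime


def vacuum_order(last_vacuum):
    """Order table names for vacuuming.

    last_vacuum maps a table name to the datetime of its last VACUUM,
    or to None if VACUUM has never been run on it.
    """
    return sorted(
        last_vacuum,
        key=lambda table: (
            last_vacuum[table] is not None,      # never-vacuumed tables first
            last_vacuum[table] or datetime.min,  # then oldest VACUUM first
        ),
    )


tables = {
    "orders": datetime(2024, 9, 30),
    "events": None,
    "users": datetime(2024, 9, 28),
}
print(vacuum_order(tables))  # ['events', 'users', 'orders']
```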

Two vacuuming modes are supported:

  • Sequential: Tables are handled one by one. The total execution time is limited by a soft timeout: when it is reached, the vacuuming of the current table is completed, and then the process is terminated.
  • Concurrent: Tables are handled in two threads. This mode uses a hard timeout: when it is reached, all vacuuming processes are forced to terminate.
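To make the soft timeout concrete, here is a small simulation of the sequential mode. It assumes per-table durations are known in advance, which is a simplification for the example; the function is hypothetical and does not run VACUUM.

```python
def run_sequential(durations, soft_timeout):
    """Simulate sequential vacuuming with a soft timeout.

    durations is a list of (table, seconds) pairs in processing order.
    Returns the tables that were vacuumed and the total elapsed seconds.
    """
    elapsed = 0
    done = []
    for table, seconds in durations:
        if elapsed >= soft_timeout:
            break  # timeout reached: no new table is started
        elapsed += seconds  # a table that has started always finishes
        done.append(table)
    return done, elapsed


done, elapsed = run_sequential([("a", 40), ("b", 40), ("c", 40)], soft_timeout=60)
print(done, elapsed)  # ['a', 'b'] 80
```

In this run, the timeout of 60 seconds expires while table b is being vacuumed, so b completes and the total time overshoots to 80 seconds. Under a hard timeout, as used in concurrent mode, the work in progress would be terminated at the limit instead.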

The default mode is sequential. To switch to concurrent table vacuuming mode, contact technical support.

The start time and timeout of the VACUUM operation are set up when creating or updating a cluster.

Statistics collection

Statistics collection (the ANALYZE operation) is performed after the vacuuming of tables (if background data redistribution is not in progress). Databases are handled concurrently in two threads. In addition, two threads are run to collect table statistics in each database. As a result, statistics can be collected in up to four threads at once.
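The two levels of concurrency can be pictured as nested thread pools: two workers over databases, each running two workers over tables, for at most four tables in flight. This is only a sketch of that structure; analyze_table is a placeholder that does not connect to a database, and the per-database timeout is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_table(db, table):
    # Placeholder for running ANALYZE on one table.
    return f"{db}.{table}"


def analyze_database(db, tables, table_workers=2):
    # Two threads collect table statistics within one database.
    with ThreadPoolExecutor(max_workers=table_workers) as pool:
        return list(pool.map(lambda t: analyze_table(db, t), tables))


def analyze_all(databases, db_workers=2):
    # Two databases are handled at the same time.
    with ThreadPoolExecutor(max_workers=db_workers) as pool:
        results = pool.map(lambda item: analyze_database(*item), databases.items())
        return [name for per_db in results for name in per_db]


print(analyze_all({"db1": ["a", "b", "c"], "db2": ["x"]}))
```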

The analyzedb utility is used to collect statistics. It runs the ANALYZE command for all append-optimized (AO) tables modified since the last time the utility collected the statistics, as well as for all heap tables without exception.
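The selection rule described above can be modeled as a simple filter. The sketch assumes each table record carries its storage type and a flag telling whether it changed since the last statistics run; how analyzedb actually tracks modifications is not modeled, and the field names are made up for the example.

```python
def tables_to_analyze(tables):
    """Pick tables for ANALYZE: every heap table, plus AO tables
    that were modified since the last statistics collection."""
    return [
        t["name"]
        for t in tables
        if t["storage"] == "heap" or t["modified_since_last_run"]
    ]


tables = [
    {"name": "users", "storage": "heap", "modified_since_last_run": False},
    {"name": "events", "storage": "ao", "modified_since_last_run": True},
    {"name": "archive", "storage": "ao", "modified_since_last_run": False},
]
print(tables_to_analyze(tables))  # ['users', 'events']
```

Unmodified AO tables are skipped, which is why clusters with large, rarely changing AO tables spend most of the statistics time on heap tables.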

Statistics collection for each database is limited by a timeout, which is specified in the settings when creating or updating a cluster. The total statistics collection time is not limited.

Greenplum® and Greenplum Database® are registered trademarks or trademarks of VMware, Inc. in the United States and/or other countries.
