Backups in Yandex MPP Analytics for PostgreSQL
Yandex MPP Analytics for PostgreSQL supports automatic and manual database backups.
Yandex MPP Analytics for PostgreSQL enables you to restore your cluster to a specific recovery point where the data is consistent. This feature is known as Point-in-Time Recovery (PITR). Recovery points are created every hour. When you specify a recovery time, the service uses the backup closest to that point in time. While restoring the selected backup, the service applies records from write-ahead logs (WAL) to the backup data up to the closest recovery point.
For example, if the backup was created on November 10, 2022, 12:00:00 UTC, the current date is November 15, 2022, 19:00:00 UTC, and the latest recovery point was saved on November 15, 2022, 18:00:00 UTC, the cluster can be restored to any recovery point between November 10, 2022, 12:00:01 UTC and November 15, 2022, 18:00:00 UTC, inclusive. If you specify November 15, 2022, 17:30:00 UTC as the recovery time, the cluster will be restored to the recovery point saved on November 15, 2022, 17:00:00 UTC.
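The recovery point selection in the example above can be sketched in Python. This is only an illustration, not the service's implementation: the function name `resolve_recovery_point` and its parameters are hypothetical, and the sketch assumes that recovery points are spaced exactly one hour apart, counting back from the latest one.

```python
from datetime import datetime, timedelta, timezone

# Assumption: recovery points are exactly one hour apart.
RECOVERY_POINT_INTERVAL = timedelta(hours=1)


def resolve_recovery_point(requested, backup_created, latest_recovery_point):
    """Return the recovery point at or before the requested time.

    All names here are hypothetical; they are not part of any service API.
    """
    # The recoverable range starts right after the backup was created and
    # ends at the latest saved recovery point, inclusive.
    if not (backup_created < requested <= latest_recovery_point):
        raise ValueError("requested time is outside the recoverable range")

    # Count whole intervals back from the latest recovery point, rounding up,
    # so the result never lies after the requested time.
    steps_back = -((requested - latest_recovery_point) // RECOVERY_POINT_INTERVAL)
    return latest_recovery_point - steps_back * RECOVERY_POINT_INTERVAL


utc = timezone.utc
backup_created = datetime(2022, 11, 10, 12, 0, 0, tzinfo=utc)
latest_point = datetime(2022, 11, 15, 18, 0, 0, tzinfo=utc)
requested = datetime(2022, 11, 15, 17, 30, 0, tzinfo=utc)

print(resolve_recovery_point(requested, backup_created, latest_point))
```

With the dates from the example, a request for 17:30:00 UTC resolves to the recovery point saved at 17:00:00 UTC, and a request later than 18:00:00 UTC is rejected.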
PITR mode is enabled by default. It supports automatic backups only.
For clusters running an unsupported DBMS version, restoring from backups is not available.
To restore a cluster from a backup, follow this guide. You can also restore your cluster to move its hosts to a different availability zone.
Creating a backup
The first and every second automatic backup, as well as all manually created backups, are full backups of all databases. To save space, the other backups are incremental and only store the data that has changed since the previous backup.
A backup is automatically created once a day. You cannot disable automatic backups. However, when creating or updating a cluster, you can specify the time interval during which such backups will start. The default value is 22:00 - 23:00 UTC (Coordinated Universal Time).
After a backup is created, it is compressed for storage. Append-optimized tables use data deduplication: only newly added data, or old data that was last archived more than 30 days ago, is copied. The backup size does not include the size of the deduplicated part, so the displayed value can be significantly smaller than the data size in the cluster.
Backups are only created on running clusters. If you are not using your Greenplum® cluster 24/7, check the backup start time settings.
Learn about creating manual backups in Managing backups.
Storing a backup
Storing backups in Yandex MPP Analytics for PostgreSQL:
- Backups are stored in object storage as binary files and are encrypted using GPG. Each cluster has its own encryption keys.
- The total backup size is the sum of the data copy size and the WAL size. The backup size does not include the amount of deduplicated append-optimized table data. The size of WAL data depends on the number of changes made and is comparable to the backup size. You can get both values from the list of backups.
- Automatic backups are stored for seven days. Manual backups are stored until the user deletes them.
- After you delete a cluster, all its backups are kept for seven days.
- Quotas and limits for cluster storage do not apply to backup storage.
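The size and retention rules above can be modeled in a short Python sketch. It is an illustration only: the `Backup` class, its field names, and the `expires_at` function are hypothetical and do not reflect the service's API. The sketch also assumes one literal reading of the rules: once a cluster is deleted, every backup is kept until seven days after the deletion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=7)


@dataclass
class Backup:
    """Hypothetical backup record; field names are illustrative."""
    created_at: datetime
    automatic: bool
    data_size: int  # bytes, excluding deduplicated append-optimized data
    wal_size: int   # bytes of WAL data

    @property
    def total_size(self) -> int:
        # Total backup size is the data copy size plus the WAL size.
        return self.data_size + self.wal_size


def expires_at(backup: Backup,
               cluster_deleted_at: Optional[datetime] = None) -> Optional[datetime]:
    """Return when the backup stops being stored, or None if it has no expiry."""
    if cluster_deleted_at is not None:
        # Assumption: after cluster deletion, all backups are kept seven days.
        return cluster_deleted_at + RETENTION
    if backup.automatic:
        return backup.created_at + RETENTION
    # Manual backups are stored until the user deletes them.
    return None


created = datetime(2022, 11, 10, 22, 0, tzinfo=timezone.utc)
auto = Backup(created, automatic=True, data_size=500, wal_size=450)
manual = Backup(created, automatic=False, data_size=500, wal_size=450)

print(auto.total_size)
print(expires_at(auto))
print(expires_at(manual))
```

Here the automatic backup expires seven days after it was created, while the manual backup has no expiry until the cluster itself is deleted.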
Testing recovery from a backup
To test how backups work, restore a cluster from a backup and check your data for integrity.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.