Exporting Greenplum® data to cold storage in Yandex Object Storage
Yandex Managed Service for Greenplum® data is stored on cluster disks. With a Yandex Cloud Yezzey extension
Yezzey supports append-optimized (AO) and append-optimized column-oriented (AOCO) tables. For more information about the tables, see Data storage types and the Greenplum® documentation
Note
The functionality is supported for clusters with Greenplum® version 6.25 or higher. This functionality is at the Preview stage and is free of charge.
To transfer your data from Managed Service for Greenplum® cluster disks to a cold storage in Object Storage:
You can also transfer your data back to cluster disks.
If you no longer need the resources you created, delete them.
Getting started
-
Create a Managed Service for Greenplum® cluster. When creating a cluster, make sure to enable Hybrid storage.
Note
You cannot disable this option after you save your cluster settings.
-
Get an SSL certificate to connect to the Greenplum® database:
Linux (Bash)/macOS (Zsh)Windows (PowerShell)mkdir -p ~/.postgresql && \ wget "https://storage.yandexcloud.net/cloud-certs/CA.pem" \ --output-document ~/.postgresql/root.crt && \ chmod 0655 ~/.postgresql/root.crt
The certificate will be saved to the
~/.postgresql/root.crt
file.mkdir $HOME\.postgresql; curl.exe -o $HOME\.postgresql\root.crt https://storage.yandexcloud.net/cloud-certs/CA.pem
The certificate will be saved to the
$HOME\.postgresql\root.crt
file.To use graphical IDEs, save a certificate
to a local folder and specify the path to it in the connection settings.
Export the Greenplum® table to Object Storage
-
Connect to the cluster:
psql "host=c-<cluster_ID>.rw.mdb.yandexcloud.net \ port=6432 \ sslmode=verify-full \ dbname=postgres \ user=<username> \ target_session_attrs=read-write"
-
Create a database named
db_with_yezzey
:CREATE DATABASE db_with_yezzey;
The new DB will be granted permissions to install extensions. These permissions are not granted for the default DB.
-
Connect to the new DB:
\connect db_with_yezzey
-
Create a Yezzey extension:
CREATE EXTENSION yezzey;
-
Create an append-optimized table named
ao_table
:CREATE TABLE ao_table (a int) WITH (appendoptimized=true) DISTRIBUTED BY (a);
-
Populate the table with a series of integers from 1 through 10,000:
INSERT INTO ao_table SELECT * FROM GENERATE_SERIES(1, 10000);
-
Transfer the
ao_table
data to a cold storage in Object Storage:SELECT yezzey_define_offload_policy('ao_table');
Check the result
-
Check how much of the cluster local cache and Object Storage space is used by:
-
Your
ao_table
:SELECT * FROM yezzey_offload_relation_status('ao_table');
-
Each segment file in
ao_table
:SELECT * FROM yezzey_offload_relation_status_per_filesegment('ao_table');
If there are non-zero values in each
external_bytes
column in the command output, the table is transferred to a cold storage in Object Storage. -
-
Check which table segment files are now in Object Storage:
SELECT * FROM yezzey_relation_describe_external_storage_structure('ao_table');
-
Make sure the data can be read from Object Storage:
SELECT AVG(a) FROM ao_table;
Result:
avg ----------------------- 5000.5000000000000000 (1 row)
-
Make sure the data can be written to Object Storage:
-
Add a series of integers from 1 through 10,000 to your
ao_table
:INSERT INTO ao_table SELECT * FROM GENERATE_SERIES(1, 10000);
-
Check that the number of rows has doubled:
SELECT COUNT(1) FROM ao_table;
Result:
count ------- 20000 (1 row)
-
Transfer the table from Object Storage to cluster disks
To transfer your ao_table
from Object Storage back to the Managed Service for Greenplum® cluster disks, run the command below:
SELECT yezzey_load_relation('ao_table');
Delete the resources you created
If you no longer need the cluster you created, delete it.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of VMware, Inc. in the United States and/or other countries.