Disk types in Yandex MPP Analytics for PostgreSQL
Yandex MPP Analytics for PostgreSQL allows you to use network and local storage drives for database clusters. Network drives are based on network blocks, which are virtual disks in the Yandex Cloud infrastructure. Local disks are physically located on the cluster servers.
When creating a cluster, you can select the following disk types for data storage:
- Network HDDs (network-hdd): Most cost-effective option for clusters that do not require high read/write performance.
- Network SSDs (network-ssd): Balanced solution. Such disks are slower than local SSD storage, but, unlike local disks, they ensure data integrity if Yandex Cloud hardware fails.
- Non-replicated SSDs (network-ssd-nonreplicated): Network disks with enhanced performance achieved by eliminating redundancy.
  The storage size can only be increased in 93 GB increments.
- Ultra high-speed network SSDs with three replicas (network-ssd-io-m3): Network disks with the same performance characteristics as non-replicated ones. This disk type provides redundancy.
  Such disks can be increased in size only in 93 GB increments.
- Local SSDs (local-ssd): Disks with the best performance.
  The size of such a storage can be increased:
  - For Intel Cascade Lake: Only in 100 GB increments.
  - For Intel Ice Lake: Only in 368 GB increments.
  For clusters with hosts residing in the ru-central1-d availability zone, local SSD storage is not available if using Intel Cascade Lake.
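The increment rules in the list can be sketched as a small helper that finds the smallest size you can actually grow a storage to. This is an illustration only: the function name and the dictionary keys for local SSDs are hypothetical, and the sketch assumes that disk types with no stated increment accept any larger size.

```python
# Increment sizes in GB, taken from the disk type list.
# The "local-ssd/..." keys are hypothetical labels: the text names the
# platforms (Intel Cascade Lake, Intel Ice Lake) but gives no identifiers.
SIZE_STEP_GB = {
    "network-ssd-nonreplicated": 93,
    "network-ssd-io-m3": 93,
    "local-ssd/cascade-lake": 100,
    "local-ssd/ice-lake": 368,
}


def next_valid_size(current_gb: int, target_gb: int, disk_key: str) -> int:
    """Return the smallest reachable size that is at least target_gb."""
    if target_gb <= current_gb:
        # Nothing to increase.
        return current_gb
    step = SIZE_STEP_GB.get(disk_key)
    if step is None:
        # Assumption: no increment restriction is stated for this disk type.
        return target_gb
    # Ceiling division in integers: number of whole steps needed.
    steps = -(-(target_gb - current_gb) // step)
    return current_gb + steps * step


# 186 GB plus one 93 GB step gives 279 GB.
print(next_valid_size(186, 250, "network-ssd-nonreplicated"))
```

Working in whole steps from the current size follows the wording of the list, which restricts how a storage is increased rather than which absolute sizes are allowed.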
Note
Up to 5% of disk space is reserved for system use, so the disks may have less available space than indicated when creating a cluster.
For more information about sizes and performance of different disk types, see the Yandex Compute Cloud documentation.
In a Greenplum® cluster, the type of disks for master hosts and segment hosts may differ.
Note
When using standard Intel Ice Lake hosts, access to local SSD storage is provided on request. To request access, contact support.
Specifics of local SSD storage
Local SSDs do not provide fault-tolerant storage and impact the cost of the entire cluster: you are charged for a cluster with this type of storage even if it is stopped. You can find more information in the pricing policy.
Disk space management
If any host storage is more than 95% full, the cluster automatically switches to read-only mode: every database gets the DEFAULT_TRANSACTION_READ_ONLY = TRUE setting, applied through an ALTER DATABASE query.
In this mode, INSERT, DELETE, and UPDATE queries fail with an error.
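As a rough illustration of the rule above, the following sketch checks which hosts are past the 95% mark. The function names, the host names, and the input format (used and total bytes per host) are hypothetical; how you collect these numbers is outside the scope of this sketch.

```python
READ_ONLY_THRESHOLD_PERCENT = 95  # threshold stated in the text


def hosts_over_threshold(usage: dict[str, tuple[int, int]]) -> list[str]:
    """Return hosts whose storage is more than 95% full.

    usage maps a host name to (used_bytes, total_bytes).
    """
    over = []
    for host, (used, total) in usage.items():
        # Strict comparison ("more than 95%"), done in integers
        # to avoid floating-point rounding at the boundary.
        if used * 100 > total * READ_ONLY_THRESHOLD_PERCENT:
            over.append(host)
    return over


def would_switch_to_read_only(usage: dict[str, tuple[int, int]]) -> bool:
    """One overfilled host is enough to affect the whole cluster."""
    return bool(hosts_over_threshold(usage))


# Hypothetical host names and sizes, in bytes.
sample = {"segment-1": (96, 100), "segment-2": (50, 100)}
print(hosts_over_threshold(sample))
```

The check is per host, not per cluster average, because the text says the switch happens when any host storage crosses the threshold.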
Monitoring the transition to read-only mode
To monitor storage usage on cluster hosts, configure alerts in Yandex Monitoring:
- Navigate to the folder dashboard and select Monitoring.
- Select Yandex MPP Analytics for PostgreSQL.
- Create an alert with the following settings:
  - Metric: Configure the following metric settings:
    - Cloud
    - Folder
    - Yandex MPP Analytics for PostgreSQL service
    - Greenplum® cluster ID
      You can get the cluster ID from the folder’s cluster list.
    - disk.free_bytes label
  - Condition: Define the Less than or equals condition for disk usage percentage that will trigger the alert:
    - 95% of the storage size for Alarm
    - 90% of the storage size for Warning
  - Advanced settings:
    - Aggregation function: Minimum (metric’s minimum value over the period).
    - Evaluation window: Preferred metric update period.
  - Add the notification channel you created earlier.
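The alert settings combine a free-space metric (disk.free_bytes) with thresholds expressed as a share of used storage. One possible reading, which is an assumption and not confirmed by the text, is that the alert should fire once free space drops to 5% (Alarm) or 10% (Warning) of the storage size. Under that assumption, the byte values to compare against could be derived as follows; the function name is hypothetical.

```python
GIB = 1024 ** 3


def free_bytes_thresholds(
    storage_bytes: int,
    alarm_used_percent: int = 95,
    warning_used_percent: int = 90,
) -> dict[str, int]:
    """Translate "percent used" thresholds into free-byte thresholds.

    Assumption: the alert compares disk.free_bytes with a value in bytes
    using "less than or equals", so 95% used means 5% free.
    """
    return {
        "alarm": storage_bytes * (100 - alarm_used_percent) // 100,
        "warning": storage_bytes * (100 - warning_used_percent) // 100,
    }


# For a 100 GiB storage: Alarm at 5 GiB free, Warning at 10 GiB free.
limits = free_bytes_thresholds(100 * GIB)
print(limits["alarm"] // GIB, limits["warning"] // GIB)
```

If your alert form expects percentages or another unit, these numbers do not apply; check what the threshold fields accept before using them.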
Recovering a cluster from read-only mode
If the cluster switched to read-only mode:
- Increase the storage size so that usage drops below the 95% threshold. Yandex Cloud will then disable read-only mode automatically.
- Disable read-only mode manually and free up storage space by deleting some data.
Alert
When doing so, make sure the amount of free disk space never reaches zero. Otherwise, with the fail-safe mechanism disabled, Greenplum® will crash, rendering the cluster inoperable.
To disable read-only mode manually, contact support or follow these steps:
- Connect to the database using any method of your choice.
- Start a transaction and run the following statement within it:
  SET LOCAL transaction_read_only TO off;
- In the same transaction, clean up the data you no longer need using the DROP or TRUNCATE statements. Avoid the DELETE statement because it marks rows as deleted without physically purging them from the database.
- Commit the transaction and restart all database connections.
For example, to remove the ExcessDataTable1 table you no longer need, use the following transaction:

    BEGIN;
    SET LOCAL transaction_read_only TO off;
    DROP TABLE ExcessDataTable1;
    COMMIT;
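When several tables need to go, the same transaction can be generated rather than typed by hand. The sketch below only builds the SQL text and does not connect to a database. The function name is hypothetical, and as a simplifying assumption it accepts only plain unquoted table names, rejecting anything else instead of trying to quote it.

```python
import re

# Plain unquoted identifiers only; schema-qualified or quoted names
# are rejected in this sketch rather than escaped.
_SIMPLE_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_cleanup_transaction(drop=(), truncate=()) -> str:
    """Build a transaction that lifts read-only mode and frees space."""
    drop, truncate = list(drop), list(truncate)
    if not drop and not truncate:
        raise ValueError("nothing to clean up")
    for name in drop + truncate:
        if not _SIMPLE_IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported table name: {name!r}")

    lines = ["BEGIN;", "SET LOCAL transaction_read_only TO off;"]
    # DROP and TRUNCATE only, in line with the advice to avoid DELETE.
    lines += [f"DROP TABLE {name};" for name in drop]
    lines += [f"TRUNCATE TABLE {name};" for name in truncate]
    lines.append("COMMIT;")
    return "\n".join(lines)


print(build_cleanup_transaction(drop=["ExcessDataTable1"]))
```

Review the generated statements before running them: DROP and TRUNCATE are irreversible once the transaction is committed.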
Use cases
- Loading data from Yandex Object Storage to Yandex MPP Analytics for PostgreSQL using Yandex Data Transfer
- Exporting Greenplum® data to a cold storage in Yandex Object Storage
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.