Storage in Managed Service for ClickHouse®
Managed Service for ClickHouse® allows you to use network and local storage drives for database clusters. Network drives are based on network blocks, i.e., virtual disks in the Yandex Cloud infrastructure. Local drives are physically located on the database host servers.
When creating a cluster, you can select the following disk types for data storage:
- Network HDDs (network-hdd): Most cost-effective option for clusters with low requirements for read and write performance.
- Network SSDs (network-ssd): Compromise solution: slower than local SSDs, but network SSDs ensure data integrity in the event of Yandex Cloud hardware failure.
- Non-replicated SSDs (network-ssd-nonreplicated): Network disks with higher performance achieved by eliminating redundancy.
  You can only expand this type of storage in 93 GB increments.
- Ultra high-speed network SSDs with three replicas (network-ssd-io-m3): Network disks that deliver performance equivalent to non-replicated SSDs while ensuring redundancy.
  You can only increase the size of these disks in 93 GB increments.
- Local SSDs (local-ssd): Highest-performing disks.
  You can expand this type of storage as follows:
  - For Intel Broadwell and Intel Cascade Lake: Only in 100 GB increments.
  - For Intel Ice Lake and AMD Zen 4: Only in 368 GB increments.
  For clusters with hosts residing in the ru-central1-d availability zone, local SSD storage is not available if using Intel Cascade Lake.
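The increment rules above can be checked with simple arithmetic. The query below is a minimal sketch, not an official sizing tool: it assumes that a valid disk size is a whole multiple of the increment, and the 250 GB request is an illustrative value only.

```sql
-- Round a requested size up to the nearest allowed increment.
-- Assumption: valid sizes are whole multiples of the increment.
WITH 250 AS requested_gb  -- illustrative value
SELECT
    toUInt32(ceil(requested_gb / 93) * 93)   AS nonreplicated_or_io_m3_gb,       -- 279
    toUInt32(ceil(requested_gb / 100) * 100) AS local_ssd_broadwell_cascade_gb,  -- 300
    toUInt32(ceil(requested_gb / 368) * 368) AS local_ssd_icelake_zen4_gb;       -- 368
```

The query uses only literals, so it can be run on any ClickHouse® host without touching your data.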
Note
Up to 5% of disk space is reserved for system use, so the disks may have less available space than indicated when creating a cluster.
For more information about sizes and performance of different disk types, see the Yandex Compute Cloud documentation.
Hybrid storage features
If you enable the Hybrid storage setting when creating or updating a cluster, you will be able to distribute data between the cluster storage and object storage in Yandex Object Storage. Thus your data will be placed in either a cluster or object storage, depending on the storage policy you specify. For example, you can choose to store your frequently used (hot) data in the cluster storage and rarely used (cold) data in the less expensive and slower object storage.
Warning
Hybrid storage is only available for tables of the MergeTree family.
The object storage uses a service bucket with unlimited storage capacity. Its storage class is standard, and you cannot change it. The Object Storage limits apply to the object storage.
To start using hybrid storage:
- Create a cluster of the appropriate type. You do not need to configure the object storage.
- Add databases and tables to the cluster. If the default storage policy does not work for certain tables, set the appropriate policies for these tables:
  - To set a policy when creating a table, specify the storage_policy setting:
    CREATE TABLE table_with_non_default_policy (
        <table_schema>
    )
    ENGINE = MergeTree
    ...
    SETTINGS storage_policy = '<storage_policy_type>';
  - To create or update the policy for an existing table, run this query:
    ALTER TABLE table_with_non_default_policy MODIFY SETTING storage_policy = '<storage_policy_type>';
- See our example in Using hybrid storage.
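After assigning policies, you may want to confirm which policy each table actually uses. The query below is a sketch that reads the standard system.tables table; the filter on the current database and on MergeTree engines is an assumption about what you want to inspect.

```sql
-- List MergeTree-family tables in the current database with their storage policy.
SELECT
    database,
    name AS table_name,
    engine,
    storage_policy
FROM system.tables
WHERE database = currentDatabase()
  AND engine LIKE '%MergeTree'
ORDER BY name;
```

Tables that should keep all data in the cluster storage are expected to show the local policy here, and tables left untouched should show default.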
To monitor the amount of space that MergeTree table data parts take up in the object storage, use the ch_s3_disk_parts_size metric in Yandex Monitoring. It is only available for Managed Service for ClickHouse® clusters with hybrid storage.
Storing cold data and its backups in hybrid storage counts towards the total cluster usage cost.
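In addition to the monitoring metric mentioned above, you can get a per-table breakdown directly from ClickHouse®. This is a sketch based on the standard system.parts table; the disk names it returns depend on your cluster configuration and are not assumed here.

```sql
-- Space taken by active data parts, grouped by table and disk.
SELECT
    database,
    table,
    disk_name,
    count() AS parts,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE active
GROUP BY database, table, disk_name
ORDER BY sum(bytes_on_disk) DESC;
```

Rows whose disk_name refers to the object storage disk show how much cold data each table contributes to the storage cost.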
Available storage policies
Note
You cannot create new storage policies or update the existing ones.
A Managed Service for ClickHouse® cluster with enabled hybrid storage supports the following storage policies:
- default: The cluster automatically manages data placement depending on the following:
  - Hybrid storage settings.
  - Table TTL (time-to-live) settings.
  If there is enough free space in the cluster storage, only the rows with the expired TTL are moved to the object storage. Thus you can move the data to the object storage in parts before your cluster storage is filled up.
  You can configure moving rows with the expired TTL to the object storage and set the TTL value when creating a table or later on.
- local: For a table with this policy, rows can only be placed in the cluster storage. No data is moved between the storages.
- object_storage: For a table with this policy, rows can only be placed in the object storage. No data is moved between the storages.
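For the default policy, moving rows with an expired TTL can be sketched as follows. The table name, schema, and TTL intervals are hypothetical, and the disk name 'object_storage' is an assumption about how the object storage disk is named in the cluster; check the disks column of system.storage_policies before using it.

```sql
-- Hypothetical table: rows older than 90 days become eligible
-- for moving to the object storage disk.
-- Assumption: the object storage disk is named 'object_storage'.
CREATE TABLE events_hybrid
(
    event_date Date,
    user_id    UInt64,
    payload    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id)
TTL event_date + INTERVAL 90 DAY TO DISK 'object_storage'
SETTINGS storage_policy = 'default';

-- Change the TTL later on for the existing table.
ALTER TABLE events_hybrid
    MODIFY TTL event_date + INTERVAL 180 DAY TO DISK 'object_storage';
```

The TTL clause uses standard ClickHouse® syntax (TTL <expression> TO DISK '<disk_name>'); only the names and intervals are illustrative.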
Storage policies do not affect merge operations. However, you can change the behavior of these operations using the settings available in the cluster:
- Enable or disable the prefer_not_to_merge setting, which controls whether stored data parts are merged. This setting is available in the CLI and API.
- Set any max_data_part_size_bytes value for the maximum size of a data part produced by merging smaller ones.
To view current policy settings, run this query:
SELECT *
FROM system.storage_policies;
For more information about storage policies and their settings, see this ClickHouse® guide.
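If the full SELECT * output is too wide, a narrower query can focus on the settings discussed in this section. This is a sketch; the exact set of columns in system.storage_policies may vary between ClickHouse® versions.

```sql
-- Show only the policy fields related to data placement and merges.
SELECT
    policy_name,
    volume_name,
    disks,
    max_data_part_size,
    move_factor,
    prefer_not_to_merge
FROM system.storage_policies
ORDER BY policy_name, volume_priority;
```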
Hybrid storage settings
A Managed Service for ClickHouse® cluster with enabled hybrid storage has the following settings:
- data_cache_enabled: Enables caching data accessed from the object storage in the cluster storage. The default value is true (enabled).
  In this case, cold data accessed from the object storage is moved to high-speed drives for faster processing.
- data_cache_max_size: Sets the maximum cache size, in bytes, allocated in the cluster storage for data accessed from the object storage. If no value is set, the maximum cache size defaults to half the size of the cluster storage.
- move_factor: Sets the minimum share of free space in the cluster storage. If the share of free space drops below this value, data is moved to Yandex Object Storage. The minimum value is 0, the maximum value is 1, and the default value is 0.01.
  Data parts are queued up in descending order by size. The number of data parts that will be moved is determined by the move_factor condition.
- prefer_not_to_merge: Disables merging of data parts in cluster and object storages. Merges are enabled by default.
  Once inserted into the table, the data is saved as a data part sorted by its primary key. Then ClickHouse® runs a background merge of data parts belonging to the same partition into a larger data part within 10 to 15 minutes after the insertion. You can use the system.parts system table to view the merged data parts and partitions.
You can specify hybrid storage settings when creating or updating a cluster.
For more information about setting up hybrid storage, see this ClickHouse® guide.
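To see how close a disk is to the move_factor condition described above, you can compare its share of free space with the configured value. The query below is a sketch that uses the standard system.disks table and hard-codes the default move_factor of 0.01; substitute your own value if you changed it.

```sql
-- Share of free space per disk, compared with the default move_factor (0.01).
SELECT
    name,
    formatReadableSize(free_space)        AS free,
    formatReadableSize(total_space)       AS total,
    round(free_space / total_space, 4)    AS free_fraction,
    free_fraction < 0.01                  AS below_default_move_factor
FROM system.disks;
```

The comparison is only meaningful for the cluster storage disk; the object storage disk typically reports a very large capacity.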
Selecting the disk type when creating a cluster
The number of hosts you can create together with a ClickHouse® cluster depends on the selected disk type:
- With local SSD storage (local-ssd), you can create a cluster with two or more hosts.
  This cluster will be fault-tolerant.
  Storage on local SSDs increases your cluster costs: you pay for the cluster even if it is stopped. For more information, see the pricing policy.
- With non-replicated network SSD storage (network-ssd-nonreplicated), you can create a cluster with three or more hosts.
  This cluster will be fault-tolerant.
- You can add any number of hosts within the current quota when using the following disk types:
  - Network HDDs (network-hdd)
  - Network SSDs (network-ssd)
  - Ultra high-speed network SSDs with three replicas (network-ssd-io-m3)
For more information about limits on the number of hosts per cluster, see Quotas and limits.
Disk encryption
When creating or restoring a cluster from a backup, you can encrypt the storage disk with a custom KMS key. To encrypt a disk of an already created cluster, disable encryption, or encrypt a disk with a different key, create a backup of the cluster and restore it with the new settings.
Warning
Encryption is not available for local disks (local-hdd and local-ssd).
To create an encrypted disk, you need the kms.keys.user role or higher.
If you deactivate the key used to encrypt a disk, access to the data will be suspended until you reactivate the key.
Alert
If you delete the key used to encrypt a disk or its version, you will irrevocably lose access to your data. For more information, see this Key Management Service article.
Managing disk space
In Managed Service for ClickHouse®, if your storage runs out of space, INSERT queries, background merges, and mutations are suspended. They will resume automatically after you expand the storage.
To monitor your storage utilization, set up alerts in Yandex Monitoring.
Automatic storage expansion
To prevent the disk from running out of free space, which suspends INSERT queries, background merges, and mutations, set up automatic storage expansion for ClickHouse® and for the coordination service (ClickHouse® Keeper or ZooKeeper). The storage is then expanded when usage reaches a preset threshold, expressed as a percentage of the total storage size. There are two thresholds:
- Scheduled expansion threshold: To schedule such an expansion, an algorithm analyzes data from the last few hours and estimates how quickly the storage is filling up. If the calculations show that the specified threshold will be exceeded by the start of the nearest maintenance window, the system schedules a storage expansion. When initiating maintenance, the system will check whether the threshold is indeed exceeded, and if so, expand the storage.
- Immediate expansion threshold: When reached, the storage expands immediately.
You can use either one or both thresholds. If you set both, make sure the immediate expansion threshold is not lower than the scheduled one.
For a scheduled expansion, you need to set up a maintenance window schedule.
Upon reaching the threshold, the storage will be expanded depending on the disk type:
- For network HDDs and SSDs, by the higher of the two values: 20 GB or 20% of the current disk size.
- For non-replicated SSDs and ultra high-speed network SSDs with three replicas, by 93 GB.
- For local SSDs:
  - In an Intel Broadwell or Intel Cascade Lake cluster, by 100 GB.
  - In an Intel Ice Lake cluster, by 368 GB.
If the threshold is reached again, the storage will be automatically expanded until it reaches the specified maximum. After that, you can set a new maximum storage size.
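The expansion step for network HDDs and SSDs listed above (the higher of 20 GB or 20% of the current size) can be worked through with a short query. This is an arithmetic sketch only: the disk sizes are illustrative, and it does not account for the maximum storage size you configure.

```sql
-- One automatic expansion step for network HDDs and SSDs.
SELECT
    current_gb,
    greatest(20, current_gb * 20 / 100) AS step_gb,      -- 20, 20, 100
    current_gb + step_gb                AS new_size_gb   -- 70, 120, 600
FROM
(
    SELECT arrayJoin([50, 100, 500]) AS current_gb  -- illustrative sizes
);
```

For disks up to 100 GB the fixed 20 GB step applies; above that, the 20% rule gives the larger step.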
The automatic storage size increase settings for ClickHouse® apply to all existing shards. New shards will use the settings of the oldest shard.
You can configure automatic storage expansion when creating or updating a cluster.
Warning
- You cannot decrease the storage size.
- When scaling your storage, the cluster hosts will be unavailable.