Managing shards in a ClickHouse® cluster
You can enable sharding for a cluster, as well as add and configure individual shards.
Enabling sharding
Managed Service for ClickHouse® clusters are created with a single shard. To start sharding data, add one or more shards and create a distributed table.
Creating a shard
The number of shards in Managed Service for ClickHouse® clusters is limited by the CPU and RAM quotas available to database clusters in your cloud. To review current resource usage, open the Quotas
You can create multiple shards in a cluster in one go.
- In the management console, select the folder the cluster is in.
- Navigate to the Managed Service for ClickHouse service.
- Click the cluster name and navigate to the Shards tab.
- Click Create shards.
- Click the icon next to the new shard to update its parameters:
  - Name and weight.
  - Configuration of shard hosts.
- Optionally, click Add shard to add more shards and specify their parameters.
- Optionally, click Add host to add more hosts and specify their parameters.
- To copy the schema from a random replica of one of the shards to the hosts of the new shards, select Copy the data schema.
  Warning
  Use data schema copying only if the schema is the same on all the cluster shards.
- Click Create shard.
If you do not have the Yandex Cloud CLI yet, install and initialize it.
The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.
To create one or multiple shards:
- View the description of the CLI command for creating shards:
  yc managed-clickhouse shards add --help
- Run the command to create shards.
  Specify one or multiple --shard parameters in the command, one for each new shard.
  Here is an example of the command for creating a single shard (it does not use all available parameters):
  yc managed-clickhouse shards add \
    --cluster-name=<cluster_name> \
    --shard name=<new_shard_name>,weight=<shard_weight> \
    --host zone-id=<availability_zone>,subnet-name=<subnet_name>,shard-name=<shard_name> \
    --copy-schema
  Where:
  - --cluster-name: Cluster name.
    You can get the cluster name from the list of clusters in your folder.
  - --shard: Shard parameters:
    - name: Shard name. It must be unique within the cluster.
      It may contain Latin letters, numbers, hyphens, and underscores. The name may be up to 63 characters long.
    - weight: Shard weight.
  - --host: Parameters of the host to add to the shard:
    - zone-id: Availability zone.
    - subnet-name: Subnet name.
    - shard-name: Name of the shard to add the host to.
  - --copy-schema: Optional parameter that initiates copying of the data schema from a random replica of one of the shards to the hosts of the new shard.
    Warning
    Use data schema copying only if the schema is the same on all the cluster shards.
- Open the current Terraform configuration file describing your infrastructure.
  For more on how to create this file, see Creating a cluster.
- To the Managed Service for ClickHouse® cluster description, add one or more shard sections and host sections of the CLICKHOUSE type with shard_name specified:
  resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
    ...
    shard {
      name   = "<shard_name>"
      weight = <shard_weight>
    }
    host {
      type       = "CLICKHOUSE"
      zone       = "<availability_zone>"
      subnet_id  = yandex_vpc_subnet.<subnet_in_availability_zone>.id
      shard_name = "<shard_name>"
    }
  }
- Optionally, to copy the schema from a random replica of one of the shards to the hosts of the new shards, add copy_schema_on_new_hosts set to true to the cluster description.
  Warning
  Use data schema copying only if the schema is the same on all the cluster shards.
- Make sure the settings are correct.
  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
  - Run this command:
    terraform validate
    Terraform will show any errors found in your configuration files.
- Confirm updating the resources.
  - Run this command to view the planned changes:
    terraform plan
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:
      terraform apply
    - Confirm updating the resources.
    - Wait for the operation to complete.
For more information, see this Terraform provider guide.
Timeouts
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring from a backup: 60 minutes.
- Updating a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the timeout are aborted.
How do I change these limits?
Add a timeouts section to the cluster description, e.g.:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.AddShards method, e.g., via the following cURL request:
  - Create a file named body.json and paste the following code into it:
    {
      "shardSpecs": [
        {
          "name": "<shard_name>",
          "configSpec": {
            "clickhouse": {
              "resources": {
                "resourcePresetId": "<host_class>",
                "diskSize": "<storage_size_in_bytes>",
                "diskTypeId": "<disk_type>"
              },
              "weight": "<shard_weight>"
            }
          }
        }
      ],
      "hostSpecs": [
        {
          "zoneId": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnetId": "<subnet_ID>",
          "assignPublicIp": <public_access_to_host>,
          "shardName": "<shard_name>"
        }
      ],
      "copySchema": <copying_data_schema>
    }
    Where:
    - shardSpecs: Settings of shards to add to the cluster as an array of elements, one per shard. Each element has the following structure:
      - name: Shard name.
      - configSpec.clickhouse.resources: Host resources to add to the new shard:
        - resourcePresetId: Host class ID. You can get the list of available host classes with their IDs using the ResourcePreset.List method.
        - diskSize: Disk size, in bytes.
        - diskTypeId: Disk type.
      - configSpec.clickhouse.weight: Shard weight.
        By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, data will be distributed across the shards according to their weights.
        To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
        For more information, see this ClickHouse® guide.
    - hostSpecs: Settings of hosts to add to the shard as an array of elements, one per host. Each element has the following structure:
      - zoneId: Availability zone.
      - type: Host type. You can only add CLICKHOUSE hosts to your shards.
      - subnetId: Subnet ID.
      - assignPublicIp: Controls whether the host is accessible via a public IP address, true or false.
      - shardName: Shard name.
    - copySchema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
      Warning
      Use data schema copying only if the schema is the same on all the cluster shards.
  - Run this query:
    curl \
      --request POST \
      --header "Authorization: Bearer $IAM_TOKEN" \
      --header "Content-Type: application/json" \
      --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards:batchCreate' \
      --data '@body.json'
    You can get the cluster ID with the list of clusters in the folder.
- Check the server response to make sure your request was successful.
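The body.json template above can also be generated programmatically. The following is a minimal sketch, not an official client: the function name build_add_shards_body and all example values are hypothetical, and reusing the CLI naming rule (Latin letters, numbers, hyphens, underscores, up to 63 characters) to validate the API shard name is an assumption. It only builds and prints the JSON; it does not call the API.

```python
import json
import re

# Naming rule quoted in the CLI section; applying it to the API body is an assumption.
SHARD_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,63}")


def build_add_shards_body(shard_name, host_class, disk_size_bytes, disk_type,
                          zone_id, subnet_id, weight=1,
                          assign_public_ip=False, copy_schema=False):
    """Return a dict shaped like the body.json template (one shard, one host)."""
    if not SHARD_NAME_RE.fullmatch(shard_name):
        raise ValueError(f"invalid shard name: {shard_name!r}")
    return {
        "shardSpecs": [{
            "name": shard_name,
            "configSpec": {
                "clickhouse": {
                    "resources": {
                        "resourcePresetId": host_class,
                        # The template quotes these values, so they are sent as strings.
                        "diskSize": str(disk_size_bytes),
                        "diskTypeId": disk_type,
                    },
                    "weight": str(weight),
                }
            },
        }],
        "hostSpecs": [{
            "zoneId": zone_id,
            "type": "CLICKHOUSE",  # only CLICKHOUSE hosts can be added to shards
            "subnetId": subnet_id,
            "assignPublicIp": assign_public_ip,
            "shardName": shard_name,
        }],
        "copySchema": copy_schema,
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    body = build_add_shards_body(
        shard_name="shard2",
        host_class="<host_class>",
        disk_size_bytes=10 * 1024 ** 3,
        disk_type="<disk_type>",
        zone_id="<availability_zone>",
        subnet_id="<subnet_ID>",
    )
    print(json.dumps(body, indent=2))
```

Redirecting the printed output to body.json yields a file that can be passed to the cURL command with --data '@body.json'.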
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume that the repository contents reside in the ~/cloudapi/ directory.
- Call the ClusterService.AddShards method, e.g., via the following gRPCurl request:
  - Create a file named body.json and paste the following code into it:
    {
      "cluster_id": "<cluster_ID>",
      "shard_specs": [
        {
          "name": "<shard_name>",
          "config_spec": {
            "clickhouse": {
              "resources": {
                "resource_preset_id": "<host_class>",
                "disk_size": "<storage_size_in_bytes>",
                "disk_type_id": "<disk_type>"
              },
              "weight": "<shard_weight>"
            }
          }
        }
      ],
      "host_specs": [
        {
          "zone_id": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnet_id": "<subnet_ID>",
          "assign_public_ip": <public_access_to_host>,
          "shard_name": "<shard_name>"
        }
      ],
      "copy_schema": <copying_data_schema>
    }
    Where:
    - cluster_id: Cluster ID. You can get it from the list of clusters in the folder.
    - shard_specs: Settings of shards to add to the cluster as an array of elements, one per shard. Each element has the following structure:
      - name: Shard name.
      - config_spec.clickhouse.resources: Host resources to add to the new shard:
        - resource_preset_id: Host class ID. You can get the list of available host classes with their IDs using the ResourcePresetService.List method.
        - disk_size: Disk size, in bytes.
        - disk_type_id: Disk type.
      - config_spec.clickhouse.weight: Shard weight.
        By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, data will be distributed across the shards according to their weights.
        To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
        For more information, see this ClickHouse® article.
    - host_specs: Settings of hosts to add to the shard as an array of elements, one per host. Each element has the following structure:
      - zone_id: Availability zone.
      - type: Host type. You can only add CLICKHOUSE hosts to your shards.
      - subnet_id: Subnet ID.
      - assign_public_ip: Controls whether the host is accessible via a public IP address, true or false.
      - shard_name: Shard name.
    - copy_schema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
      Warning
      Use data schema copying only if the schema is the same on all the cluster shards.
  - Run this query:
    grpcurl \
      -format json \
      -import-path ~/cloudapi/ \
      -import-path ~/cloudapi/third_party/googleapis/ \
      -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
      -rpc-header "Authorization: Bearer $IAM_TOKEN" \
      -d @ \
      mdb.api.cloud.yandex.net:443 \
      yandex.cloud.mdb.clickhouse.v1.ClusterService.AddShards \
      < body.json
- Check the server response to make sure your request was successful.
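The shard weight arithmetic described in the API parameter notes above (sum all weights, then divide each shard's weight by the total) can be sketched in a few lines. This is only an illustration of that formula; the function name shard_priorities and the shard names are hypothetical, and the sketch says nothing about how ClickHouse® routes individual rows.

```python
from fractions import Fraction


def shard_priorities(weights):
    """Map each shard name to weight / sum(weights) as an exact fraction."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one shard must have a positive weight")
    return {name: Fraction(weight, total) for name, weight in weights.items()}


if __name__ == "__main__":
    # The example from the text: weights 1 and 3 give priorities 1/4 and 3/4.
    for name, share in shard_priorities({"shard1": 1, "shard2": 3}).items():
        print(f"{name}: {share}")
```

With the default weight of 1 on every shard, each of N shards gets a priority of 1/N, so data is spread evenly.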
Getting a list of cluster shards
- In the management console, select the folder the cluster is in.
- Navigate to the Managed Service for ClickHouse service.
- Click the name of your cluster and select the Shards tab.
If you do not have the Yandex Cloud CLI yet, install and initialize it.
The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.
To get a list of cluster shards, run this command:
yc managed-clickhouse shards list --cluster-name=<cluster_name>
You can get the cluster name with the list of clusters in the folder.
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.ListShards method, e.g., via the following cURL request:
  curl \
    --request GET \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards'
  You can get the cluster ID with the list of clusters in the folder.
- Check the server response to make sure your request was successful.
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume that the repository contents reside in the ~/cloudapi/ directory.
- Call the ClusterService.ListShards method, e.g., via the following gRPCurl request:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{ "cluster_id": "<cluster_ID>" }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.ListShards
  You can get the cluster ID with the list of clusters in the folder.
- Check the server response to make sure your request was successful.
Updating a shard
You can edit the shard weight as well as the host class, disk type, and storage size.
Warning
When you change the disk type, the cluster hosts get recreated. The system automatically saves the replicated tables data. The non-replicated tables data will be lost. Before changing the disk type, either convert non-replicated tables to replicated ones or create a backup.
Note
To change the disk type to local-ssd, contact support.
- In the management console, select the folder the cluster is in.
- Navigate to the Managed Service for ClickHouse service.
- Click the name of your cluster and select the Shards tab.
- Click the icon next to the shard you want to update and select Edit.
If you do not have the Yandex Cloud CLI yet, install and initialize it.
The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.
To update a shard in a cluster:
- View the description of the CLI command for updating a shard:
  yc managed-clickhouse shards update --help
- Provide the parameters you want to edit to the command:
  yc managed-clickhouse shards update <shard_name> \
    --cluster-name <cluster_name> \
    --weight <shard_weight> \
    --clickhouse-resource-preset <host_class> \
    --clickhouse-disk-size <storage_size> \
    --clickhouse-disk-type <disk_type>
  Where:
  - --cluster-name: Cluster name. You can get it with the list of clusters in the folder.
  - --weight: Shard weight. The minimum value is 0.
  - --clickhouse-resource-preset: Host class.
  - --clickhouse-disk-size: Storage size, in GB.
  - --clickhouse-disk-type: Disk type.
  You can get the shard name with the list of shards in the cluster.
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.UpdateShard method, e.g., via the following cURL request:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the updateMask parameter as a single comma-separated string.
  curl \
    --request PATCH \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --header "Content-Type: application/json" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>' \
    --data '{
              "updateMask": "configSpec.clickhouse.config.<ClickHouse®_setup>,configSpec.clickhouse.resources,configSpec.clickhouse.weight",
              "configSpec": {
                "clickhouse": {
                  "config": {
                    <ClickHouse®_settings>
                  },
                  "resources": {
                    "resourcePresetId": "<host_class>",
                    "diskSize": "<storage_size_in_bytes>",
                    "diskTypeId": "<disk_type>"
                  },
                  "weight": "<shard_weight>"
                }
              }
            }'
  Where:
  - updateMask: Comma-separated list of settings you want to update.
  - configSpec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resourcePresetId: Host class ID. You can get the list of available host classes with their IDs using the ResourcePreset.List method.
      - diskSize: Disk size, in bytes.
      - diskTypeId: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, data will be distributed across the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® article.
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- Check the server response to make sure your request was successful.
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume that the repository contents reside in the ~/cloudapi/ directory.
- Call the ClusterService.UpdateShard method, e.g., via the following gRPCurl request:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the update_mask parameter as an array of paths[] strings.
  Format for listing settings
  "update_mask": {
    "paths": [
      "<setting_1>",
      "<setting_2>",
      ...
      "<setting_N>"
    ]
  }
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>",
          "update_mask": {
            "paths": [
              "config_spec.clickhouse.config.<ClickHouse®_setup>",
              "config_spec.clickhouse.resources",
              "config_spec.clickhouse.weight"
            ]
          },
          "config_spec": {
            "clickhouse": {
              "config": {
                <ClickHouse®_settings>
              },
              "resources": {
                "resource_preset_id": "<host_class>",
                "disk_size": "<storage_size_in_bytes>",
                "disk_type_id": "<disk_type>"
              },
              "weight": "<shard_weight>"
            }
          }
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.UpdateShard
  Where:
  - update_mask: List of settings to update as an array of strings (paths[]).
  - config_spec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resource_preset_id: Host class ID. You can get the list of available host classes with their IDs using the ResourcePresetService.List method.
      - disk_size: Disk size, in bytes.
      - disk_type_id: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, data will be distributed across the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® article.
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- Check the server response to make sure your request was successful.
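The two update examples express the same mask in different shapes: the REST request takes one comma-separated string of camelCase paths, while the gRPC request takes a paths[] array of snake_case paths. The sketch below converts the first form into the second. The helper names are hypothetical, and the plain camelCase-to-snake_case rule is an assumption that matches the field names shown here (configSpec, resourcePresetId); individual ClickHouse® setting names under config may not follow it.

```python
import re


def camel_to_snake(segment):
    """Convert one path segment, e.g. 'configSpec' -> 'config_spec'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", segment).lower()


def rest_mask_to_grpc_paths(update_mask):
    """Turn a REST updateMask string into a gRPC-style update_mask object."""
    paths = []
    for path in update_mask.split(","):
        path = path.strip()
        if path:  # skip empty entries such as a trailing comma
            paths.append(".".join(camel_to_snake(s) for s in path.split(".")))
    return {"paths": paths}


if __name__ == "__main__":
    mask = "configSpec.clickhouse.resources,configSpec.clickhouse.weight"
    print(rest_mask_to_grpc_paths(mask))
```

Listing only the paths you intend to change is what keeps the remaining shard parameters from being reset to their defaults.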
Deleting a shard
You can delete a shard from a ClickHouse® cluster if:
- It is not the only shard.
- It is not the only shard in a shard group.
Deleting a shard will delete all tables and data stored on that shard.
- In the management console, select the folder the cluster is in.
- Navigate to the Managed Service for ClickHouse service.
- Click the name of your cluster and select the Shards tab.
- Delete one or multiple shards:
  - To delete one shard, click the icon in its row and select Delete.
  - To delete multiple shards in one go, select them and click Delete at the bottom of the screen.
- In the window that opens, enable Delete shards and click Delete.
If you do not have the Yandex Cloud CLI yet, install and initialize it.
The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.
To delete one or multiple shards from a cluster, run the command below, providing the names of the shards you want to delete. Use the space character as a separator.
The command for deleting a single shard is as follows:
yc managed-clickhouse shards delete \
--cluster-name=<cluster_name> \
<shard_name>
You can get the shard names from the list of cluster shards, and the cluster name, from the list of clusters in your folder.
- Open the current Terraform configuration file describing your infrastructure.
  For more on how to create this file, see Creating a cluster.
- Delete the relevant shard and host sections from the Managed Service for ClickHouse® cluster description.
- Make sure the settings are correct.
  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
  - Run this command:
    terraform validate
    Terraform will show any errors found in your configuration files.
- Confirm updating the resources.
  - Run this command to view the planned changes:
    terraform plan
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:
      terraform apply
    - Confirm updating the resources: type yes and press Enter.
    - Wait for the operation to complete.
For more information, see this Terraform provider guide.
Timeouts
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring from a backup: 60 minutes.
- Updating a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the timeout are aborted.
How do I change these limits?
Add a timeouts section to the cluster description, e.g.:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.DeleteShards method, e.g., via the following cURL request:
  curl \
    --request DELETE \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards:batchDelete' \
    --data '{
              "shardNames": [
                <list_of_shard_names>
              ]
            }'
  Where shardNames is an array of strings. Each string is the name of a shard to delete. You can get the shard names with the list of shards in the cluster.
  You can get the cluster ID with the list of clusters in the folder.
- Check the server response to make sure your request was successful.
- Get an IAM token for API authentication and put it into an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume that the repository contents reside in the ~/cloudapi/ directory.
- Call the ClusterService.DeleteShards method, e.g., via the following gRPCurl request:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_names": [
            <list_of_shard_names>
          ]
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.DeleteShards
  Where shard_names is an array of strings. Each string is the name of a shard to delete. You can get the shard names with the list of shards in the cluster.
  You can get the cluster ID with the list of clusters in the folder.
- Check the server response to make sure your request was successful.
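Because deleting a shard removes all tables and data stored on it, it can help to check a deletion request locally before sending it. The sketch below builds the shardNames body used in the cURL example and rejects a request that would remove every shard, mirroring the rule that the only shard cannot be deleted. The function name build_delete_shards_body is hypothetical, the shard list is assumed to come from a prior listing call, and the shard group rule is not checked here.

```python
def build_delete_shards_body(existing_shards, to_delete):
    """Return {'shardNames': [...]} or raise if the request is clearly invalid."""
    existing = set(existing_shards)
    unknown = [name for name in to_delete if name not in existing]
    if unknown:
        raise ValueError(f"unknown shards: {unknown}")
    if not existing - set(to_delete):
        raise ValueError("at least one shard must remain in the cluster")
    # dict.fromkeys drops duplicate names while keeping the original order.
    return {"shardNames": list(dict.fromkeys(to_delete))}


if __name__ == "__main__":
    # Hypothetical shard names for illustration only.
    print(build_delete_shards_body(["shard1", "shard2", "shard3"], ["shard2", "shard3"]))
```

For the gRPC variant, the same list would go under shard_names next to cluster_id.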
ClickHouse® is a registered trademark of ClickHouse, Inc.