Managing shards in a ClickHouse® cluster
You can enable sharding for a cluster, as well as add and configure individual shards.
Enabling sharding
Managed Service for ClickHouse® clusters are created with a single shard. To start sharding data, add one or more shards and create a distributed table.
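As an illustration of the second step, the sketch below prints DDL for a distributed table that sits on top of a local table present on every shard. The function name, the database and table names, the '{cluster}' macro, and the rand() sharding key are all assumptions made for this example; adjust them to your cluster before running the statement.

```shell
#!/bin/sh
# Print DDL for a Distributed table that routes reads and writes to a local
# table on every shard. All names below are hypothetical placeholders.
distributed_table_ddl() {
    db="$1"           # database name, e.g. "db1"
    local_table="$2"  # local table that already exists on each shard
    dist_table="$3"   # name of the distributed table to create
    cat <<EOF
CREATE TABLE ${db}.${dist_table} ON CLUSTER '{cluster}'
AS ${db}.${local_table}
ENGINE = Distributed('{cluster}', ${db}, ${local_table}, rand());
EOF
}

distributed_table_ddl db1 hits_local hits_dist
```

The printed statement can then be run with any ClickHouse® client connected to the cluster.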
Creating a shard
The number of shards in Managed Service for ClickHouse® clusters is limited by the CPU and RAM quotas available to DB clusters in your cloud. To check the resources currently in use, open the Quotas
- In the management console, navigate to the folder dashboard and select Managed Service for ClickHouse.
- Click the cluster name and go to the Shards tab.
- Click Create shard.
- Specify the following shard properties:
- Name and weight.
- To copy the schema from a random replica of one of the shards to the hosts of the new shard, select the Copy the data schema option.
- Required number of hosts.
- Click Create shard.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To create a shard, run the command below (our example does not use all available parameters):
yc managed-clickhouse shards add <new_shard_name> \
--cluster-name=<cluster_name> \
--host zone-id=<availability_zone>,subnet-name=<subnet_name>
Where:
- <new_shard_name>: New shard name. It must be unique within the cluster, may contain Latin letters, numbers, hyphens, and underscores, and can be up to 63 characters long.
- --cluster-name: Cluster name. You can get the cluster name with the list of clusters in the folder.
- --host: Host settings:
  - zone-id: Availability zone.
  - subnet-name: Subnet name.
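The naming rule for a new shard can be checked locally before calling the CLI. This is a minimal sketch: is_valid_shard_name is a hypothetical helper, not part of the yc CLI, and it cannot check that the name is unique within the cluster.

```shell
#!/bin/sh
# Return 0 if the name fits the documented rule: 1 to 63 characters drawn
# from Latin letters, digits, hyphens, and underscores.
# Uniqueness within the cluster is NOT checked here.
is_valid_shard_name() {
    name="$1"
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 63 ] || return 1
    case "$name" in
        *[!A-Za-z0-9_-]*) return 1 ;;  # contains a character outside the set
    esac
    return 0
}

is_valid_shard_name "shard-2_eu" && echo "ok"
```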
Note
Terraform does not support specifying shard weight.
- Open the current Terraform configuration file that defines your infrastructure.
  For more information about creating this file, see this guide.
- Add a CLICKHOUSE-type host section with the shard_name field filled in to the Managed Service for ClickHouse® cluster description, or update the existing hosts:
  resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
    ...
    host {
      type       = "CLICKHOUSE"
      zone       = "<availability_zone>"
      subnet_id  = yandex_vpc_subnet.<subnet_in_availability_zone>.id
      shard_name = "<shard_name>"
    }
  }
- Make sure the settings are correct.
  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
  - Run this command:
    terraform validate
    Terraform will show any errors found in your configuration files.
- Confirm updating the resources.
  - Run this command to view the planned changes:
    terraform plan
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:
      terraform apply
    - Confirm updating the resources.
    - Wait for the operation to complete.
For more information, see this Terraform provider article.
Time limits
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.AddShard method and send the following request, e.g., via cURL:
  - Create a file named body.json and paste the following code into it:
    {
      "shardName": "<shard_name>",
      "configSpec": {
        "clickhouse": {
          "resources": {
            "resourcePresetId": "<host_class>",
            "diskSize": "<storage_size_in_bytes>",
            "diskTypeId": "<disk_type>"
          },
          "weight": "<shard_weight>"
        }
      },
      "hostSpecs": [
        {
          "zoneId": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnetId": "<subnet_ID>",
          "assignPublicIp": <public_access_to_host>,
          "shardName": "<shard_name>"
        }
      ],
      "copySchema": <data_schema_copying>
    }
    Where:
    - shardName: Shard name.
    - configSpec.clickhouse.resources: Host resources to add to the new shard:
      - resourcePresetId: Host class ID. You can get the list of available host classes with their IDs using the ResourcePreset.List method.
      - diskSize: Disk size, in bytes.
      - diskTypeId: Disk type.
    - configSpec.clickhouse.weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® article.
    - hostSpecs: Settings of hosts to add to the shard. The settings appear as an array of elements, one for each host. Each element has the following structure:
      - zoneId: Availability zone.
      - type: Host type. You can only add CLICKHOUSE hosts to your shards.
      - subnetId: Subnet ID.
      - assignPublicIp: Internet access to the host via a public IP address, true or false.
      - shardName: Shard name.
    - copySchema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
  - Run this query:
    curl \
      --request POST \
      --header "Authorization: Bearer $IAM_TOKEN" \
      --header "Content-Type: application/json" \
      --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards' \
      --data '@body.json'
    You can get the cluster ID with the list of clusters in the folder.
- View the server response to make sure your request was successful.
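The weight-to-priority arithmetic described for the weight parameter can be reproduced locally when planning weights. The sketch below is only an illustration of that calculation; shard_priorities is a hypothetical helper and does not query the cluster.

```shell
#!/bin/sh
# Print each shard's share of the total weight, one "weight share" pair per
# line, in the order the weights were given. Fails if the total weight is 0.
shard_priorities() {
    printf '%s\n' "$@" | awk '
        { w[NR] = $1; total += $1 }
        END {
            if (total == 0) exit 1
            for (i = 1; i <= NR; i++) printf "%s %.2f\n", w[i], w[i] / total
        }'
}

shard_priorities 1 3
# 1 0.25
# 3 0.75
```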
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.AddShard call and send the following request, e.g., via gRPCurl:
  - Create a file named body.json and paste the following code into it:
    {
      "cluster_id": "<cluster_ID>",
      "shard_name": "<shard_name>",
      "config_spec": {
        "clickhouse": {
          "resources": {
            "resource_preset_id": "<host_class>",
            "disk_size": "<storage_size_in_bytes>",
            "disk_type_id": "<disk_type>"
          },
          "weight": "<shard_weight>"
        }
      },
      "host_specs": [
        {
          "zone_id": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnet_id": "<subnet_ID>",
          "assign_public_ip": <public_access_to_host>,
          "shard_name": "<shard_name>"
        }
      ],
      "copy_schema": <data_schema_copying>
    }
    Where:
    - shard_name: Shard name.
    - config_spec.clickhouse.resources: Host resources to add to the new shard:
      - resource_preset_id: Host class ID. You can get the list of available host classes with their IDs using the ResourcePresetService.List method.
      - disk_size: Disk size, in bytes.
      - disk_type_id: Disk type.
    - config_spec.clickhouse.weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® article.
    - host_specs: Settings of hosts to add to the shard. The settings are represented as an array of elements, one for each host. Each element has the following structure:
      - zone_id: Availability zone.
      - type: Host type. You can only add CLICKHOUSE hosts to your shards.
      - subnet_id: Subnet ID.
      - assign_public_ip: Internet access to the host via a public IP address, true or false.
      - shard_name: Shard name.
    - copy_schema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
    You can get the cluster ID with the list of clusters in the folder.
  - Run this query:
    grpcurl \
      -format json \
      -import-path ~/cloudapi/ \
      -import-path ~/cloudapi/third_party/googleapis/ \
      -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
      -rpc-header "Authorization: Bearer $IAM_TOKEN" \
      -d @ \
      mdb.api.cloud.yandex.net:443 \
      yandex.cloud.mdb.clickhouse.v1.ClusterService.AddShard \
      < body.json
- View the server response to make sure your request was successful.
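Because the API takes the disk size in bytes, it can help to compute the value rather than type it by hand. A minimal sketch follows; gib_to_bytes is a hypothetical helper, and treating one gigabyte as 1024^3 bytes is an assumption you should confirm against the storage limits for your host class.

```shell
#!/bin/sh
# Convert a size in GiB to bytes, assuming 1 GiB = 1024^3 bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 32   # prints 34359738368
```

The result can be pasted as the disk size value in body.json.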
Warning
Use the copy schema option only if the schema is the same across all cluster shards.
Getting a list of shards in a cluster
- In the management console, navigate to the folder dashboard and select Managed Service for ClickHouse.
- Click the name of your cluster and open the Shards tab.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To get a list of shards in a cluster, run the following command:
yc managed-clickhouse shards list --cluster-name=<cluster_name>
You can get the cluster name with the list of clusters in the folder.
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.ListShards method and send the following request, e.g., via cURL:
  curl \
    --request GET \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards'
  You can get the cluster ID with the list of clusters in the folder.
- View the server response to make sure your request was successful.
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.ListShards call and send the following request, e.g., via gRPCurl:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>"
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.ListShards
  You can get the cluster ID with the list of clusters in the folder.
- View the server response to make sure your request was successful.
Updating a shard
You can edit the shard weight as well as the host class, disk type, and storage size.
Note
To change the disk type to local-ssd, contact support.
- In the management console, navigate to the folder dashboard and select Managed Service for ClickHouse.
- Click the name of your cluster and open the Shards tab.
- Click the icon next to the shard and select Edit.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To update a shard in a cluster:
- See the description of the CLI command for updating a shard:
  yc managed-clickhouse shards update --help
- Pass the parameters you want to change in the command:
  yc managed-clickhouse shards update <shard_name> \
    --cluster-name <cluster_name> \
    --weight <shard_weight> \
    --clickhouse-resource-preset <host_class> \
    --clickhouse-disk-size <storage_size> \
    --clickhouse-disk-type <disk_type>
  Where:
  - --cluster-name: Cluster name. You can get it with the list of clusters in a folder.
  - --weight: Shard weight. The minimum value is 0.
  - --clickhouse-resource-preset: Host class.
  - --clickhouse-disk-size: Storage size, in GB.
  - --clickhouse-disk-type: Disk type.
  You can get the shard name with the list of shards in the cluster.
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.UpdateShard method and send the following request, e.g., via cURL:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the updateMask parameter as a single comma-separated string.
  curl \
    --request PATCH \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --header "Content-Type: application/json" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>' \
    --data '{
              "updateMask": "configSpec.clickhouse.config.<ClickHouse®_setup>,configSpec.clickhouse.resources,configSpec.clickhouse.weight",
              "configSpec": {
                "clickhouse": {
                  "config": {
                    <ClickHouse®_settings>
                  },
                  "resources": {
                    "resourcePresetId": "<host_class>",
                    "diskSize": "<storage_size_in_bytes>",
                    "diskTypeId": "<disk_type>"
                  },
                  "weight": "<shard_weight>"
                }
              }
            }'
  Where:
  - updateMask: List of parameters to update as a single string, separated by commas.
  - configSpec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resourcePresetId: Host class ID. You can get the list of available host classes with their IDs using the ResourcePreset.List method.
      - diskSize: Disk size, in bytes.
      - diskTypeId: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® article.
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
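Since the REST call expects the update mask as one comma-separated string, the string can be assembled from individual field paths to avoid stray spaces or missing commas. This is a small sketch; join_update_mask is a hypothetical helper, and the paths in the usage line are the ones shown in the request above.

```shell
#!/bin/sh
# Join field paths into a single comma-separated string.
# The subshell keeps the IFS change from leaking into the caller.
join_update_mask() {
    ( IFS=,; printf '%s\n' "$*" )
}

join_update_mask configSpec.clickhouse.resources configSpec.clickhouse.weight
# prints configSpec.clickhouse.resources,configSpec.clickhouse.weight
```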
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.UpdateShard call and send the following request, e.g., via gRPCurl:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the update_mask parameter as an array of paths[] strings.
  Format for listing settings
  "update_mask": {
    "paths": [
      "<setting_1>",
      "<setting_2>",
      ...
      "<setting_N>"
    ]
  }
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>",
          "update_mask": {
            "paths": [
              "config_spec.clickhouse.config.<ClickHouse®_setup>",
              "config_spec.clickhouse.resources",
              "config_spec.clickhouse.weight"
            ]
          },
          "config_spec": {
            "clickhouse": {
              "config": {
                <ClickHouse®_settings>
              },
              "resources": {
                "resource_preset_id": "<host_class>",
                "disk_size": "<storage_size_in_bytes>",
                "disk_type_id": "<disk_type>"
              },
              "weight": "<shard_weight>"
            }
          }
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.UpdateShard
  Where:
  - update_mask: List of parameters to update as an array of paths[] strings.
  - config_spec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resource_preset_id: Host class ID. You can get the list of available host classes with their IDs using the ResourcePresetService.List method.
      - disk_size: Disk size, in bytes.
      - disk_type_id: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® article.
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
Deleting a shard
You can delete a shard from a ClickHouse® cluster if:
- It is not the only shard.
- It is not the only shard in a shard group.
Deleting a shard will delete all tables and data stored on that shard.
- In the management console, navigate to the folder dashboard and select Managed Service for ClickHouse.
- Click the name of your cluster and open the Shards tab.
- Click the icon next to the shard in question and select Delete.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To delete a shard from a cluster, run this command:
yc managed-clickhouse shards delete <shard_name> \
--cluster-name=<cluster_name>
You can get the shard name with the list of shards in the cluster, and the cluster name, with the list of clusters in the folder.
- Open the current Terraform configuration file that defines your infrastructure.
  For more information about creating this file, see this guide.
- Remove the host section with the shard description from the Managed Service for ClickHouse® cluster description.
- Validate your configuration.
  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
  - Run this command:
    terraform validate
    Terraform will show any errors found in your configuration files.
- Confirm updating the resources.
  - Run this command to view the planned changes:
    terraform plan
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:
      terraform apply
    - Confirm updating the resources: type yes and press Enter.
    - Wait for the operation to complete.
For more information, see this Terraform provider article.
Time limits
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.DeleteShard method and send the following request, e.g., via cURL:
  curl \
    --request DELETE \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>'
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.DeleteShard call and send the following request, e.g., via gRPCurl:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>"
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.DeleteShard
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
ClickHouse® is a registered trademark of ClickHouse, Inc.