Managing shards in a ClickHouse® cluster
You can enable sharding for a cluster, as well as add and configure individual shards.
Enabling sharding
Managed Service for ClickHouse® clusters are created with a single shard. To start sharding data, add one or more shards and create a distributed table.
Creating a shard
The number of shards in Managed Service for ClickHouse® clusters is limited by the CPU and RAM quotas available to database clusters in your cloud. To check the resources currently in use, open the Quotas page.
- In the management console, select the folder the cluster is in.
- Go to Managed Service for ClickHouse.
- Click the cluster name and navigate to the Shards tab.
- Click Create shard.
- Specify the following shard properties:
  - Name and weight.
  - To copy the schema from a random replica of one of the shards to the hosts of the new shard, select Copy the data schema.
  - Required number of hosts.
- Click Create shard.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To create a shard, run the command below (our example does not use all available parameters):
yc managed-clickhouse shards add <new_shard_name> \
  --cluster-name=<cluster_name> \
  --host zone-id=<availability_zone>,subnet-name=<subnet_name>
Where:
- <new_shard_name>: New shard name that must be unique within the cluster. It may include Latin letters, numbers, hyphens, and underscores. The name may be up to 63 characters long.
- --cluster-name: Cluster name. You can get the cluster name with the list of clusters in the folder.
- --host: Host settings:
  - zone-id: Availability zone.
  - subnet-name: Subnet name.
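The naming rule above can be checked locally before the command is run. The following is a minimal Python sketch, assuming the rule is exactly as stated (Latin letters, numbers, hyphens, and underscores, 1 to 63 characters). The `is_valid_shard_name` helper is hypothetical and is not part of the Yandex Cloud CLI; it does not check uniqueness within the cluster.

```python
import re

# Assumption: the rule is exactly the one stated above, i.e. Latin letters,
# digits, hyphens, and underscores, with a length of 1 to 63 characters.
_SHARD_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,63}")


def is_valid_shard_name(name: str) -> bool:
    """Return True if `name` satisfies the documented character and length rule."""
    return _SHARD_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Illustrative names only.
    for candidate in ("shard2", "my_shard-01", "bad name", "x" * 64):
        print(candidate[:20], is_valid_shard_name(candidate))
```

Uniqueness still has to be verified against the list of shards in the cluster.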
Note
Terraform does not support specifying a shard weight.
- Open the current Terraform configuration file describing your infrastructure.
  For information on how to create such a file, see Creating a cluster.
- Add a host section of the CLICKHOUSE type with the shard_name field filled in to the Managed Service for ClickHouse® cluster description, or update existing hosts:
  resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
    ...
    host {
      type       = "CLICKHOUSE"
      zone       = "<availability_zone>"
      subnet_id  = yandex_vpc_subnet.<subnet_in_availability_zone>.id
      shard_name = "<shard_name>"
    }
  }
- Make sure the settings are correct.
  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
  - Run this command:
    terraform validate
    Terraform will show any errors found in your configuration files.
- Confirm updating the resources.
  - Run this command to view the planned changes:
    terraform plan
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:
      terraform apply
    - Confirm updating the resources.
    - Wait for the operation to complete.
For more information, see this Terraform provider guide.
Timeouts
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring from a backup: 60 minutes.
- Updating a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the timeout are aborted.
How do I change these limits?
Add a timeouts section to the cluster description, e.g.:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
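To compare a configured timeout with the provider defaults listed above, the duration strings can be converted to a common unit. The following is a minimal Python sketch, assuming only the `h`, `m`, and `s` units that appear in the example; `parse_timeout` is a hypothetical helper and not part of Terraform.

```python
import re
from datetime import timedelta

# Assumption: only the h, m, and s units seen in the example above are handled.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_PART_RE = re.compile(r"(\d+)([hms])")


def parse_timeout(value: str) -> timedelta:
    """Convert a duration string such as "1h30m" into a timedelta."""
    parts = _PART_RE.findall(value)
    # Reject strings with leftover characters, e.g. "90 minutes" or "1h 30m".
    if not parts or "".join(num + unit for num, unit in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(seconds=sum(int(num) * _UNIT_SECONDS[unit] for num, unit in parts))


if __name__ == "__main__":
    default_update = timedelta(minutes=90)  # default update timeout from the list above
    print(parse_timeout("2h") > default_update)
```

With the values from the example, `update = "2h"` is longer than the 90-minute default, so the check prints `True`.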
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.AddShard method, e.g., via the following cURL request:
  - Create a file named body.json and paste the following code into it:
    {
      "shardName": "<shard_name>",
      "configSpec": {
        "clickhouse": {
          "resources": {
            "resourcePresetId": "<host_class>",
            "diskSize": "<storage_size_in_bytes>",
            "diskTypeId": "<disk_type>"
          },
          "weight": "<shard_weight>"
        }
      },
      "hostSpecs": [
        {
          "zoneId": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnetId": "<subnet_ID>",
          "assignPublicIp": <public_access_to_host>,
          "shardName": "<shard_name>"
        }
      ],
      "copySchema": <copying_data_schema>
    }
    Where:
    - shardName: Shard name.
    - configSpec.clickhouse.resources: Host resources to add to the new shard:
      - resourcePresetId: Host class ID. You can get the list of available host classes with their IDs using the ResourcePreset.List method.
      - diskSize: Disk size, in bytes.
      - diskTypeId: Disk type.
    - configSpec.clickhouse.weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® guide.
    - hostSpecs: Settings of hosts to add to the shard. The settings appear as an array of elements, one per host. Each element has the following structure:
      - zoneId: Availability zone.
      - type: Host type. You can only add CLICKHOUSE hosts to your shards.
      - subnetId: Subnet ID.
      - assignPublicIp: Internet access to the host via a public IP address, true or false.
      - shardName: Shard name.
    - copySchema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
  - Run this query:
    curl \
      --request POST \
      --header "Authorization: Bearer $IAM_TOKEN" \
      --header "Content-Type: application/json" \
      --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards' \
      --data '@body.json'
    You can get the cluster ID with the list of clusters in the folder.
- View the server response to make sure your request was successful.
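The request body for Cluster.AddShard can also be generated rather than edited by hand. The following is a minimal Python sketch that assembles the fields described above. The `build_add_shard_body` helper and all sample values (host class, disk type, zone, subnet ID) are illustrative assumptions, not values taken from this guide.

```python
import json


def build_add_shard_body(shard_name, host_class, disk_size_bytes, disk_type,
                         zone_id, subnet_id, weight=1,
                         assign_public_ip=False, copy_schema=False):
    """Assemble a Cluster.AddShard request body from the fields listed above."""
    return {
        "shardName": shard_name,
        "configSpec": {
            "clickhouse": {
                "resources": {
                    "resourcePresetId": host_class,
                    # Disk size and weight are quoted strings in the template.
                    "diskSize": str(disk_size_bytes),
                    "diskTypeId": disk_type,
                },
                "weight": str(weight),
            }
        },
        "hostSpecs": [
            {
                "zoneId": zone_id,
                "type": "CLICKHOUSE",  # the only host type allowed in shards
                "subnetId": subnet_id,
                "assignPublicIp": assign_public_ip,
                "shardName": shard_name,
            }
        ],
        "copySchema": copy_schema,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own.
    body = build_add_shard_body(
        shard_name="shard2",
        host_class="<host_class>",
        disk_size_bytes=10 * 2**30,
        disk_type="<disk_type>",
        zone_id="<availability_zone>",
        subnet_id="<subnet_ID>",
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can be saved as body.json and passed to the cURL request shown above.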
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Call the ClusterService.AddShard method, e.g., via the following gRPCurl request:
  - Create a file named body.json and paste the following code into it:
    {
      "cluster_id": "<cluster_ID>",
      "shard_name": "<shard_name>",
      "config_spec": {
        "clickhouse": {
          "resources": {
            "resource_preset_id": "<host_class>",
            "disk_size": "<storage_size_in_bytes>",
            "disk_type_id": "<disk_type>"
          },
          "weight": "<shard_weight>"
        }
      },
      "host_specs": [
        {
          "zone_id": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnet_id": "<subnet_ID>",
          "assign_public_ip": <public_access_to_host>,
          "shard_name": "<shard_name>"
        }
      ],
      "copy_schema": <copying_data_schema>
    }
    Where:
    - shard_name: Shard name.
    - config_spec.clickhouse.resources: Host resources to add to the new shard:
      - resource_preset_id: Host class ID. You can get the list of available host classes with their IDs using the ResourcePresetService.List method.
      - disk_size: Disk size, in bytes.
      - disk_type_id: Disk type.
    - config_spec.clickhouse.weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® guide.
    - host_specs: Settings of hosts to add to the shard as an array of elements, one per host. Each element has the following structure:
      - zone_id: Availability zone.
      - type: Host type. You can only add CLICKHOUSE hosts to your shards.
      - subnet_id: Subnet ID.
      - assign_public_ip: Internet access to the host via a public IP address, true or false.
      - shard_name: Shard name.
    - copy_schema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
    You can get the cluster ID with the list of clusters in the folder.
  - Run this query:
    grpcurl \
      -format json \
      -import-path ~/cloudapi/ \
      -import-path ~/cloudapi/third_party/googleapis/ \
      -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
      -rpc-header "Authorization: Bearer $IAM_TOKEN" \
      -d @ \
      mdb.api.cloud.yandex.net:443 \
      yandex.cloud.mdb.clickhouse.v1.ClusterService.AddShard \
      < body.json
- View the server response to make sure your request was successful.
Warning
Use the copy data schema option only if the schema is the same across all cluster shards.
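The shard weight arithmetic described in the procedures above can be reproduced directly. The following is a minimal Python sketch of that calculation, in which each shard's priority is its weight divided by the sum of all weights. The `shard_priorities` helper and the shard names are hypothetical.

```python
from fractions import Fraction


def shard_priorities(weights):
    """Map each shard name to weight / total weight, as described above."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one shard must have a positive weight")
    return {name: Fraction(weight, total) for name, weight in weights.items()}


if __name__ == "__main__":
    # The example from the text: one shard with weight 1, another with weight 3.
    for name, share in shard_priorities({"shard1": 1, "shard2": 3}).items():
        print(f"{name}: {share}")
```

For weights 1 and 3 this prints `shard1: 1/4` and `shard2: 3/4`, matching the priorities given in the text.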
Getting a list of shards in a cluster
- In the management console, select the folder the cluster is in.
- Go to Managed Service for ClickHouse.
- Click the cluster name and select the Shards tab.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To get a list of shards in a cluster, run this command:
yc managed-clickhouse shards list --cluster-name=<cluster_name>
You can get the cluster name with the list of clusters in the folder.
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.ListShards method, e.g., via the following cURL request:
  curl \
    --request GET \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards'
  You can get the cluster ID with the list of clusters in the folder.
- View the server response to make sure your request was successful.
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Call the ClusterService.ListShards method, e.g., via the following gRPCurl request:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>"
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.ListShards
  You can get the cluster ID with the list of clusters in the folder.
- View the server response to make sure your request was successful.
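When the list is needed in a script, the shard names can be pulled out of the JSON response. The following is a minimal Python sketch, assuming the response carries a top-level `shards` array whose items have a `name` field. That shape is an assumption and is not shown in this guide, so check it against the actual server response. The `shard_names` helper is hypothetical.

```python
import json


def shard_names(response_text):
    """Extract shard names from a ListShards JSON response."""
    payload = json.loads(response_text)
    # Assumption: shards are listed under "shards", each with a "name" field.
    return [shard["name"] for shard in payload.get("shards", [])]


if __name__ == "__main__":
    # Hypothetical response, trimmed to the single field used here.
    sample = '{"shards": [{"name": "shard1"}, {"name": "shard2"}]}'
    print(shard_names(sample))
```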
Updating a shard
You can edit the shard weight as well as the host class, disk type, and storage size.
Note
To change the disk type to local-ssd, contact support.
- In the management console, select the folder the cluster is in.
- Go to Managed Service for ClickHouse.
- Click the cluster name and select the Shards tab.
- Click the menu icon and select Edit.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To update a shard in a cluster:
- View the description of the CLI command for updating a shard:
  yc managed-clickhouse shards update --help
- Pass the parameters you want to change to the command:
  yc managed-clickhouse shards update <shard_name> \
    --cluster-name <cluster_name> \
    --weight <shard_weight> \
    --clickhouse-resource-preset <host_class> \
    --clickhouse-disk-size <storage_size> \
    --clickhouse-disk-type <disk_type>
  Where:
  - --cluster-name: Cluster name. You can get it with the list of clusters in the folder.
  - --weight: Shard weight. The minimum value is 0.
  - --clickhouse-resource-preset: Host class.
  - --clickhouse-disk-size: Storage size, in GB.
  - --clickhouse-disk-type: Disk type.
  You can get the shard name with the list of shards in the cluster.
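The CLI takes the storage size in GB, while the API methods on this page take it in bytes, so a conversion helps when switching between the two. The following is a minimal Python sketch, assuming 1 GB is counted as 2**30 bytes. That factor is an assumption this guide does not state, so verify it against the size reported for an existing shard. Both helpers are hypothetical.

```python
# Assumption: 1 GB is treated as 2**30 bytes; confirm this against the
# disk size reported for one of your existing shards before relying on it.
BYTES_PER_GB = 2**30


def gb_to_bytes(size_gb: int) -> int:
    """Convert a CLI-style size in GB to an API-style size in bytes."""
    return size_gb * BYTES_PER_GB


def bytes_to_gb(size_bytes: int) -> float:
    """Convert an API-style size in bytes back to GB."""
    return size_bytes / BYTES_PER_GB


if __name__ == "__main__":
    print(gb_to_bytes(10))
```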
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.UpdateShard method, e.g., via the following cURL request:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the updateMask parameter as a single comma-separated string.
  curl \
    --request PATCH \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --header "Content-Type: application/json" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>' \
    --data '{
              "updateMask": "configSpec.clickhouse.config.<ClickHouse®_setup>,configSpec.clickhouse.resources,configSpec.clickhouse.weight",
              "configSpec": {
                "clickhouse": {
                  "config": {
                    <ClickHouse®_settings>
                  },
                  "resources": {
                    "resourcePresetId": "<host_class>",
                    "diskSize": "<storage_size_in_bytes>",
                    "diskTypeId": "<disk_type>"
                  },
                  "weight": "<shard_weight>"
                }
              }
            }'
  Where:
  - updateMask: Comma-separated list of settings you want to update.
  - configSpec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resourcePresetId: Host class ID. You can get the list of available host classes with their IDs using the ResourcePreset.List method.
      - diskSize: Disk size, in bytes.
      - diskTypeId: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® guide.
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Call the ClusterService.UpdateShard method, e.g., via the following gRPCurl request:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the update_mask parameter as an array of paths[] strings.
  Format for listing settings
  "update_mask": {
    "paths": [
      "<setting_1>",
      "<setting_2>",
      ...
      "<setting_N>"
    ]
  }
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>",
          "update_mask": {
            "paths": [
              "config_spec.clickhouse.config.<ClickHouse®_setup>",
              "config_spec.clickhouse.resources",
              "config_spec.clickhouse.weight"
            ]
          },
          "config_spec": {
            "clickhouse": {
              "config": {
                <ClickHouse®_settings>
              },
              "resources": {
                "resource_preset_id": "<host_class>",
                "disk_size": "<storage_size_in_bytes>",
                "disk_type_id": "<disk_type>"
              },
              "weight": "<shard_weight>"
            }
          }
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.UpdateShard
  Where:
  - update_mask: List of settings you want to update as an array of strings (paths[]).
  - config_spec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resource_preset_id: Host class ID. You can get the list of available host classes with their IDs using the ResourcePresetService.List method.
      - disk_size: Disk size, in bytes.
      - disk_type_id: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a greater value to a single shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority for data distribution, the system adds up the weights of all shards and then divides each shard's weight by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard's priority is 1/4 and the second shard's priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see this ClickHouse® guide.
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
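The update procedures above use two mask formats: the REST API expects a single comma-separated string, while the gRPC API expects an object with a paths array. The following is a minimal Python sketch that builds each form from a list of setting paths. Both helpers are hypothetical, and the paths in the usage example are the ones that appear in the requests above.

```python
def rest_update_mask(paths):
    """REST form: the settings joined into one comma-separated string."""
    return ",".join(paths)


def grpc_update_mask(paths):
    """gRPC form: an object holding the settings in a `paths` array."""
    return {"paths": list(paths)}


if __name__ == "__main__":
    # Paths taken from the REST and gRPC requests above.
    print(rest_update_mask(["configSpec.clickhouse.resources",
                            "configSpec.clickhouse.weight"]))
    print(grpc_update_mask(["config_spec.clickhouse.resources",
                            "config_spec.clickhouse.weight"]))
```

Listing only the settings you intend to change keeps the method from resetting the others to their defaults, as the warnings above describe.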
Deleting a shard
You can delete a shard from a ClickHouse® cluster provided that:
- It is not the only shard.
- It is not the only shard in a shard group.
Deleting a shard will delete all tables and data stored on that shard.
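The two conditions above can be expressed as a pre-check to run before requesting a deletion. The following is a minimal Python sketch. The `can_delete_shard` helper and the shape of its inputs (a list of shard names and a mapping from shard group name to member shards) are assumptions made for illustration and do not reflect any API response format.

```python
def can_delete_shard(shard, all_shards, shard_groups):
    """Return True if deleting `shard` satisfies both conditions above.

    all_shards: names of every shard in the cluster.
    shard_groups: mapping of group name to a list of member shard names
    (a hypothetical shape chosen for this sketch).
    """
    # Condition 1: the shard must exist and must not be the only shard.
    if shard not in all_shards or len(all_shards) <= 1:
        return False
    # Condition 2: the shard must not be the only member of any shard group.
    return all(
        not (shard in members and len(members) == 1)
        for members in shard_groups.values()
    )


if __name__ == "__main__":
    # Illustrative names only.
    print(can_delete_shard("shard2", ["shard1", "shard2"], {"group1": ["shard2"]}))
```

In this example `shard2` is the only member of `group1`, so the check prints `False`.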
- In the management console, select the folder the cluster is in.
- Go to Managed Service for ClickHouse.
- Click the cluster name and select the Shards tab.
- Click the menu icon in the host row and select Delete.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To delete a shard from a cluster, run this command:
yc managed-clickhouse shards delete <shard_name> \
--cluster-name=<cluster_name>
You can get the shard name with the list of shards in the cluster, and the cluster name, with the list of clusters in the folder.
- Open the current Terraform configuration file describing your infrastructure.
  For information on how to create such a file, see Creating a cluster.
- Remove the host section with the shard description from the Managed Service for ClickHouse® cluster description.
- Make sure the settings are correct.
  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
  - Run this command:
    terraform validate
    Terraform will show any errors found in your configuration files.
- Type yes and press Enter.
  - Run this command to view the planned changes:
    terraform plan
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:
      terraform apply
    - Confirm updating the resources.
    - Wait for the operation to complete.
For more information, see this Terraform provider guide.
Timeouts
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring from a backup: 60 minutes.
- Updating a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the timeout are aborted.
How do I change these limits?
Add a timeouts section to the cluster description, e.g.:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Call the Cluster.DeleteShard method, e.g., via the following cURL request:
  curl \
    --request DELETE \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>'
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
- Get an IAM token for API authentication and place it in an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Call the ClusterService.DeleteShard method, e.g., via the following gRPCurl request:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>"
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.DeleteShard
  You can get the cluster ID with the list of clusters in the folder, and the shard name with the list of shards in the cluster.
- View the server response to make sure your request was successful.
ClickHouse® is a registered trademark of ClickHouse, Inc.