Managing shards in a ClickHouse® cluster
You can enable sharding for a cluster as well as add and configure individual shards.
Enabling sharding
Managed Service for ClickHouse® clusters are created with one shard. To start sharding data, add one or more shards and create a distributed table.
Creating a shard
The number of shards in Managed Service for ClickHouse® clusters is limited by the CPU and RAM quotas available to DB clusters in your cloud. To check the resources currently in use, open the Quotas page.
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the cluster name and go to the Shards tab.
- Click Create shard.
- Specify the shard parameters:
  - Name and weight.
  - To copy the schema from a random replica of one of the shards to the hosts of the new shard, select the Copy the data schema option.
  - Required number of hosts.
- Click Create shard.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To create a shard, run the command below (not all the supported parameters are listed):
yc managed-clickhouse shards add <new_shard_name> \
  --cluster-name=<cluster_name> \
  --host zone-id=<availability_zone>,subnet-name=<subnet_name>
Where:
- <new_shard_name>: Shard name. Must be unique within the cluster. It may contain Latin letters, numbers, hyphens, and underscores. The maximum length is 63 characters.
- --cluster-name: Cluster name. You can request the cluster name with a list of clusters in the folder.
- --host: Host parameters:
  - zone-id: Availability zone.
  - subnet-name: Subnet name.
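The naming rules for <new_shard_name> can be checked locally before running the command. The following is a minimal Python sketch, not part of the Yandex Cloud tooling: the function name is_valid_shard_name and the sample names are hypothetical, and the lower bound of one character is an assumption (the rules above only state the maximum). It does not check uniqueness within the cluster.

```python
import re

# Latin letters, digits, hyphens, and underscores; 1 to 63 characters.
# The lower bound of 1 is an assumption; only the maximum is documented.
_SHARD_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,63}")


def is_valid_shard_name(name: str) -> bool:
    """Return True if the name satisfies the documented character and length rules."""
    return _SHARD_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical sample names, used only for illustration.
    for candidate in ("shard2", "my_shard-01", "bad name", "x" * 64):
        print(f"{candidate[:20]!r}: {is_valid_shard_name(candidate)}")
```

A check like this only catches formatting mistakes early; the service remains the authority on whether a name is accepted.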
Note
Terraform does not allow specifying shard weight.
- Open the current Terraform configuration file with an infrastructure plan.
  For more information about creating this file, see Creating clusters.
- Add a CLICKHOUSE-type host section with the shard_name field filled in to the Managed Service for ClickHouse® cluster description, or change the existing hosts:
  resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
    ...
    host {
      type       = "CLICKHOUSE"
      zone       = "<availability_zone>"
      subnet_id  = yandex_vpc_subnet.<subnet_in_availability_zone>.id
      shard_name = "<shard_name>"
    }
  }
- Make sure the settings are correct.
  - Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.
  - Run the command:
    terraform validate
    If there are errors in the configuration files, Terraform will point to them.
- Confirm updating the resources.
  - Run the command to view planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
For more information, see the Terraform provider documentation.
Time limits
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.AddShard method and send the following request, e.g., via cURL:
  - Create a file named body.json and add the following contents to it:
    {
      "shardName": "<shard_name>",
      "configSpec": {
        "clickhouse": {
          "resources": {
            "resourcePresetId": "<host_class>",
            "diskSize": "<storage_size_in_bytes>",
            "diskTypeId": "<disk_type>"
          },
          "weight": "<shard_weight>"
        }
      },
      "hostSpecs": [
        {
          "zoneId": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnetId": "<subnet_ID>",
          "assignPublicIp": <public_access_to_host>,
          "shardName": "<shard_name>"
        }
      ],
      "copySchema": <data_schema_copying>
    }
Where:
- shardName: Shard name.
- configSpec.clickhouse.resources: Host resources to add to the new shard:
  - resourcePresetId: Host class ID. You can request the list of available host classes with their IDs using the ResourcePreset.List method.
  - diskSize: Disk size in bytes.
  - diskTypeId: Disk type.
- configSpec.clickhouse.weight: Shard weight.
  By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
  To calculate the shard priority during data distribution, all shard weights are added up, then the weight of each shard is divided by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard’s priority is 1/4 and the second shard’s priority is 3/4. The higher the priority, the more data the shard will get.
  For more information, see the ClickHouse® documentation.
- hostSpecs: Settings of hosts to add to the shard. The settings are represented as an array of elements, one for each host. Each element has the following structure:
  - zoneId: Availability zone.
  - type: Host type. You can only add CLICKHOUSE hosts to your shards.
  - subnetId: Subnet ID.
  - assignPublicIp: Internet access to the host via a public IP address, true or false.
  - shardName: Shard name.
- copySchema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
  - Run this request:
    curl \
      --request POST \
      --header "Authorization: Bearer $IAM_TOKEN" \
      --header "Content-Type: application/json" \
      --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards' \
      --data '@body.json'
    You can get the cluster ID with a list of clusters in the folder.
- View the server response to make sure the request was successful.
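The body.json file for the REST request above can also be generated rather than written by hand, which keeps the shard name consistent between shardName and hostSpecs. The following is a minimal Python sketch under stated assumptions: the helper name build_add_shard_body, the sample values, and the defaults for assign_public_ip and copy_schema are hypothetical; only the field names and the default weight of 1 come from the template and parameter list above.

```python
import json


def build_add_shard_body(shard_name, host_class, disk_size_bytes, disk_type,
                         zone_id, subnet_id, weight=1,
                         assign_public_ip=False, copy_schema=False):
    """Build a request body with the same fields as the body.json template."""
    return {
        "shardName": shard_name,
        "configSpec": {
            "clickhouse": {
                "resources": {
                    "resourcePresetId": host_class,
                    # The template passes the size in bytes as a string.
                    "diskSize": str(disk_size_bytes),
                    "diskTypeId": disk_type,
                },
                # The template also passes the weight as a string.
                "weight": str(weight),
            }
        },
        "hostSpecs": [
            {
                "zoneId": zone_id,
                "type": "CLICKHOUSE",  # only CLICKHOUSE hosts can be added to shards
                "subnetId": subnet_id,
                "assignPublicIp": assign_public_ip,
                "shardName": shard_name,
            }
        ],
        "copySchema": copy_schema,
    }


if __name__ == "__main__":
    # Hypothetical values; replace the placeholders with real ones.
    body = build_add_shard_body(
        shard_name="shard2",
        host_class="<host_class>",
        disk_size_bytes=10 * 1024 ** 3,
        disk_type="<disk_type>",
        zone_id="<availability_zone>",
        subnet_id="<subnet_ID>",
    )
    # Redirect this output to body.json to use it with the curl command.
    print(json.dumps(body, indent=2))
```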
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.AddShard call and send the following request, e.g., via gRPCurl:
  - Create a file named body.json and add the following contents to it:
    {
      "cluster_id": "<cluster_ID>",
      "shard_name": "<shard_name>",
      "config_spec": {
        "clickhouse": {
          "resources": {
            "resource_preset_id": "<host_class>",
            "disk_size": "<storage_size_in_bytes>",
            "disk_type_id": "<disk_type>"
          },
          "weight": "<shard_weight>"
        }
      },
      "host_specs": [
        {
          "zone_id": "<availability_zone>",
          "type": "CLICKHOUSE",
          "subnet_id": "<subnet_ID>",
          "assign_public_ip": <public_access_to_host>,
          "shard_name": "<shard_name>"
        }
      ],
      "copy_schema": <data_schema_copying>
    }
Where:
- shard_name: Shard name.
- config_spec.clickhouse.resources: Host resources to add to the new shard:
  - resource_preset_id: Host class ID. You can request the list of available host classes with their IDs using the ResourcePresetService.List method.
  - disk_size: Disk size in bytes.
  - disk_type_id: Disk type.
- config_spec.clickhouse.weight: Shard weight.
  By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
  To calculate the shard priority during data distribution, all shard weights are added up, then the weight of each shard is divided by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard’s priority is 1/4 and the second shard’s priority is 3/4. The higher the priority, the more data the shard will get.
  For more information, see the ClickHouse® documentation.
- host_specs: Settings of hosts to add to the shard. The settings are represented as an array of elements, one for each host. Each element has the following structure:
  - zone_id: Availability zone.
  - type: Host type. You can only add CLICKHOUSE hosts to your shards.
  - subnet_id: Subnet ID.
  - assign_public_ip: Internet access to the host via a public IP address, true or false.
  - shard_name: Shard name.
- copy_schema: Copying the data schema from a random replica of one of the shards to the hosts of the new shard. The possible values are true or false.
You can get the cluster ID with a list of clusters in the folder.
  - Run this request:
    grpcurl \
      -format json \
      -import-path ~/cloudapi/ \
      -import-path ~/cloudapi/third_party/googleapis/ \
      -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
      -rpc-header "Authorization: Bearer $IAM_TOKEN" \
      -d @ \
      mdb.api.cloud.yandex.net:443 \
      yandex.cloud.mdb.clickhouse.v1.ClusterService.AddShard \
      < body.json
- View the server response to make sure the request was successful.
Warning
Use the copy data schema option only if the schema is the same on all cluster shards.
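The shard priority arithmetic described for the weight parameter (each weight divided by the sum of all weights) can be sketched in a few lines of Python. The function name shard_priorities and the shard names are hypothetical, and integer weights are assumed so that the result can be shown as exact fractions.

```python
from fractions import Fraction


def shard_priorities(weights):
    """Map each shard name to its share of the total weight.

    `weights` maps shard names to integer weights (an assumption made here
    so that the shares can be represented exactly).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one shard must have a positive weight")
    return {name: Fraction(weight, total) for name, weight in weights.items()}


if __name__ == "__main__":
    # The example from the text: weights 1 and 3 give priorities 1/4 and 3/4.
    for name, share in shard_priorities({"shard1": 1, "shard2": 3}).items():
        print(name, share)
```

With equal weights, every shard gets the same share, which corresponds to the default weight of 1 for each shard.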
Listing shards in a cluster
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the name of the cluster you need and select the Shards tab.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To get a list of shards in a cluster, run the following command:
yc managed-clickhouse shards list --cluster-name=<cluster_name>
You can request the cluster name with a list of clusters in the folder.
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.ListShards method and send the following request, e.g., via cURL:
  curl \
    --request GET \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards'
  You can get the cluster ID with a list of clusters in the folder.
- View the server response to make sure the request was successful.
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.ListShards call and send the following request, e.g., via gRPCurl:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>"
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.ListShards
  You can get the cluster ID with a list of clusters in the folder.
- View the server response to make sure the request was successful.
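The REST request for listing shards shown earlier in this section can also be prepared from a script. The following is a minimal Python sketch that only builds the request object with the standard library and does not send it; the function name list_shards_request and the constant API_ROOT are hypothetical, while the URL and the Authorization header are taken from the cURL example.

```python
import os
import urllib.request

# Base URL taken from the cURL example in this section.
API_ROOT = "https://mdb.api.cloud.yandex.net/managed-clickhouse/v1"


def list_shards_request(cluster_id: str, iam_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that lists shards in a cluster."""
    url = f"{API_ROOT}/clusters/{cluster_id}/shards"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {iam_token}"},
    )


if __name__ == "__main__":
    # IAM_TOKEN is read from the environment, as in the steps above.
    req = list_shards_request("<cluster_ID>", os.environ.get("IAM_TOKEN", ""))
    print(req.get_method(), req.full_url)
    # To send the request, pass `req` to urllib.request.urlopen() and
    # decode the JSON response body.
```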
Changing a shard
You can change the shard weight as well as host class and storage size.
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the name of the cluster you need and select the Shards tab.
- Click the options icon in the shard's row and select Edit.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To change a shard in the cluster:
- View a description of the CLI's shard change command:
  yc managed-clickhouse shards update --help
- Start an operation, e.g., changing shard weight:
  yc managed-clickhouse shards update <shard_name> \
    --cluster-name=<cluster_name> \
    --weight=<shard_weight>
  Where:
  - <shard_name>: Shard name. Can be requested with a list of shards in the cluster.
  - --cluster-name: Cluster name. You can request the cluster name with a list of clusters in the folder.
  - --weight: Shard weight. The minimum value is 0.
  When the operation is complete, the CLI displays information about the changed shard.
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Use the Cluster.UpdateShard method and send the following request, e.g., via cURL
:Warning
The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the
updateMask
parameter as a single comma-separated string.curl \ --request PATCH \ --header "Authorization: Bearer $IAM_TOKEN" \ --header "Content-Type: application/json" \ --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>' \ --data '{ "updateMask": "configSpec.clickhouse.config.<ClickHouse®_setup>,configSpec.clickhouse.resources,configSpec.clickhouse.weight", "configSpec": { "clickhouse": { "config": { <ClickHouse®_settings> }, "resources": { "resourcePresetId": "<host_class>", "diskSize": "<storage_size_in_bytes>", "diskTypeId": "<disk_type>" }, "weight": "<shard_weight>" } } }'
Where:
  - updateMask: List of parameters to update as a single string, separated by commas.
  - configSpec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resourcePresetId: Host class ID. You can request the list of available host classes with their IDs using the ResourcePreset.List method.
      - diskSize: Disk size in bytes.
      - diskTypeId: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority during data distribution, all shard weights are added up, then the weight of each shard is divided by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard’s priority is 1/4 and the second shard’s priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see the ClickHouse® documentation.
  You can request the cluster ID with a list of clusters in the folder and the shard name with a list of shards in the cluster.
- View the server response to make sure the request was successful.
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.UpdateShard call and send the following request, e.g., via gRPCurl:
  Warning
  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the update_mask parameter as an array of paths[] strings.
  Format for listing settings
  "update_mask": {
    "paths": [
      "<setting_1>",
      "<setting_2>",
      ...
      "<setting_N>"
    ]
  }
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>",
          "update_mask": {
            "paths": [
              "config_spec.clickhouse.config.<ClickHouse®_setup>",
              "config_spec.clickhouse.resources",
              "config_spec.clickhouse.weight"
            ]
          },
          "config_spec": {
            "clickhouse": {
              "config": {
                <ClickHouse®_settings>
              },
              "resources": {
                "resource_preset_id": "<host_class>",
                "disk_size": "<storage_size_in_bytes>",
                "disk_type_id": "<disk_type>"
              },
              "weight": "<shard_weight>"
            }
          }
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.UpdateShard
Where:
  - update_mask: List of parameters to update as an array of paths[] strings.
  - config_spec.clickhouse: Shard parameters to update:
    - config: ClickHouse® settings. For a list of available settings, see the method description.
    - resources: Host resources for the shard:
      - resource_preset_id: Host class ID. You can request the list of available host classes with their IDs using the ResourcePresetService.List method.
      - disk_size: Disk size in bytes.
      - disk_type_id: Disk type.
    - weight: Shard weight.
      By default, each shard is assigned a weight of 1. If you assign a higher weight to a shard, the data will be distributed among the shards according to their weights.
      To calculate the shard priority during data distribution, all shard weights are added up, then the weight of each shard is divided by the total. For example, if one shard has a weight of 1 and another has a weight of 3, then the first shard’s priority is 1/4 and the second shard’s priority is 3/4. The higher the priority, the more data the shard will get.
      For more information, see the ClickHouse® documentation.
  You can request the cluster ID with a list of clusters in the folder and the shard name with a list of shards in the cluster.
- View the server response to make sure the request was successful.
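The two update mask formats used in this section differ only in shape: the REST API takes a single comma-separated string, while the gRPC API takes an array of paths. The following minimal Python sketch builds both request bodies for a weight-only update. The function names and sample values are hypothetical; the field names and mask paths follow the request examples in this section.

```python
import json


def rest_update_shard_body(weight: int) -> dict:
    """REST body: updateMask is one comma-separated string of camelCase paths."""
    paths = ["configSpec.clickhouse.weight"]
    return {
        "updateMask": ",".join(paths),
        "configSpec": {"clickhouse": {"weight": str(weight)}},
    }


def grpc_update_shard_body(cluster_id: str, shard_name: str, weight: int) -> dict:
    """gRPC body: update_mask.paths is an array of snake_case paths."""
    paths = ["config_spec.clickhouse.weight"]
    return {
        "cluster_id": cluster_id,
        "shard_name": shard_name,
        "update_mask": {"paths": paths},
        "config_spec": {"clickhouse": {"weight": str(weight)}},
    }


if __name__ == "__main__":
    # Hypothetical values, used only for illustration.
    print(json.dumps(rest_update_shard_body(3), indent=2))
    print(json.dumps(grpc_update_shard_body("<cluster_ID>", "<shard_name>", 3), indent=2))
```

Listing only the weight path in the mask is what keeps the other shard settings from being reset to their defaults, as the warning in this section describes.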
Deleting a shard
You can delete a shard from a ClickHouse® cluster if:
- It is not the only shard.
- It is not the only shard in a shard group.
When you delete a shard, all tables and data that are saved on that shard are deleted.
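The two conditions above can be expressed as a local pre-check before a delete call is issued. The following is a minimal Python sketch with a hypothetical data model: cluster_shards is a list of shard names and shard_groups maps group names to lists of member shards. It is not the service's own validation logic.

```python
def can_delete_shard(shard, cluster_shards, shard_groups):
    """Return True if deleting `shard` satisfies both conditions above."""
    if shard not in cluster_shards:
        raise ValueError(f"unknown shard: {shard}")
    # Condition 1: it is not the only shard in the cluster.
    if len(cluster_shards) <= 1:
        return False
    # Condition 2: it is not the only shard in any shard group.
    for members in shard_groups.values():
        if set(members) == {shard}:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical cluster layout, used only for illustration.
    shards = ["shard1", "shard2", "shard3"]
    groups = {"group_a": ["shard1", "shard2"], "group_b": ["shard3"]}
    for name in shards:
        print(name, can_delete_shard(name, shards, groups))
```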
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Click the cluster name and open the Shards tab.
- Click the options icon in the host's row and select Delete.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To delete a shard from the cluster, run:
yc managed-clickhouse shards delete <shard_name> \
--cluster-name=<cluster_name>
You can request the shard name with a list of shards in the cluster and the cluster name with a list of clusters in the folder.
- Open the current Terraform configuration file with an infrastructure plan.
  For more information about creating this file, see Creating clusters.
- Remove the host section with the shard description from the Managed Service for ClickHouse® cluster description.
- Make sure the settings are correct.
  - Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.
  - Run the command:
    terraform validate
    If there are errors in the configuration files, Terraform will point to them.
- Confirm updating the resources.
  - Run the command to view planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources: type yes and press Enter.
    - Wait for the operation to complete.
For more information, see the Terraform provider documentation.
Time limits
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Use the Cluster.DeleteShard method and send the following request, e.g., via cURL:
  curl \
    --request DELETE \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>/shards/<shard_name>'
  You can request the cluster ID with a list of clusters in the folder and the shard name with a list of shards in the cluster.
- View the server response to make sure the request was successful.
- Get an IAM token for API authentication and put it into the environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Use the ClusterService.DeleteShard call and send the following request, e.g., via gRPCurl:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "shard_name": "<shard_name>"
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.clickhouse.v1.ClusterService.DeleteShard
  You can request the cluster ID with a list of clusters in the folder and the shard name with a list of shards in the cluster.
- View the server response to make sure the request was successful.
ClickHouse® is a registered trademark of ClickHouse, Inc.