Creating a ClickHouse® cluster
A ClickHouse® cluster consists of one or more database hosts between which you can configure replication.
Roles for creating a cluster
To create a Managed Service for ClickHouse® cluster, you need the vpc.user role and the managed-clickhouse.editor role or higher.
To link your service account to a cluster, e.g., to use Yandex Object Storage, make sure your Yandex Cloud account has the iam.serviceAccounts.user role or higher.
For more information about assigning roles, see the Yandex Identity and Access Management documentation.
Creating a cluster
-
Available disk types depend on the selected host class.
-
The number of hosts you can create together with a ClickHouse® cluster depends on the selected disk type and host class.
-
When using ClickHouse® Keeper, a cluster must consist of three or more hosts. You do not need separate hosts to run ClickHouse® Keeper. You can only create this kind of cluster using the Yandex Cloud CLI or API.
-
When using ZooKeeper, a cluster can consist of two or more hosts. Another three ZooKeeper hosts will be added to the cluster automatically.
The minimum number of cores per ZooKeeper host depends on the total number of cores on ClickHouse® hosts. To learn more, see Replication.
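The host-count rules above can be sketched as a small pre-flight check. This is a minimal illustration, not part of the `yc` CLI: the function names and the `keeper`/`zookeeper` labels are hypothetical, and the only rules encoded are the minimums stated above.

```shell
# Hypothetical helper: checks a planned number of ClickHouse hosts against the
# minimums stated above. The mode labels are illustrative, not yc identifiers.
check_host_count() {
  local coordinator="$1" hosts="$2"   # coordinator: keeper | zookeeper
  case "$coordinator" in
    keeper)    [ "$hosts" -ge 3 ] ;;  # ClickHouse Keeper: three or more hosts
    zookeeper) [ "$hosts" -ge 2 ] ;;  # ZooKeeper: two or more hosts
    *)         return 2 ;;            # unknown mode
  esac
}

# Total hosts once three ZooKeeper hosts are added automatically.
total_hosts_with_zookeeper() {
  echo $(( $1 + 3 ))
}
```

For example, `check_host_count keeper 2` fails, while `total_hosts_with_zookeeper 2` prints `5`.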
To create a Managed Service for ClickHouse® cluster:
-
In the management console, select the folder where you want to create a DB cluster.
-
Select Managed Service for ClickHouse.
-
Click Create cluster.
-
Enter a name for the cluster in the Cluster name field. It must be unique within the folder.
-
Select the environment where you want to create the cluster (you cannot change the environment once the cluster is created):
- PRODUCTION: For stable versions of your apps.
- PRESTABLE: For testing purposes. The prestable environment is similar to the production environment and is likewise covered by the SLA, but it is the first to get new features, improvements, and bug fixes. In the prestable environment, you can test the compatibility of new versions with your application.
-
From the Version drop-down list, select the ClickHouse® version which the Managed Service for ClickHouse® cluster will use. For most clusters, we recommend selecting the latest LTS version.
-
If you expect to use data from an Object Storage bucket with restricted access, select a service account from the drop-down list or create a new one. For more information about setting up a service account, see Configuring access to Object Storage.
-
Under Resources:
-
Select the platform, VM type, and host class that defines the technical specifications of the VMs where the DB hosts will be deployed. All available options are listed under Host classes. When you change the host class for a cluster, the specifications of all existing instances also change.
-
Select the disk type.
Warning
You cannot change disk type after you create a cluster.
The selected type determines the increments in which you can change your disk size:
- Network HDD and SSD storage: In increments of 1 GB.
- Local SSD storage:
- For Intel Broadwell and Intel Cascade Lake: In increments of 100 GB.
- For Intel Ice Lake: In increments of 368 GB.
- Non-replicated SSDs and ultra high-speed network SSDs with three replicas: In increments of 93 GB.
-
Select the disk size to be used for data and backups. For more information on how backups take up storage space, see Backups.
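As a rough aid for choosing a size, the increments listed above can be turned into a rounding helper. This is a sketch under two assumptions: valid sizes are whole multiples of the increment, and the disk-type labels used here are hypothetical shorthand rather than service identifiers.

```shell
# Hypothetical helper: prints the size increment (GB) for a disk category.
# The labels are illustrative shorthand, not real disk type IDs.
disk_size_step_gb() {
  case "$1" in
    network)                                    echo 1 ;;
    local-ssd-broadwell|local-ssd-cascade-lake) echo 100 ;;
    local-ssd-ice-lake)                         echo 368 ;;
    nonreplicated|ultra-high-speed)             echo 93 ;;
    *) return 1 ;;
  esac
}

# Rounds a requested size (GB) up to the next multiple of the increment.
# Assumes valid sizes are whole multiples of the increment.
round_up_disk_size_gb() {
  local step
  step="$(disk_size_step_gb "$1")" || return 1
  echo $(( ( ($2 + step - 1) / step ) * step ))
}
```

For example, `round_up_disk_size_gb local-ssd-ice-lake 400` prints `736`, the next multiple of 368.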
-
-
Under Hosts:
- To create additional DB hosts, click Add host. After you add a second host, the Configure ZooKeeper button will appear. Change the ZooKeeper settings in ZooKeeper host class, ZooKeeper storage size, and ZooKeeper hosts, if required.
- Specify the parameters of the DB hosts that will be created together with the cluster. To change the added host, hover over the host line and click the icon.
- To connect to the host from the internet, enable the Public access setting.
-
Under DBMS settings:
-
If you want to manage cluster users via SQL, select Enabled from the drop-down list in the User management via SQL field and enter the admin user password. This disables user management through other interfaces.
Otherwise, select Disabled.
-
If you want to manage databases via SQL, select Enabled from the drop-down list in the Managing databases via SQL field. This disables database management through other interfaces. The field is inactive if user management via SQL is disabled.
Otherwise, select Disabled.
Warning
Once enabled, user and database management via SQL cannot be disabled. If required, you can enable these settings later when editing the cluster settings.
-
Username and password.
The username may contain Latin letters, numbers, hyphens, and underscores but must begin with a letter or underscore. The password must be from 8 to 128 characters long.
-
DB name. The DB name may contain Latin letters, numbers, and underscores. It may be up to 63 characters long. You cannot create a database named default.
-
Enable hybrid storage for the cluster, if required.
Warning
You cannot disable this option.
-
Configure the DBMS settings, if required. You can specify them later.
Using the Yandex Cloud interfaces, you can manage a limited number of settings. Using SQL queries, you can apply ClickHouse® settings at the query level.
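The username, password, and DB name rules given under DBMS settings can be checked locally before you submit the form or run a command. The sketch below is an illustration only: the function names are hypothetical, it encodes just the rules stated in this section, and it counts password length in characters of the current locale.

```shell
# Hypothetical validators for the naming rules stated in this section.

# Username: Latin letters, numbers, hyphens, underscores;
# must begin with a letter or underscore.
valid_username() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[A-Za-z_][A-Za-z0-9_-]*$'
}

# Password: from 8 to 128 characters long.
valid_password() {
  local len=${#1}
  [ "$len" -ge 8 ] && [ "$len" -le 128 ]
}

# DB name: Latin letters, numbers, underscores; up to 63 characters;
# the name "default" is not allowed.
valid_db_name() {
  [ "$1" != "default" ] &&
    printf '%s' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9_]{1,63}$'
}
```

Each function returns a zero exit status when the value passes, so it can be used directly in an `if` condition.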
-
-
Under Network settings, select a cloud network to host the cluster and security groups for cluster network traffic. You may need to additionally set up security groups to be able to connect to the cluster.
-
Under Hosts, select the parameters of database hosts created together with the cluster. To change the settings of a host, click the icon in the line with its number:
- Availability zone: Select an availability zone.
- Subnet: Specify a subnet in the selected availability zone.
- Public access: Allow access to the host from the internet.
To add hosts to the cluster, click Add host.
-
Configure cluster service settings, if required:
-
Backup start time (UTC): Time interval during which the cluster backup starts. Time is specified in 24-hour UTC format. The default time is 22:00 - 23:00 UTC.
-
Retention period for automatic backups, days: Retention period for automatic backups, in days. If an automatic backup expires, it is deleted. The default is 7 days. For more information, see Backups.
Changing the retention period affects both new automatic backups and existing backups. For example, the initial retention period was 7 days, and the remaining lifetime of a single automatic backup is 1 day. If the retention period increases to 9 days, the remaining lifetime for this backup will now be 3 days.
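The arithmetic in this example is simply the remaining lifetime plus the change in the retention period: 1 + (9 - 7) = 3. A minimal sketch follows. The function name is hypothetical, and clamping a negative result to zero (an already expired backup) is an assumption, not documented behavior.

```shell
# remaining_after_change <old_retention_days> <new_retention_days> <remaining_days>
# Hypothetical helper: prints the remaining lifetime of an existing automatic
# backup after the retention period changes.
remaining_after_change() {
  local remaining=$(( $3 + $2 - $1 ))
  # Assumption: a non-positive result means the backup has already expired.
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}
```

With the values from the example, `remaining_after_change 7 9 1` prints `3`.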
-
Maintenance window: Maintenance window settings:
- To enable maintenance at any time, select arbitrary (default).
- To specify the preferred maintenance start time, select by schedule and specify the desired day of the week and UTC hour. For example, you can choose a time when the cluster is least loaded.
Maintenance operations are carried out both on enabled and disabled clusters. They may include updating the DBMS, applying patches, and so on.
-
DataLens access: This option allows you to analyze cluster data in Yandex DataLens.
-
WebSQL access: Enables you to run SQL queries against cluster databases from the Yandex Cloud management console using Yandex WebSQL.
-
Access from Metrica and AppMetrica: This option helps import data from AppMetrica to a cluster.
-
Serverless access: Enable this option to allow cluster access from Yandex Cloud Functions. For more information about setting up access, see the Cloud Functions documentation.
-
Yandex Query access: Enable this option to allow cluster access from Yandex Query. This feature is at the Preview stage.
-
Deletion protection: Manages protection of the cluster, its databases, and users against accidental deletion.
Deletion protection does not prevent a user from connecting to the cluster manually and deleting database contents.
-
-
Click Create cluster.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To create a Managed Service for ClickHouse® cluster:
-
Check whether the folder has any subnets for the cluster hosts:
yc vpc subnet list
If there are no subnets in the folder, create the required subnets in VPC.
-
View the description of the create cluster CLI command:
yc managed-clickhouse cluster create --help
-
Specify cluster parameters in the create command (the list of supported parameters in the example is not exhaustive):
yc managed-clickhouse cluster create \
  --name <cluster_name> \
  --environment <environment> \
  --network-name <network_name> \
  --host type=<host_type>,zone-id=<availability_zone>,subnet-id=<subnet_ID>,assign-public-ip=<public_access_to_host> \
  --clickhouse-resource-preset <host_class> \
  --clickhouse-disk-type <network-hdd|network-ssd|network-ssd-nonreplicated|local-ssd> \
  --clickhouse-disk-size <storage_size_in_GB> \
  --user name=<username>,password=<user_password> \
  --database name=<DB_name> \
  --security-group-ids <list_of_security_group_IDs> \
  --websql-access=<true_or_false> \
  --deletion-protection
You need to specify subnet-id if the selected availability zone has two or more subnets.
Where:
- --environment: Cluster environment, prestable or production.
- --host: Host parameters:
  - type: Host type, clickhouse or zookeeper.
  - zone-id: Availability zone.
  - assign-public-ip: Internet access to the host via a public IP address, true or false.
- --clickhouse-disk-type: Disk type.
  Warning
  You cannot change disk type after you create a cluster.
- --websql-access: Enables SQL queries against cluster databases from the Yandex Cloud management console using Yandex WebSQL. The default value is false.
- --deletion-protection: Cluster deletion protection.
  Deletion protection does not prevent a user from connecting to the cluster manually and deleting database contents.
You can manage cluster users and databases via SQL.
Warning
Once enabled, user and database management via SQL cannot be disabled. If required, you can enable these settings later when editing the cluster settings.
-
To enable SQL user management:
- Set --enable-sql-user-management to true.
- Set a password for the admin user in the --admin-password parameter.
yc managed-clickhouse cluster create \
  ...
  --enable-sql-user-management true \
  --admin-password "<admin_user_password>"
-
To enable SQL database management:
- Set --enable-sql-user-management and --enable-sql-database-management to true.
- Set a password for the admin user in the --admin-password parameter.
yc managed-clickhouse cluster create \
  ...
  --enable-sql-user-management true \
  --enable-sql-database-management true \
  --admin-password "<admin_user_password>"
-
To allow access to the cluster from Yandex Cloud Functions, provide the --serverless-access parameter. For more information about setting up access, see the Cloud Functions documentation.
-
To allow access to the cluster from Yandex Query, provide the --yandexquery-access=true parameter. This feature is at the Preview stage.
-
To enable ClickHouse® Keeper in a cluster, set the --embedded-keeper parameter to true.
yc managed-clickhouse cluster create \
  ...
  --embedded-keeper true
Alert
You can't disable ClickHouse® Keeper after you create a cluster. ZooKeeper hosts will also become unavailable.
-
To configure hybrid storage settings:
-
Set the --cloud-storage parameter to true to enable hybrid storage.
Note
Once enabled, hybrid storage cannot be disabled.
-
Provide the hybrid storage settings in the relevant parameters:
- --cloud-storage-data-cache: Allows you to cache files in cluster storage. This setting is enabled by default (set to true).
- --cloud-storage-data-cache-max-size: Sets the maximum cache size (in bytes) allocated in cluster storage for files. The default value is 1073741824 (1 GB).
- --cloud-storage-move-factor: Sets the minimum share of free space in cluster storage. If the actual value is less than this setting value, the data is moved to Yandex Object Storage. The minimum value is 0, the maximum one is 1, and the default one is 0.01.
- --cloud-storage-prefer-not-to-merge: Disables data part merges in cluster and object storage. To disable merges, set the parameter to true or provide it with no value. To keep merges enabled, set the parameter to false or do not provide it in the CLI command when creating a cluster.
yc managed-clickhouse cluster create \
  ...
  --cloud-storage=true \
  --cloud-storage-data-cache=<file_storage> \
  --cloud-storage-data-cache-max-size=<memory_size_in_bytes> \
  --cloud-storage-move-factor=<share_of_free_space> \
  --cloud-storage-prefer-not-to-merge=<merging_data_parts>
  ...
Where:
- --cloud-storage-data-cache: Store files in cluster storage, true or false.
- --cloud-storage-prefer-not-to-merge: Disables merging of data parts in cluster and object storage, true or false.
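To make the move factor concrete, the comparison it describes can be sketched as follows. This is an illustration of the stated rule only (data is moved when the actual share of free space is below the setting). The function name is hypothetical, and the service performs this check itself.

```shell
# should_move_to_object_storage <free_bytes> <total_bytes> <move_factor>
# Hypothetical helper: succeeds (exit status 0) when the share of free space
# in cluster storage is below the move factor.
should_move_to_object_storage() {
  awk -v free="$1" -v total="$2" -v factor="$3" \
    'BEGIN { exit !((free / total) < factor) }'
}
```

With the default factor of 0.01, a storage with 5 of 1000 units free (0.5%) would trigger a move, while one with 50 of 1000 units free (5%) would not.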
-
-
Note
When creating a cluster, the anytime maintenance mode is set by default. You can set a specific maintenance period when updating the cluster settings.
With Terraform
Terraform is distributed under the Business Source License
For more information about the provider resources, see the documentation on the Terraform
To create a Managed Service for ClickHouse® cluster:
-
Using the command line, navigate to the folder that will contain the Terraform configuration files with an infrastructure plan. Create the directory if it does not exist.
-
If you don't have Terraform, install it and configure the Yandex Cloud provider.
-
Create a configuration file describing the cloud network and subnets.
- Network: Description of the cloud network where the cluster will be hosted. If you already have a suitable network, you do not need to describe it again.
- Subnets: Subnets to connect the cluster hosts to. If you already have suitable subnets, you do not need to describe them again.
Example structure of a configuration file that describes a cloud network with a single subnet:
resource "yandex_vpc_network" "<network_name_in_Terraform>" {
  name = "<network_name>"
}

resource "yandex_vpc_subnet" "<subnet_name_in_Terraform>" {
  name           = "<subnet_name>"
  zone           = "<availability_zone>"
  network_id     = yandex_vpc_network.<network_name_in_Terraform>.id
  v4_cidr_blocks = ["<subnet>"]
}
-
Create a configuration file with a description of the cluster and its hosts.
- Database cluster: Description of the cluster and its hosts. If required, you can also do the following here:
-
Specify DBMS settings. You can specify them later.
Using the Yandex Cloud interfaces, you can manage a limited number of settings. Using SQL queries, you can apply ClickHouse® settings at the query level.
-
Enable deletion protection.
Deletion protection does not prevent a user from connecting to the cluster manually and deleting database contents.
-
Example structure of a configuration file that describes a cluster with a single host:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  name                = "<cluster_name>"
  environment         = "<environment>"
  network_id          = yandex_vpc_network.<network_name_in_Terraform>.id
  security_group_ids  = ["<list_of_security_group_IDs>"]
  deletion_protection = <cluster_deletion_protection>

  clickhouse {
    resources {
      resource_preset_id = "<host_class>"
      disk_type_id       = "<disk_type>"
      disk_size          = <storage_size_in_GB>
    }
  }

  database {
    name = "<DB_name>"
  }

  user {
    name     = "<DB_user_name>"
    password = "<password>"
    permission {
      database_name = "<name_of_DB_to_create_user_in>"
    }
  }

  host {
    type             = "CLICKHOUSE"
    zone             = "<availability_zone>"
    subnet_id        = yandex_vpc_subnet.<subnet_name_in_Terraform>.id
    assign_public_ip = <public_access_to_host>
  }
}
-
To set up the maintenance window (for disabled clusters as well), add the maintenance_window block to the cluster description:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  maintenance_window {
    type = <maintenance_type>
    day  = <day_of_week>
    hour = <hour>
  }
  ...
}
Where:
- type: Maintenance type. The possible values include:
  - anytime: Anytime.
  - weekly: By schedule.
- day: Day of the week for the weekly type in DDD format, e.g., MON.
- hour: Hour of the day for the weekly type in the HH format, e.g., 21.
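Before running Terraform, the maintenance window values can be sanity-checked with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the day list is inferred from the MON example, and the hour check only requires a number no greater than 24 because this section does not state the exact allowed range.

```shell
# valid_maintenance_window <type> [<day> <hour>]
# Hypothetical pre-check for the maintenance_window values described above.
valid_maintenance_window() {
  local type="$1" day="$2" hour="$3"
  case "$type" in
    anytime) return 0 ;;   # day and hour are not needed
    weekly)  ;;            # continue with the day and hour checks
    *)       return 1 ;;
  esac
  # Assumption: days use three-letter uppercase abbreviations, as in MON.
  case "$day" in
    MON|TUE|WED|THU|FRI|SAT|SUN) ;;
    *) return 1 ;;
  esac
  # The hour must be a non-empty string of digits.
  case "$hour" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # Assumption: the exact allowed range is not stated here, so only an
  # upper bound of 24 is enforced.
  [ "$hour" -le 24 ]
}
```

For example, `valid_maintenance_window weekly MON 21` succeeds, while `valid_maintenance_window weekly Monday 21` fails.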
-
To enable access from other services and allow running SQL queries from the management console using Yandex WebSQL, add a section named access with the settings you need:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  access {
    data_lens    = <access_from_DataLens>
    metrika      = <access_from_Metrica_and_AppMetrica>
    serverless   = <access_from_Cloud_Functions>
    yandex_query = <access_from_Yandex_Query>
    web_sql      = <run_SQL_queries_from_management_console>
  }
  ...
}
Where:
- data_lens: Access from DataLens, true or false.
- metrika: Access from Yandex Metrica and AppMetrica, true or false.
- serverless: Access from Cloud Functions, true or false.
- yandex_query: Access from Yandex Query, true or false.
- web_sql: Execution of SQL queries from the management console, true or false.
-
-
You can manage cluster users and databases via SQL.
Warning
Once enabled, user and database management via SQL cannot be disabled. If required, you can enable these settings later when editing the cluster settings.
-
To enable user management via SQL, expand the cluster description to include a sql_user_management field set to true and an admin_password field containing the password for the admin account:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  name                = "<cluster_name>"
  ...
  admin_password      = "<admin_password>"
  sql_user_management = true
  ...
}
-
To enable database management via SQL, expand the cluster description to include a sql_user_management field and a sql_database_management field, both set to true, as well as the admin_password field containing the password for the admin account:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  name                    = "<cluster_name>"
  ...
  admin_password          = "<admin_password>"
  sql_database_management = true
  sql_user_management     = true
  ...
}
-
For more information about the resources you can create with Terraform, see the provider documentation.
-
Check that the Terraform configuration files are correct:
-
Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.
-
Run the command:
terraform validate
If there are errors in the configuration files, Terraform will point to them.
-
-
Create a cluster:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
-
Time limits
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
-
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Use the Cluster.Create method and send the following request, e.g., via cURL:
-
Create a file named body.json and add the following contents to it:
Note
This example does not use all available parameters.
{
  "folderId": "<folder_ID>",
  "name": "<cluster_name>",
  "environment": "<environment>",
  "networkId": "<network_ID>",
  "securityGroupIds": [
    "<security_group_1_ID>",
    "<security_group_2_ID>",
    ...
    "<security_group_N_ID>"
  ],
  "configSpec": {
    "version": "<ClickHouse®_version>",
    "embeddedKeeper": <ClickHouse®_Keeper_usage>,
    "clickhouse": {
      "resources": {
        "resourcePresetId": "<ClickHouse®_host_class>",
        "diskSize": "<storage_size_in_bytes>",
        "diskTypeId": "<disk_type>"
      }
    },
    "zookeeper": {
      "resources": {
        "resourcePresetId": "<ZooKeeper_host_class>",
        "diskSize": "<storage_size_in_bytes>",
        "diskTypeId": "<disk_type>"
      }
    },
    "access": {
      "dataLens": <access_from_DataLens>,
      "webSql": <run_SQL_queries_from_management_console>,
      "metrika": <access_from_Metrica_and_AppMetrica>,
      "serverless": <access_from_Cloud_Functions>,
      "dataTransfer": <access_from_Data_Transfer>,
      "yandexQuery": <access_from_Yandex_Query>
    },
    "cloudStorage": {
      "enabled": <hybrid_storage_use>,
      "moveFactor": "<share_of_free_space>",
      "dataCacheEnabled": <temporary_file_storage>,
      "dataCacheMaxSize": "<maximum_cache_size_for_file_storage>",
      "preferNotToMerge": <disabling_merge_of_data_parts>
    },
    "adminPassword": "<admin_user_password>",
    "sqlUserManagement": <user_management_via_SQL>,
    "sqlDatabaseManagement": <database_management_via_SQL>
  },
  "databaseSpecs": [
    {
      "name": "<DB_name>"
    },
    { <similar_settings_for_database_2> },
    { ... },
    { <similar_settings_for_database_N> }
  ],
  "userSpecs": [
    {
      "name": "<username>",
      "password": "<user_password>",
      "permissions": [
        {
          "databaseName": "<DB_name>"
        }
      ]
    },
    { <similar_configuration_for_user_2> },
    { ... },
    { <similar_configuration_for_user_N> }
  ],
  "hostSpecs": [
    {
      "zoneId": "<availability_zone>",
      "type": "<host_type>",
      "subnetId": "<subnet_ID>",
      "assignPublicIp": <public_access_to_host>,
      "shardName": "<shard_name>"
    },
    { <similar_configuration_for_host_2> },
    { ... },
    { <similar_configuration_for_host_N> }
  ],
  "deletionProtection": <deletion_protection>
}
Where:
- name: Cluster name.
- environment: Cluster environment, PRODUCTION or PRESTABLE.
- networkId: ID of the network the cluster will be in.
- securityGroupIds: Security group IDs as an array of strings. Each string is a security group ID.
- configSpec: Cluster configuration:
  - version: ClickHouse® version, 24.3, 24.8, 24.9, or 24.10.
  - embeddedKeeper: Using ClickHouse® Keeper instead of ZooKeeper, true or false.
    This setting determines how replication will be managed in a cluster of multiple ClickHouse® hosts:
    - If true, the replication will be managed using ClickHouse® Keeper.
      Alert
      You can't disable ClickHouse® Keeper after you create a cluster. ZooKeeper hosts will also become unavailable.
    - If undefined or false, the replication and query distribution will be managed using ZooKeeper.
  - clickhouse: ClickHouse® configuration:
    - resources.resourcePresetId: Host class ID. You can request the list of available host classes with their IDs using the ResourcePreset.list method.
    - resources.diskSize: Disk size in bytes.
    - resources.diskTypeId: Disk type.
  - zookeeper: ZooKeeper configuration:
    - resources.resourcePresetId: Host class ID. You can request the list of available host classes with their IDs using the ResourcePreset.list method.
    - resources.diskSize: Disk size in bytes.
    - resources.diskTypeId: Disk type.
    If you enabled ClickHouse® Keeper with the help of the embeddedKeeper: true setting, you do not have to specify a ZooKeeper configuration in configSpec as it will not be applied.
  - access: Settings enabling access to the cluster from other services and SQL queries from the management console using Yandex WebSQL:
    - dataLens: Enable access from DataLens, true or false. The default value is false. For more information about setting up a connection, see Connecting from DataLens.
    - webSql: Enables SQL queries against cluster databases from the Yandex Cloud management console using Yandex WebSQL, true or false. The default value is false.
    - metrika: Enables data import from AppMetrica to your cluster, true or false. The default value is false.
    - serverless: Enable access to the cluster from Yandex Cloud Functions, true or false. The default value is false. For more information about setting up access, see the Cloud Functions documentation.
    - dataTransfer: Enable access to the cluster from Yandex Data Transfer in Serverless mode, true or false. The default value is false.
      This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. As a result, other operations, e.g., transfer launch and deactivation, will run faster.
    - yandexQuery: Enable access to the cluster from Yandex Query, true or false. This feature is at the Preview stage. The default value is false.
  - cloudStorage: Hybrid storage settings:
    - enabled: Enable hybrid storage in the cluster if it is disabled, true or false. Default value: false (disabled).
      Note
      Once enabled, hybrid storage cannot be disabled.
    - moveFactor: Minimum share of free space in cluster storage. If the actual share of free space is below this value, the data will be moved to Yandex Object Storage.
      Minimum value: 0; maximum value: 1; default: 0.01.
    - dataCacheEnabled: Allow temporary storage of files in cluster storage, true or false.
      Default value: true (enabled).
    - dataCacheMaxSize: Maximum cache size (in bytes) allocated in cluster storage for temporary file storage.
      Default value: 1073741824 (1 GB).
    - preferNotToMerge: Disable merging of data parts in cluster and object storage, true or false.
      To disable merging, set to true. To leave merging enabled, set to false.
  - sql... and adminPassword: Group of settings for user and database management via SQL:
    - adminPassword: admin user password.
    - sqlUserManagement: User management via SQL, true or false.
    - sqlDatabaseManagement: Database management via SQL, true or false. For that, you also need to enable user management through SQL.
    Warning
    Once enabled, user and database management via SQL cannot be disabled. If required, you can enable these settings later when editing the cluster settings.
- databaseSpecs: Database settings as an array of name element parameters. Each parameter contains a name of a separate database.
- userSpecs: User settings as an array of elements, one for each user. Each element has the following structure:
  - name: Username. It may contain Latin letters, numbers, hyphens, and underscores, and must start with a letter or underscore.
  - password: User password. The password must be from 8 to 128 characters long.
  - permissions: List of DBs the user must have access to.
    The list appears as an array of databaseName parameters. Each parameter contains the name of a separate database.
- hostSpecs: Cluster host settings as an array of elements, one for each host. Each element has the following structure:
  - type: Host type, CLICKHOUSE or ZOOKEEPER.
    If you enabled ClickHouse® Keeper with the help of the embeddedKeeper: true setting, specify only ClickHouse® host settings in hostSpecs.
  - zoneId: Availability zone.
  - subnetId: Subnet ID.
  - shardName: Shard name. The setting is only relevant for CLICKHOUSE-type hosts.
  - assignPublicIp: Internet access to the host via a public IP address, true or false.
  If you are creating a multi-host cluster without using ClickHouse® Keeper, the following rules apply to ZooKeeper hosts:
  - If the cluster cloud network has subnets in each availability zone, and ZooKeeper host settings are not specified, then one such host will automatically be added to each subnet.
  - If only some availability zones in the cluster's network have subnets, specify the ZooKeeper host settings explicitly.
- deletionProtection: Protect the cluster, its databases, and users against accidental deletion, true or false. The default value is false.
  Deletion protection does not prevent a user from connecting to the cluster manually and deleting database contents.
You can request the folder ID with the list of folders in the cloud.
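Note that the API expects disk sizes in bytes, whereas the CLI example earlier takes the size in GB. A minimal conversion sketch follows, using the same convention as this page, where 1073741824 bytes is 1 GB. The function name is hypothetical.

```shell
# gb_to_bytes <size_in_GB>
# Hypothetical helper: converts a size in GB to bytes for the diskSize field,
# using 1 GB = 1073741824 bytes as elsewhere on this page.
gb_to_bytes() {
  echo $(( $1 * 1073741824 ))
}
```

For example, `gb_to_bytes 10` prints `10737418240`, which can be used as the `diskSize` value in body.json.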
-
-
Run this request:
curl \
  --request POST \
  --header "Authorization: Bearer $IAM_TOKEN" \
  --header "Content-Type: application/json" \
  --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters' \
  --data '@body.json'
-
-
View the server response to make sure the request was successful.
-
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Clone the cloudapi repository:
cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
-
Use the ClusterService.Create call and send the following request, e.g., via gRPCurl:
-
Create a file named body.json and add the following contents to it:
Note
This example does not use all available parameters.
{
  "folder_id": "<folder_ID>",
  "name": "<cluster_name>",
  "environment": "<environment>",
  "network_id": "<network_ID>",
  "security_group_ids": [
    "<security_group_1_ID>",
    "<security_group_2_ID>",
    ...
    "<security_group_N_ID>"
  ],
  "config_spec": {
    "version": "<ClickHouse®_version>",
    "embedded_keeper": <ClickHouse®_Keeper_usage>,
    "clickhouse": {
      "resources": {
        "resource_preset_id": "<ClickHouse®_host_class>",
        "disk_size": "<storage_size_in_bytes>",
        "disk_type_id": "<disk_type>"
      }
    },
    "zookeeper": {
      "resources": {
        "resource_preset_id": "<ZooKeeper_host_class>",
        "disk_size": "<storage_size_in_bytes>",
        "disk_type_id": "<disk_type>"
      }
    },
    "access": {
      "data_lens": <access_from_DataLens>,
      "web_sql": <run_SQL_queries_from_management_console>,
      "metrika": <access_from_Metrica_and_AppMetrica>,
      "serverless": <access_from_Cloud_Functions>,
      "data_transfer": <access_from_Data_Transfer>,
      "yandex_query": <access_from_Yandex_Query>
    },
    "cloud_storage": {
      "enabled": <hybrid_storage_use>,
      "move_factor": "<share_of_free_space>",
      "data_cache_enabled": <temporary_file_storage>,
      "data_cache_max_size": "<maximum_cache_size_for_file_storage>",
      "prefer_not_to_merge": <disabling_merge_of_data_parts>
    },
    "admin_password": "<admin_user_password>",
    "sql_user_management": <user_management_via_SQL>,
    "sql_database_management": <database_management_via_SQL>
  },
  "database_specs": [
    {
      "name": "<DB_name>"
    },
    { <similar_settings_for_database_2> },
    { ... },
    { <similar_settings_for_database_N> }
  ],
  "user_specs": [
    {
      "name": "<username>",
      "password": "<user_password>",
      "permissions": [
        {
          "database_name": "<DB_name>"
        }
      ]
    },
    { <similar_configuration_for_user_2> },
    { ... },
    { <similar_configuration_for_user_N> }
  ],
  "host_specs": [
    {
      "zone_id": "<availability_zone>",
      "type": "<host_type>",
      "subnet_id": "<subnet_ID>",
      "assign_public_ip": <public_access_to_host>,
      "shard_name": "<shard_name>"
    },
    { <similar_configuration_for_host_2> },
    { ... },
    { <similar_configuration_for_host_N> }
  ],
  "deletion_protection": <deletion_protection>
}
Where:
-
name
: Cluster name. -
environment
: Cluster environment,PRODUCTION
orPRESTABLE
. -
network_id
: ID of the network the cluster will be in. -
security_group_ids
: Security group IDs as an array of strings. Each string is a security group ID. -
config_spec
: Cluster configuration:-
version
: ClickHouse® version, 24.3, 24.8, 24.9, or 24.10. -
embedded_keeper
: Using ClickHouse® Keeper instead of ZooKeeper,true
orfalse
.This setting determines how replication will be managed in a cluster of multiple ClickHouse® hosts:
-
If
true
, the replication will be managed using ClickHouse® Keeper.Alert
You can't disable ClickHouse® Keeper after you create a cluster. ZooKeeper hosts will also become unavailable.
-
If undefined or
false
, the replication and query distribution will be managed using ZooKeeper.
-
-
clickhouse
: ClickHouse® configuration:resources.resource_preset_id
: Host class ID. You can request the list of available host classes with their IDs using the ResourcePreset.list method.resources.disk_size
: Disk size in bytes.resources.disk_type_id
: Disk type.
-
zookeeper
: ZooKeeper configuration:
- resources.resource_preset_id: Host class ID. You can request the list of available host classes with their IDs using the ResourcePreset.list method.
- resources.disk_size: Disk size in bytes.
- resources.disk_type_id: Disk type.
If you enabled ClickHouse® Keeper using the embedded_keeper: true setting, you do not have to specify a ZooKeeper configuration in config_spec, as it will not be applied.
-
access
: Settings enabling access to the cluster from other services and SQL queries from the management console using Yandex WebSQL:-
data_lens
: Enable access from DataLens,true
orfalse
. The default value isfalse
. For more information about setting up a connection, see Connecting from DataLens. -
web_sql
: Enables SQL queries against cluster databases from the Yandex Cloud management console using Yandex WebSQL:true
orfalse
. The default value isfalse
. -
metrika
: Enables data import from AppMetrica to your cluster :true
orfalse
. The default value isfalse
. -
serverless
: Enable access to the cluster from Yandex Cloud Functions,true
orfalse
. The default value isfalse
. For more information about setting up access, see the Cloud Functions documentation. -
data_transfer
: Enable access to the cluster from Yandex Data Transfer in Serverless mode,true
orfalse
. The default value isfalse
.This will enable you to connect to Yandex Data Transfer running in Kubernetes via a special network. As a result, other operations, e.g., transfer launch and deactivation, will run faster.
-
yandex_query
: Enable access to the cluster from Yandex Query, true or false. This feature is at the Preview stage. Default value: false.
-
-
cloud_storage
: Hybrid storage settings:-
enabled
: Enable hybrid storage in the cluster if it is disabled,true
orfalse
. Default value:false
(disabled).Note
Once enabled, hybrid storage cannot be disabled.
-
move_factor
: Minimum share of free space in cluster storage. If the share of free space drops below this value, the data will be moved to Yandex Object Storage. Minimum value:
0
; maximum value:1
; default:0.01
. -
data_cache_enabled
: Allow temporary storage of files in cluster storage,true
orfalse
.Default value:
true
(enabled). -
data_cache_max_size
: Maximum cache size (in bytes) allocated in cluster storage for temporary file storage.Default value:
1073741824
(1 GB). -
prefer_not_to_merge
: Disable merging of data parts in cluster and object storage,true
orfalse
.To disable merging, set to
true
. To leave merging enabled, set tofalse
.
-
-
sql_user_management, sql_database_management, and admin_password
: Group of settings for user and database management via SQL:
- admin_password: Password of the admin user.
- sql_user_management: User management via SQL, true or false.
- sql_database_management: Database management via SQL, true or false. This also requires user management via SQL to be enabled.
Warning
Once enabled, user and database management via SQL cannot be disabled. If you do not need these settings now, you can enable them later when editing the cluster settings.
-
-
database_specs
: Database settings as an array ofname
element parameters. Each parameter contains the name of a separate database. -
user_specs
: User settings as an array of elements, one for each user. Each element has the following structure:-
name
: Username. It may contain Latin letters, numbers, hyphens, and underscores, and must start with a letter or underscore. -
password
: User password. The password must be from 8 to 128 characters long. -
permissions
: List of DBs the user must have access to.The list appears as an array of
database_name
parameters. Each parameter contains the name of a separate database.
-
-
host_specs
: Cluster host settings as an array of elements, one for each host. Each element has the following structure:-
type
: Host type:CLICKHOUSE
orZOOKEEPER
.If you enabled ClickHouse® Keeper with the help of the
embedded_keeper: true
setting, specify only ClickHouse® host settings inhost_specs
. -
zone_id
: Availability zone. -
subnet_id
: Subnet ID. -
shard_name
: Shard name. The setting is only relevant forCLICKHOUSE
-type hosts. -
assign_public_ip
: Internet access to the host via a public IP address,true
orfalse
.
If you are creating a multi-host cluster without using ClickHouse® Keeper, the following rules apply to ZooKeeper hosts:
-
If the cluster cloud network has subnets in each availability zone, and ZooKeeper host settings are not specified, then one such host will automatically be added to each subnet.
-
If only some availability zones in the cluster's network have subnets, specify the ZooKeeper host settings explicitly.
-
-
deletion_protection
: Protect the cluster, its databases, and users against accidental deletion,true
orfalse
. The default value isfalse
. Even with deletion protection enabled, it is still possible to connect to the cluster manually and delete database contents.
You can request the folder ID with the list of folders in the cloud.
-
-
Run this request:
grpcurl \
  -format json \
  -import-path ~/cloudapi/ \
  -import-path ~/cloudapi/third_party/googleapis/ \
  -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
  -rpc-header "Authorization: Bearer $IAM_TOKEN" \
  -d @ \
  mdb.api.cloud.yandex.net:443 \
  yandex.cloud.mdb.clickhouse.v1.ClusterService.Create \
  < body.json
-
-
View the server response to make sure the request was successful.
Warning
If you specified security group IDs when creating a cluster, you may also need to configure the security groups to be able to connect to the cluster.
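The request body described above can also be generated by a script rather than written by hand. The following is a minimal sketch, not an official client: the helper name `build_create_request` and the example values (host class, disk size, placeholder IDs) are illustrative assumptions, while the field names and the checks (username characters, password length of 8 to 128 characters, `move_factor` between 0 and 1) follow the parameter descriptions above.

```python
import json
import re

# Username rule from the parameter list above: Latin letters, numbers, hyphens,
# and underscores; the first character must be a letter or an underscore.
USERNAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")


def build_create_request(folder_id, name, network_id, subnet_id, zone_id,
                         user, password, database,
                         environment="PRODUCTION", move_factor=None):
    """Return a minimal single-host request body (hypothetical helper)."""
    if environment not in ("PRODUCTION", "PRESTABLE"):
        raise ValueError("environment must be PRODUCTION or PRESTABLE")
    if not USERNAME_RE.match(user):
        raise ValueError(f"invalid username: {user!r}")
    if not 8 <= len(password) <= 128:
        raise ValueError("password must be 8 to 128 characters long")

    config_spec = {
        "clickhouse": {
            "resources": {
                # Example values only; choose a host class and disk for your case.
                "resource_preset_id": "s2.micro",
                "disk_size": str(20 * 1024 ** 3),  # 20 GB in bytes, quoted as in the template
                "disk_type_id": "network-ssd",
            }
        }
    }
    if move_factor is not None:
        if not 0 <= move_factor <= 1:
            raise ValueError("move_factor must be between 0 and 1")
        # Note: once enabled, hybrid storage cannot be disabled.
        config_spec["cloud_storage"] = {
            "enabled": True,
            "move_factor": str(move_factor),
        }

    return {
        "folder_id": folder_id,
        "name": name,
        "environment": environment,
        "network_id": network_id,
        "config_spec": config_spec,
        "database_specs": [{"name": database}],
        "user_specs": [{
            "name": user,
            "password": password,
            "permissions": [{"database_name": database}],
        }],
        "host_specs": [{
            "zone_id": zone_id,
            "type": "CLICKHOUSE",
            "subnet_id": subnet_id,
        }],
    }


if __name__ == "__main__":
    body = build_create_request(
        folder_id="<folder_ID>", name="mych", network_id="<network_ID>",
        subnet_id="<subnet_ID>", zone_id="ru-central1-a",
        user="user1", password="user1user1", database="db1",
    )
    print(json.dumps(body, indent=2))
```

Redirecting the printed output to `body.json` gives a file that can be passed to the `grpcurl` call from the steps above. Validating locally only catches the simple mistakes listed here; the service remains the authority on what it accepts.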
Creating a cluster copy
You can create a ClickHouse® cluster with the settings of another one you previously created. To do so, you need to import the configuration of the source ClickHouse® cluster to Terraform. This way you can either create an identical copy or use the imported configuration as the baseline and modify it as needed. Importing a configuration is a good idea when the source ClickHouse® cluster has a lot of settings and you need to create a similar one.
To create a ClickHouse® cluster copy:
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually: you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
In the same working directory, place a
.tf
file with the following contents:resource "yandex_mdb_clickhouse_cluster" "old" { }
-
Save the ID of the source ClickHouse® cluster to an environment variable:
export CLICKHOUSE_CLUSTER_ID=<cluster_ID>
You can request the ID with the list of clusters in the folder.
-
Import the settings of the source ClickHouse® cluster into the Terraform configuration:
terraform import yandex_mdb_clickhouse_cluster.old ${CLICKHOUSE_CLUSTER_ID}
-
Get the imported configuration:
terraform show
-
Copy it from the terminal and paste it into the
.tf
file. -
Place the file in the new
imported-cluster
directory. -
Modify the copied configuration so that you can create a new cluster from it:
- Specify the new cluster name in the
resource
string and thename
parameter. - Delete the
created_at
,health
,id
, andstatus
parameters. - In the
host
sections, delete thefqdn
parameters. - If the
clickhouse.config.merge_tree
section hasmax_parts_in_total = 0
, delete this parameter. - If the
maintenance_window
section hastype = "ANYTIME"
, delete thehour
parameter. - If there are
user
sections, add thename
andpassword
parameters to them. - Optionally, make further changes if you need to customize the configuration.
-
Get the authentication credentials in the
imported-cluster
directory. -
In the same directory, configure and initialize a provider. There is no need to create a provider configuration file manually: you can download it
. -
Place the configuration file in the
imported-cluster
directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file. -
Check that the Terraform configuration files are correct:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console
. -
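Part of the manual clean-up of the imported configuration described in the steps above can be scripted. The sketch below is an illustration under stated assumptions, not part of Terraform or the provider: the helper name `strip_read_only` is hypothetical, and it assumes every attribute in the `terraform show` output sits on its own line. It drops the `created_at`, `health`, `id`, `status`, and `fqdn` attributes and a `max_parts_in_total = 0` setting. Renaming the resource, removing `hour` for an `ANYTIME` maintenance window, and adding user names and passwords still have to be done by hand.

```python
import re

# Attributes the procedure above says to delete from the imported configuration.
READ_ONLY_KEYS = {"created_at", "health", "id", "status", "fqdn"}

# Matches a single-line attribute assignment such as `name = "mych"`.
ATTR_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$")


def strip_read_only(config_text):
    """Return the configuration text without the attributes to be deleted."""
    kept = []
    for line in config_text.splitlines():
        match = ATTR_RE.match(line)
        if match:
            key, value = match.group(1), match.group(2)
            if key in READ_ONLY_KEYS:
                continue  # exact key match, so network_id and subnet_id stay
            if key == "max_parts_in_total" and value == "0":
                continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = '''resource "yandex_mdb_clickhouse_cluster" "new" {
  id         = "<cluster_ID>"
  name       = "new"
  network_id = "<network_ID>"
  host {
    fqdn      = "<host_FQDN>"
    subnet_id = "<subnet_ID>"
  }
}'''
    print(strip_read_only(sample))
```

Running `terraform validate` afterwards, as in the steps above, is still the way to confirm that the edited file is a usable configuration.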
Time limits
The Terraform provider limits the time allowed for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts
block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
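The timeout values are compact duration strings made of hour, minute, and second components. As a rough aid for comparing a custom value with the defaults listed above, here is a minimal sketch that converts such a string to minutes. The function names are hypothetical, and the parser handles only whole-number `h`, `m`, and `s` components, which covers the values in this example but is an assumption about the full set of strings Terraform may accept.

```python
import re

# Default limits from the list above, in minutes.
DEFAULT_MINUTES = {"create": 60, "update": 90, "delete": 30}

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_COMPONENT_RE = re.compile(r"(\d+)([hms])")


def duration_to_minutes(value):
    """Convert a string such as "1h30m" to a number of minutes."""
    parts = _COMPONENT_RE.findall(value)
    # Reject strings that contain anything besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    seconds = sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)
    return seconds / 60


def extends_default(operation, value):
    """Tell whether a custom timeout is longer than the default for the operation."""
    return duration_to_minutes(value) > DEFAULT_MINUTES[operation]


if __name__ == "__main__":
    for operation, value in {"create": "1h30m", "update": "2h", "delete": "30m"}.items():
        print(operation, duration_to_minutes(value), extends_default(operation, value))
```

With the values from the example block, the create and update limits are longer than the defaults, while the delete limit equals the default.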
Examples
Creating a single-host cluster
To create a cluster with a single host, provide a single --host
parameter.
Create a Managed Service for ClickHouse® cluster with the following test specifications:
- Name:
mych
. - Environment:
production
. - Network:
default
. - Security group:
enp6saqnq4ie244g67sb
. - Number of ClickHouse® hosts of the
s2.micro
class in theb0rcctk2rvtr********
subnet in theru-central1-a
availability zone: 1. - ClickHouse® Keeper.
- Network SSD storage (
network-ssd
): 20 GB. - User:
user1
with passworduser1user1
. - Database:
db1
. - Protection against accidental cluster deletion: Enabled.
Run the following command:
yc managed-clickhouse cluster create \
--name mych \
--environment=production \
--network-name default \
--clickhouse-resource-preset s2.micro \
--host type=clickhouse,zone-id=ru-central1-a,subnet-id=b0rcctk2rvtr******** \
--embedded-keeper true \
--clickhouse-disk-size 20 \
--clickhouse-disk-type network-ssd \
--user name=user1,password=user1user1 \
--database name=db1 \
--security-group-ids enp6saqnq4ie244g67sb \
--deletion-protection
Create a Managed Service for ClickHouse® cluster and a network for it with the following test specifications:
-
Name:
mych
. -
Environment:
PRESTABLE
. -
Cloud ID:
b1gq90dgh25bebiu75o
. -
Folder ID:
b1gia87mbaomkfvsleds
. -
New cloud network:
cluster-net
. -
New default security group:
cluster-sg
(in thecluster-net
network). It must allow connections to any cluster host from any network (including the internet) on ports8443
and9440
. -
One s2.micro class host in a new subnet named cluster-subnet-ru-central1-a. Subnet parameters:
- Address range:
172.16.1.0/24
. - Network:
cluster-net
. - Availability zone:
ru-central1-a
.
-
Network SSD storage (
network-ssd
): 32 GB. -
Database name:
db1
. -
User:
user1
with passworduser1user1
.
The configuration files for this cluster are as follows:
-
Configuration file with a description of provider settings:
provider.tf
terraform {
  required_providers {
    yandex = {
      source = "yandex-cloud/yandex"
    }
  }
}

provider "yandex" {
  token     = "<service_account_OAuth_or_static_key>"
  cloud_id  = "b1gq90dgh25bebiu75o"
  folder_id = "b1gia87mbaomkfvsleds"
}
To get an OAuth token or a static access key, see the Yandex Identity and Access Management instructions.
-
Configuration file with a description of the cloud network and subnet:
networks.tf
resource "yandex_vpc_network" "cluster-net" {
  name = "cluster-net"
}

resource "yandex_vpc_subnet" "cluster-subnet-a" {
  name           = "cluster-subnet-ru-central1-a"
  zone           = "ru-central1-a"
  network_id     = yandex_vpc_network.cluster-net.id
  v4_cidr_blocks = ["172.16.1.0/24"]
}
-
Configuration file with a description of the security group:
security-groups.tf
resource "yandex_vpc_default_security_group" "cluster-sg" {
  network_id = yandex_vpc_network.cluster-net.id

  ingress {
    description    = "HTTPS (secure)"
    port           = 8443
    protocol       = "TCP"
    v4_cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description    = "clickhouse-client (secure)"
    port           = 9440
    protocol       = "TCP"
    v4_cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description    = "Allow all egress cluster traffic"
    protocol       = "TCP"
    v4_cidr_blocks = ["0.0.0.0/0"]
  }
}
-
Configuration file with a description of the cluster and cluster host:
cluster.tf
resource "yandex_mdb_clickhouse_cluster" "mych" {
  name               = "mych"
  environment        = "PRESTABLE"
  network_id         = yandex_vpc_network.cluster-net.id
  security_group_ids = [yandex_vpc_default_security_group.cluster-sg.id]

  clickhouse {
    resources {
      resource_preset_id = "s2.micro"
      disk_type_id       = "network-ssd"
      disk_size          = 32
    }
  }

  host {
    type      = "CLICKHOUSE"
    zone      = "ru-central1-a"
    subnet_id = yandex_vpc_subnet.cluster-subnet-a.id
  }

  database {
    name = "db1"
  }

  user {
    name     = "user1"
    password = "user1user1"
    permission {
      database_name = "db1"
    }
  }
}
Creating a multi-host cluster
Create a Managed Service for ClickHouse® cluster with the following test specifications:
-
Name:
mych
. -
Environment:
PRESTABLE
. -
Cloud ID:
b1gq90dgh25bebiu75o
. -
Folder ID:
b1gia87mbaomkfvsleds
. -
New cloud network:
cluster-net
. -
Three ClickHouse® hosts of the
s2.micro
class and three ZooKeeper hosts of theb2.medium
class (to ensure replication).One host of each class will be added to the new subnets:
cluster-subnet-ru-central1-a
:172.16.1.0/24
, availability zone:ru-central1-a
.cluster-subnet-ru-central1-b
:172.16.2.0/24
, availability zone:ru-central1-b
.cluster-subnet-ru-central1-d
:172.16.3.0/24
, availability zone:ru-central1-d
.
These subnets will belong to the
cluster-net
network. -
New default security group:
cluster-sg
(in thecluster-net
network). It must allow connections to any cluster host from any network (including the internet) on ports8443
and9440
. -
Network SSD storage (
network-ssd
) for each of the cluster's ClickHouse® hosts: 32 GB. -
Network SSD storage (
network-ssd
) for each of the cluster's ZooKeeper hosts: 10 GB. -
Database name:
db1
. -
User:
user1
with passworduser1user1
.
The configuration files for this cluster are as follows:
-
Configuration file with a description of provider settings:
provider.tf
terraform {
  required_providers {
    yandex = {
      source = "yandex-cloud/yandex"
    }
  }
}

provider "yandex" {
  token     = "<service_account_OAuth_or_static_key>"
  cloud_id  = "b1gq90dgh25bebiu75o"
  folder_id = "b1gia87mbaomkfvsleds"
}
To get an OAuth token or a static access key, see the Yandex Identity and Access Management instructions.
-
Configuration file with a description of the cloud network and subnets:
networks.tf
resource "yandex_vpc_network" "cluster-net" {
  name = "cluster-net"
}

resource "yandex_vpc_subnet" "cluster-subnet-a" {
  name           = "cluster-subnet-ru-central1-a"
  zone           = "ru-central1-a"
  network_id     = yandex_vpc_network.cluster-net.id
  v4_cidr_blocks = ["172.16.1.0/24"]
}

resource "yandex_vpc_subnet" "cluster-subnet-b" {
  name           = "cluster-subnet-ru-central1-b"
  zone           = "ru-central1-b"
  network_id     = yandex_vpc_network.cluster-net.id
  v4_cidr_blocks = ["172.16.2.0/24"]
}

resource "yandex_vpc_subnet" "cluster-subnet-d" {
  name           = "cluster-subnet-ru-central1-d"
  zone           = "ru-central1-d"
  network_id     = yandex_vpc_network.cluster-net.id
  v4_cidr_blocks = ["172.16.3.0/24"]
}
-
Configuration file with a description of the security group:
security-groups.tf
resource "yandex_vpc_default_security_group" "cluster-sg" {
  network_id = yandex_vpc_network.cluster-net.id

  ingress {
    description    = "HTTPS (secure)"
    port           = 8443
    protocol       = "TCP"
    v4_cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description    = "clickhouse-client (secure)"
    port           = 9440
    protocol       = "TCP"
    v4_cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description    = "Allow all egress cluster traffic"
    protocol       = "TCP"
    v4_cidr_blocks = ["0.0.0.0/0"]
  }
}
-
Configuration file with a description of the cluster and cluster hosts:
cluster.tf
resource "yandex_mdb_clickhouse_cluster" "mych" {
  name               = "mych"
  environment        = "PRESTABLE"
  network_id         = yandex_vpc_network.cluster-net.id
  security_group_ids = [yandex_vpc_default_security_group.cluster-sg.id]

  clickhouse {
    resources {
      resource_preset_id = "s2.micro"
      disk_type_id       = "network-ssd"
      disk_size          = 32
    }
  }

  host {
    type      = "CLICKHOUSE"
    zone      = "ru-central1-a"
    subnet_id = yandex_vpc_subnet.cluster-subnet-a.id
  }

  host {
    type      = "CLICKHOUSE"
    zone      = "ru-central1-b"
    subnet_id = yandex_vpc_subnet.cluster-subnet-b.id
  }

  host {
    type      = "CLICKHOUSE"
    zone      = "ru-central1-d"
    subnet_id = yandex_vpc_subnet.cluster-subnet-d.id
  }

  zookeeper {
    resources {
      resource_preset_id = "b2.medium"
      disk_type_id       = "network-ssd"
      disk_size          = 10
    }
  }

  host {
    type      = "ZOOKEEPER"
    zone      = "ru-central1-a"
    subnet_id = yandex_vpc_subnet.cluster-subnet-a.id
  }

  host {
    type      = "ZOOKEEPER"
    zone      = "ru-central1-b"
    subnet_id = yandex_vpc_subnet.cluster-subnet-b.id
  }

  host {
    type      = "ZOOKEEPER"
    zone      = "ru-central1-d"
    subnet_id = yandex_vpc_subnet.cluster-subnet-d.id
  }

  database {
    name = "db1"
  }

  user {
    name     = "user1"
    password = "user1user1"
    permission {
      database_name = "db1"
    }
  }
}
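Before applying a multi-subnet configuration like the one in this example, it can be useful to confirm that the subnet address ranges do not overlap. The check below is a small sketch using Python's standard `ipaddress` module; the helper name `find_overlaps` is hypothetical, the subnet names and ranges are the ones from this example, and the requirement that ranges within one network must not overlap is stated here as an assumption rather than taken from this page.

```python
import ipaddress
from itertools import combinations

# Subnet names and address ranges from the multi-host example above.
SUBNETS = {
    "cluster-subnet-ru-central1-a": "172.16.1.0/24",
    "cluster-subnet-ru-central1-b": "172.16.2.0/24",
    "cluster-subnet-ru-central1-d": "172.16.3.0/24",
}


def find_overlaps(subnets):
    """Return pairs of subnet names whose address ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (first, second)
        for first, second in combinations(networks, 2)
        if networks[first].overlaps(networks[second])
    ]


if __name__ == "__main__":
    overlaps = find_overlaps(SUBNETS)
    print("overlapping subnets:", overlaps if overlaps else "none")
```

`ipaddress.ip_network` also raises `ValueError` for a malformed range or one with host bits set, so a typo in a CIDR block is caught before `terraform plan` is run.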
Managing database connection parameters using Connection Manager
If your cloud or folder has access to the Connection Manager public preview, a new connection entity will appear in your folder after you create a cluster. You can use it to manage database connection parameters.
Passwords and other sensitive data will be stored in a Yandex Lockbox secret. To see which secrets store connection information for your cluster, select Lockbox in the list of services in your folder. You will find your cluster's ID on the Secrets page in the secret dependencies column.
You can also use Connection Manager to configure access to connections.
ClickHouse® is a registered trademark of ClickHouse, Inc.