Migrating data to Managed Service for ClickHouse® using ClickHouse®
You can migrate data from your ClickHouse® cluster to a Managed Service for ClickHouse® cluster by using:
- Built-in
remotefunction. This method is suitable for migrating individual tables. - Built-in backup/restore commands and an Yandex Object Storage bucket. This method can be used to migrate individual tables as well as an entire database.
You can also migrate a database from a ClickHouse® cluster to a Managed Service for ClickHouse® cluster using Data Transfer. For more information about this method, see this tutorial.
Data migration using the remote function
You can use remote to migrate individual tables from a third-party ClickHouse® cluster. This method does not require installing ZooKeeper, additional tools, or upgrading the source cluster's ClickHouse® version.
Tip
Before migrating data, we recommend pausing merge operations on the source cluster using the STOP MERGES and STOP TTL MERGES commands. You should also disable all consumers.
To migrate a table from a third-party ClickHouse® cluster to a Managed Service for ClickHouse® cluster:
-
Create a Managed Service for ClickHouse® target cluster.
-
Connect to the source cluster.
-
Get a DDL statement for table creation:
SELECT create_table_query FROM system.tables WHERE database = '<name_of_DB_to_migrate>';For example, you might have a database named
db1with a list of tasks stored in a table namedtasks. The response to your query will look as follows:CREATE TABLE db1.tasks (`task_id` Int32, `title` String, `start_date` Date, `due_date` Date, `priority` Int8 DEFAULT 3, `description` String) ENGINE = MergeTree PRIMARY KEY tuple(task_id) ORDER BY tuple(task_id) SETTINGS index_granularity = 8192; -
Connect to the target Managed Service for ClickHouse® cluster and create a new table using the DDL statement you received earlier.
If the source cluster did not use replicas, while the target cluster does, switch to a Replicated table engine.
To create an object on all hosts of the target cluster, use the
ON CLUSTERclause in theCREATEcommand. -
Run the following query on the source cluster:
INSERT INTO FUNCTION remoteSecure('<Managed_Service_for_ClickHouse®_cluster_host_FQDN>:9440', '<target_cluster_DB_name>.<target_table_name>', '<username_in_target_cluster>', '<user_password_in_target_cluster>') SELECT * from <DB_name>.<table_name>;To learn how to get a host FQDN, see this guide.
-
On the target cluster, verify that the table from the source cluster has appeared in the database:
SHOW TABLES FROM <DB_name>; -
Check that the target table contains the same data as the source table:
SELECT * FROM <DB_name>.<table_name>;
For more details on using the remote function, see this ClickHouse® article
Migrating data using backup and restore commands with an Object Storage bucket
Warning
You need ClickHouse® version 22.10 or later to work with backup and restore commands in a third-party cluster.
You can use backup and restore commands with an Object Storage bucket to migrate individual tables or an entire database from a third-party ClickHouse® cluster. To do this:
-
Create a Managed Service for ClickHouse® target cluster.
-
Create a service account with the
storage.editorrole. -
Create a static key for the service account.
Save both the key and its ID, as you will need them in the next steps.
-
Create an Object Storage bucket.
-
If encryption is enabled for the bucket, assign to the service account the kms.keys.encrypterDecrypter role for the encryption key linked to the bucket.
-
Connect to the source cluster.
-
Run the following command to back up your table to the Object Storage bucket:
BACKUP TABLE <DB_name>.<table_name> TO S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', '<service_account_static_key>');To migrate the entire database, run the command below:
BACKUP DATABASE <DB_name> TO S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', '<service_account_static_key>'); -
Connect to the target Managed Service for ClickHouse® cluster.
-
Run the following command to restore your table from a backup:
RESTORE TABLE <DB_name>.<table_name> FROM S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', 'service_account_static_key>');To restore an entire database, use this command:
RESTORE DATABASE <DB_name> FROM S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', 'service_account_static_key>'); -
Verify that the restore operation was successful:
-
If you restored a table, run this command:
SELECT * FROM <DB_name>.<table_name>; -
If you restored the entire database, run this command:
SHOW DATABASES;
-
For more details on using backup and restore commands with an S3 storage, see this ClickHouse® article