Migrating data to Managed Service for ClickHouse® using ClickHouse®
You can migrate data from your ClickHouse® cluster to a Managed Service for ClickHouse® cluster by using:
- Built-in remote function. This method is suitable for migrating individual tables.
- Built-in backup/restore commands and a Yandex Object Storage bucket. This method is suitable for migrating individual tables or a whole database.
You can also migrate a database from a ClickHouse® cluster to a Managed Service for ClickHouse® cluster with the help of Data Transfer. For more information about how to do this, see this tutorial.
Migrating data using remote
You can use remote to migrate individual tables from a third-party ClickHouse® cluster. This method does not require installing ZooKeeper or additional tools, or upgrading the ClickHouse® version on the source cluster.
Tip
Before migrating data, we recommend stopping ongoing merges in the source cluster with the SYSTEM STOP MERGES and SYSTEM STOP TTL MERGES commands and disabling consumers.
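A minimal sketch of the tip above, to be run on the source cluster. It uses db1.tasks, the example table from the steps below; without a table name, these statements apply to all MergeTree family tables. The SYSTEM START statements are not part of the original procedure and are included on the assumption that you will want to resume merges once the migration is complete.

```sql
-- Pause background merges and TTL merges for the table being migrated.
SYSTEM STOP MERGES db1.tasks;
SYSTEM STOP TTL MERGES db1.tasks;

-- ... migrate the data ...

-- Assumption: resume merges after the migration has finished.
SYSTEM START MERGES db1.tasks;
SYSTEM START TTL MERGES db1.tasks;
```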
To migrate a table from a third-party ClickHouse® cluster to a Managed Service for ClickHouse® cluster:
-
Create a Managed Service for ClickHouse® target cluster.
-
Connect to the cluster you want to migrate the data from.
-
Get the text of the table creation query:
SELECT create_table_query FROM system.tables WHERE database = '<name_of_DB_to_migrate>';
For example, your DB named db1 stores a table named tasks with the list of tasks. The response to your query will look as follows:
CREATE TABLE db1.tasks (`task_id` Int32, `title` String, `start_date` Date, `due_date` Date, `priority` Int8 DEFAULT 3, `description` String) ENGINE = MergeTree PRIMARY KEY tuple(task_id) ORDER BY tuple(task_id) SETTINGS index_granularity = 8192;
-
Connect to the Managed Service for ClickHouse® cluster you want to migrate the data to and create a new table on it from the query text you received earlier.
If the source cluster does not use replication but the target cluster does, change the table engine to one from the Replicated family.
To create an object on all the target cluster hosts, use the ON CLUSTER clause in the CREATE command.
-
Run the following query in the source cluster:
INSERT INTO FUNCTION remoteSecure('<Managed_Service_for_ClickHouse®_cluster_host_FQDN>:9440', '<target_cluster_DB_name>.<target_table_name>', '<username_in_target_cluster>', '<user_password_in_target_cluster>') SELECT * FROM <DB_name>.<table_name>;
To learn how to get the host FQDN, see this guide.
-
In the target cluster, check that the table from the source cluster has appeared in the DB:
SHOW TABLES FROM <DB_name>;
-
Check that the table contains data from the source cluster table:
SELECT * FROM <DB_name>.<table_name>;
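For the db1.tasks example, the table creation step could look like the sketch below when the target cluster uses replication. The ZooKeeper path, the {shard} and {replica} macros, and the '{cluster}' value in ON CLUSTER are assumptions and are not taken from this guide; substitute the values your target cluster actually uses. The sketch also assumes that the db1 database already exists on the target cluster.

```sql
-- Sketch: same columns as the source table, created on all hosts
-- with an engine from the Replicated family.
CREATE TABLE db1.tasks ON CLUSTER '{cluster}'
(
    `task_id` Int32,
    `title` String,
    `start_date` Date,
    `due_date` Date,
    `priority` Int8 DEFAULT 3,
    `description` String
)
-- Assumed ZooKeeper path and replica name; adjust to your cluster's macros.
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/db1.tasks', '{replica}')
PRIMARY KEY tuple(task_id)
ORDER BY tuple(task_id)
SETTINGS index_granularity = 8192;
```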
For more information about using the remote function, see the ClickHouse® documentation.
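The same function can also be used for reading. Since SELECT * returns every row, comparing row counts may be a more practical check for large tables. The sketch below is run on the source cluster and reads the target table through remoteSecure with the same placeholders as in the INSERT query above; it is an illustration, not part of the original procedure.

```sql
-- Row count in the source table.
SELECT count() FROM <DB_name>.<table_name>;

-- Row count in the target table, read from the source cluster via remoteSecure.
SELECT count()
FROM remoteSecure(
    '<Managed_Service_for_ClickHouse®_cluster_host_FQDN>:9440',
    '<target_cluster_DB_name>.<target_table_name>',
    '<username_in_target_cluster>',
    '<user_password_in_target_cluster>'
);
```

If nothing was written to the source table during the migration, the two counts should match.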
Migrating data using the backup/restore commands and an Object Storage bucket
Warning
You need ClickHouse® version 22.10 or later to work with the backup/restore commands in a third-party cluster.
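To check whether the source cluster meets this requirement, you can query its version with the built-in version() function and compare the result with 22.10:

```sql
-- Returns the server version as a string.
SELECT version();
```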
You can use the backup/restore commands and an Object Storage bucket to migrate individual tables or the whole database from a third-party ClickHouse® cluster. To do this:
-
Create a Managed Service for ClickHouse® target cluster.
-
Create a service account with the storage.editor role.
-
Create a static key for the service account.
Save the key ID and the key itself: you will need them for the next steps.
-
Create an Object Storage bucket.
-
Connect to the cluster you want to migrate the data from.
-
Run this command to save a table's backup to an Object Storage bucket:
BACKUP TABLE <DB_name>.<table_name> TO S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', '<service_account_static_key>');
If you want to migrate the whole database, run this command:
BACKUP DATABASE <DB_name> TO S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', '<service_account_static_key>');
-
Connect to the Managed Service for ClickHouse® cluster you want to migrate the data to.
-
Run this command to restore your table from a backup:
RESTORE TABLE <DB_name>.<table_name> FROM S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', '<service_account_static_key>');
If you want to restore a database, use this command:
RESTORE DATABASE <DB_name> FROM S3('<Object_Storage_bucket_endpoint>', '<service_account_static_key_ID>', '<service_account_static_key>');
-
Make sure the data was restored from the backup successfully:
-
If it was a table you restored, run this command:
SELECT * FROM <DB_name>.<table_name>;
-
If it was a database you restored, run this command:
SHOW DATABASES;
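As an illustration, here are the table backup and restore commands filled in for the db1.tasks example. The endpoint format https://storage.yandexcloud.net/<bucket_name>/<path>, the bucket name my-bucket, and the path backups/tasks are assumptions made for this sketch, so check the endpoint of your own bucket. The key placeholders are left as they are.

```sql
-- On the source cluster: back up the table to the bucket.
BACKUP TABLE db1.tasks
TO S3('https://storage.yandexcloud.net/my-bucket/backups/tasks',
      '<service_account_static_key_ID>',
      '<service_account_static_key>');

-- On the target cluster: restore the table from the same location.
RESTORE TABLE db1.tasks
FROM S3('https://storage.yandexcloud.net/my-bucket/backups/tasks',
        '<service_account_static_key_ID>',
        '<service_account_static_key>');

-- On the target cluster: quick check that the restored table has rows.
SELECT count() FROM db1.tasks;
```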
For more information on using the backup/restore commands with S3 storage, see the ClickHouse® documentation.