Copying data from Managed Service for OpenSearch to Managed Service for ClickHouse® using Yandex Data Transfer
With Data Transfer, you can transfer data from a Managed Service for OpenSearch source cluster to Managed Service for ClickHouse®.
To transfer data:
- Prepare the test data.
- Prepare and activate the transfer.
- Test the transfer.
If you no longer need the resources you created, delete them.
Getting started
Prepare the data transfer infrastructure:
Manually
- Create a Managed Service for OpenSearch source cluster in any suitable configuration with publicly available hosts.
- In the same availability zone, create a Managed Service for ClickHouse® target cluster in any suitable configuration with publicly available hosts.
  If you are going to connect to the cluster via Yandex WebSQL, enable WebSQL access in the cluster settings.
- Get an SSL certificate to connect to the Managed Service for OpenSearch cluster.
- Make sure that security groups of the Managed Service for OpenSearch and Managed Service for ClickHouse® clusters allow connecting from the internet.
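The certificate step above can be sketched as a small shell helper. This is a minimal sketch, not the tutorial's own command: the CA bundle URL is an assumption taken from general Yandex Cloud documentation, `download_ca_cert` is a hypothetical name, and the target path is chosen to match the `~/.opensearch/root.crt` path used by the `curl` commands in this tutorial.

```shell
# Assumption: CA bundle URL from general Yandex Cloud documentation; verify it before use.
CA_URL="https://storage.yandexcloud.net/cloud-certs/CA.pem"
# Path expected by the curl commands in this tutorial.
CERT_PATH="${HOME}/.opensearch/root.crt"

# Hypothetical helper: downloads the CA certificate and restricts its permissions.
download_ca_cert() {
  mkdir -p "$(dirname "${CERT_PATH}")" &&
    curl --fail --silent --show-error --location \
         --output "${CERT_PATH}" "${CA_URL}" &&
    chmod 0600 "${CERT_PATH}"
}
```

The block only defines the function. Calling `download_ca_cert` once before preparing the test data should be enough.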
Terraform
- If you do not have Terraform yet, install it.
- Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
- Configure and initialize a provider. You do not need to create the provider configuration file manually: you can download it.
- Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
- Download the opensearch-to-clickhouse.tf configuration file to the same working directory.
  This file describes:
  - Network.
  - Subnet.
  - Security group and rules required to connect to the Managed Service for OpenSearch and Managed Service for ClickHouse® clusters.
  - Managed Service for OpenSearch source cluster with the admin user.
  - Managed Service for ClickHouse® target cluster with a user and database.
  - Target endpoint.
  - Transfer.
- In the opensearch-to-clickhouse.tf file, specify the following parameters:
  - source_admin_password: admin user password in the Managed Service for OpenSearch cluster.
  - mos_version: OpenSearch version.
  - mch_db_name: Database name in the Managed Service for ClickHouse® cluster.
  - mch_username: Username in the Managed Service for ClickHouse® cluster.
  - mch_user_password: User password in the Managed Service for ClickHouse® cluster.
  - source_endpoint_id: Source endpoint ID.
  - profile_name: Your YC CLI profile name.
    If you do not have the Yandex Cloud command line interface yet, install and initialize it.
- Make sure the Terraform configuration files are correct using this command:
  terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run the command to view planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
  All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
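The validate, plan, and apply steps above can also be wrapped in two shell functions. This is a sketch under assumptions: `plan_infrastructure`, `apply_infrastructure`, and the `tfplan` file name are hypothetical, and the provider is assumed to be initialized already in the working directory. Unlike a plain `terraform apply`, applying a saved plan file does not prompt for confirmation, so the plan output should be reviewed between the two calls.

```shell
# Hypothetical helpers; "tfplan" is an arbitrary plan file name chosen for this sketch.

# Validates the configuration and saves the planned changes to a file.
# Usage: plan_infrastructure <working_directory>
plan_infrastructure() {
  ( cd "$1" && terraform validate && terraform plan -out=tfplan )
}

# Applies exactly the saved plan; Terraform does not ask for confirmation here.
# Usage: apply_infrastructure <working_directory>
apply_infrastructure() {
  ( cd "$1" && terraform apply tfplan )
}
```

Both functions run in a subshell, so the caller's current directory is left unchanged, and each stops at the first failing command.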
Prepare the test data
- In the source cluster, create a test index named people and set its schema:
  curl --cacert ~/.opensearch/root.crt \
       --user <source_cluster_username>:<user_password_in_source_cluster> \
       --header 'Content-Type: application/json' \
       --request PUT 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people' && \
  curl --cacert ~/.opensearch/root.crt \
       --user <source_cluster_username>:<user_password_in_source_cluster> \
       --header 'Content-Type: application/json' \
       --request PUT 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_mapping?pretty' \
       --data '
       {
           "properties": {
               "name": {"type": "text"},
               "age": {"type": "integer"}
           }
       }
       '
- Populate the test index with data:
  curl --cacert ~/.opensearch/root.crt \
       --user <source_cluster_username>:<user_password_in_source_cluster> \
       --header 'Content-Type: application/json' \
       --request POST 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_doc/?pretty' \
       --data '
       {
           "name": "Alice",
           "age": "30"
       }
       ' && \
  curl --cacert ~/.opensearch/root.crt \
       --user <source_cluster_username>:<user_password_in_source_cluster> \
       --header 'Content-Type: application/json' \
       --request POST 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_doc/?pretty' \
       --data '
       {
           "name": "Robert",
           "age": "32"
       }
       '
- (Optional) Check the data in the test index:
  curl --cacert ~/.opensearch/root.crt \
       --user <source_cluster_username>:<user_password_in_source_cluster> \
       --header 'Content-Type: application/json' \
       --request GET 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_search?pretty'
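Since the steps above repeat the same `curl` options, a small wrapper may make them easier to reuse. This is a sketch, assuming the hypothetical variables `OS_HOST`, `OS_USER`, and `OS_PASSWORD` hold the host address and the source cluster credentials; `os_url` and `os_request` are illustrative names, not part of any tool.

```shell
# Builds the URL for a request path on the OpenSearch host (port 9200, as in the steps above).
# Assumption: OS_HOST is a hypothetical variable holding the address of a host with the DATA role.
os_url() {
  printf 'https://%s:9200%s' "${OS_HOST}" "$1"
}

# Sends a request with the CA certificate and credentials used in this tutorial.
# Assumption: OS_USER and OS_PASSWORD are hypothetical variables with the source cluster credentials.
# Usage: os_request METHOD PATH [extra curl arguments...]
os_request() {
  method="$1"
  path="$2"
  shift 2
  curl --cacert ~/.opensearch/root.crt \
       --user "${OS_USER}:${OS_PASSWORD}" \
       --header 'Content-Type: application/json' \
       --request "${method}" "$(os_url "${path}")" "$@"
}
```

With these definitions, `os_request GET '/people/_search?pretty'` would repeat the optional check, and `os_request POST '/people/_doc/?pretty' --data '{"name": "Alice", "age": "30"}'` would add a document.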
Prepare and activate the transfer
- Create a source endpoint for the Managed Service for OpenSearch cluster you created earlier with the following settings:
  - Database type: OpenSearch.
  - Connection settings:
    - Connection type: Managed Service for OpenSearch cluster.
    - Managed Service for OpenSearch cluster: Select the Managed Service for OpenSearch cluster from the list.
    - User: admin.
    - Password: admin user password.
- Create a target endpoint and a transfer:
  Manually
  - Create a target endpoint with the following settings:
    - Database type: ClickHouse.
    - Endpoint parameters:
      - Connection type: Select Managed cluster.
      - Managed cluster: Select a Managed Service for ClickHouse® cluster from the list.
      - User: Enter a name for the Managed Service for ClickHouse® cluster user.
      - Password: Enter a password for the Managed Service for ClickHouse® cluster user.
      - Database: Enter a name for the Managed Service for ClickHouse® cluster database.
  - Create a transfer of the Snapshot type that will use the created endpoints.
  - Activate the transfer.
  Terraform
  - In the opensearch-to-clickhouse.tf file, specify the following parameter values:
    - source_endpoint_id: ID of the source endpoint.
    - transfer_enabled: 1 for creating a target endpoint and transfer.
  - Make sure the Terraform configuration files are correct using this command:
    terraform validate
    If there are any errors in the configuration files, Terraform will point them out.
  - Create the required infrastructure:
    - Run the command to view planned changes:
      terraform plan
      If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
    - If you are happy with the planned changes, apply them:
      - Run the command:
        terraform apply
      - Confirm the update of resources.
      - Wait for the operation to complete.
    Once created, your transfer will be activated automatically.
Test the transfer
- Wait for the transfer status to change to Completed.
- Make sure the data from the source Managed Service for OpenSearch cluster has been moved to the Managed Service for ClickHouse® database:
  Yandex WebSQL
  - Create a connection to the Managed Service for ClickHouse® cluster database.
  - Make sure the database contains the people table with test data. To do this, run the following query against the database through the connection you created:
    SELECT * FROM people;
  CLI
  - Get an SSL certificate to connect to the Managed Service for ClickHouse® cluster.
  - If you do not have clickhouse-client, install it.
  - Connect to the database in the Managed Service for ClickHouse® cluster.
  - Make sure the database contains the people table with test data:
    SELECT * FROM people;
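The CLI check above could look roughly like the following. This is a sketch under assumptions: `ch_query` is a hypothetical helper; `CH_HOST`, `CH_USER`, and `CH_DB` are hypothetical variables for the host FQDN, username, and database name; port 9440 is assumed to be the cluster's TLS-enabled native port; and the SSL certificate is assumed to be already installed where `clickhouse-client` can find it.

```shell
# Hypothetical helper: runs one query against the target database over TLS.
# Usage: ch_query 'SQL statement'
ch_query() {
  # Refuse to run until the connection variables are set.
  if [ -z "${CH_HOST:-}" ] || [ -z "${CH_USER:-}" ] || [ -z "${CH_DB:-}" ]; then
    echo "Set CH_HOST, CH_USER, and CH_DB first" >&2
    return 2
  fi
  # --ask-password prompts for the password instead of exposing it on the command line.
  clickhouse-client --host "${CH_HOST}" \
                    --secure \
                    --port 9440 \
                    --user "${CH_USER}" \
                    --database "${CH_DB}" \
                    --ask-password \
                    --query "$1"
}
```

For example, `ch_query 'SELECT * FROM people;'` would run the same check as the last step, prompting for the user password.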
Delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- Delete other resources depending on how they were created:
  Manually
  - Delete the Managed Service for ClickHouse® cluster.
    The connection to the Managed Service for ClickHouse® cluster database in Yandex WebSQL will be deleted automatically.
  Terraform
  - In the terminal, go to the working directory with the opensearch-to-clickhouse.tf configuration file.
  - Delete the resources using this command:
    terraform destroy
  - Type yes and press Enter.
    All the resources described in the opensearch-to-clickhouse.tf configuration file will be deleted.
ClickHouse® is a registered trademark of ClickHouse, Inc.