Copying data from Managed Service for OpenSearch to Managed Service for Greenplum® using Yandex Data Transfer
With Data Transfer, you can transfer data from a Managed Service for OpenSearch source cluster to a Managed Service for Greenplum® target cluster.
To transfer data:
- Get your cloud ready.
- Set up your infrastructure.
- Prepare the test data.
- Prepare and activate your transfer.
- Test the transfer.
If you no longer need the resources you created, delete them.
Get your cloud ready
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one.
If you have an active billing account, you can go to the cloud page.
Learn more about clouds and folders.
Required paid resources
The infrastructure support costs include:
- Fee for Managed Service for OpenSearch cluster computing resources and storage volume (see Managed Service for OpenSearch pricing).
- Fee for Managed Service for Greenplum® cluster computing resources, storage volume, and backups (see Managed Service for Greenplum® pricing).
Set up your infrastructure
- If you do not have Terraform yet, install it.
- Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
- Configure and initialize a provider. There is no need to create a provider configuration file manually: you can download it.
- Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
- Download the opensearch-to-greenplum.tf configuration file to the same working directory. This file describes:
  - Network.
  - Subnet.
  - Security group and rules required to connect to the Managed Service for OpenSearch and Managed Service for Greenplum® clusters.
  - Managed Service for OpenSearch source cluster with the admin user.
  - Managed Service for Greenplum® target cluster.
  - Transfer.
- In the opensearch-to-greenplum.tf file, specify the following settings:
  - mos_cluster_name: Managed Service for OpenSearch cluster name.
  - source_admin_password: admin user password in Managed Service for OpenSearch cluster.
  - mgp_cluster_name: Managed Service for Greenplum® cluster name.
  - mgp_username: Username in Managed Service for Greenplum® cluster.
  - mgp_user_password: User password in Managed Service for Greenplum® cluster.
  - transfer_name: Data Transfer transfer name.
  - profile_name: Your YC CLI profile name.

    If you do not have the Yandex Cloud CLI yet, install and initialize it.
- Make sure the Terraform configuration files are correct using this command:

      terraform validate

  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run this command to view the planned changes:

        terraform plan

    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:

          terraform apply

    - Confirm updating the resources.
    - Wait for the operation to complete.

  All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
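The Terraform steps above can be chained in a small shell script. This is only a sketch under stated assumptions: the yc CLI is installed and initialized, and the provider picks up credentials from the YC_TOKEN, YC_CLOUD_ID, and YC_FOLDER_ID environment variables. The function names export_credentials and provision are hypothetical and not part of the tutorial files.

```shell
# Export credentials for the Yandex Cloud Terraform provider.
# Assumption: the current yc CLI profile points at the cloud and
# folder you want to use. Stops at the first failing yc call.
export_credentials() {
  YC_TOKEN="$(yc iam create-token)" &&
    YC_CLOUD_ID="$(yc config get cloud-id)" &&
    YC_FOLDER_ID="$(yc config get folder-id)" &&
    export YC_TOKEN YC_CLOUD_ID YC_FOLDER_ID
}

# Run the Terraform steps from this section in order.
# Each step runs only if the previous one succeeded.
provision() {
  terraform init &&
    terraform validate &&
    terraform plan &&
    terraform apply   # still asks for confirmation before applying
}

# Run from the directory that contains opensearch-to-greenplum.tf:
# export_credentials && provision
```

Chaining with && keeps the manual review points intact: a failed validation stops the run before the plan, and terraform apply still waits for your confirmation.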
Prepare the test data
- In the source cluster, create a test index named people and set its schema:

      curl --cacert ~/.opensearch/root.crt \
           --user <source_cluster_username>:<user_password_in_source_cluster> \
           --header 'Content-Type: application/json' \
           --request PUT 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people' && \
      curl --cacert ~/.opensearch/root.crt \
           --user <source_cluster_username>:<user_password_in_source_cluster> \
           --header 'Content-Type: application/json' \
           --request PUT 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_mapping?pretty' -d'
      {
        "properties": {
          "name": {"type": "text"},
          "age": {"type": "integer"}
        }
      }
      '
- Populate the test index with data:

      curl --cacert ~/.opensearch/root.crt \
           --user <source_cluster_username>:<user_password_in_source_cluster> \
           --header 'Content-Type: application/json' \
           --request POST 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_doc/?pretty' -d'
      {
        "name": "Alice",
        "age": "30"
      }
      ' && \
      curl --cacert ~/.opensearch/root.crt \
           --user <source_cluster_username>:<user_password_in_source_cluster> \
           --header 'Content-Type: application/json' \
           --request POST 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_doc/?pretty' -d'
      {
        "name": "Robert",
        "age": "32"
      }
      '
- Make sure the data is saved in the test index:

      curl --cacert ~/.opensearch/root.crt \
           --user <source_cluster_username>:<user_password_in_source_cluster> \
           --header 'Content-Type: application/json' \
           --request GET 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/people/_search?pretty'
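If you want more than a couple of test documents, the same index could be populated in one request through the OpenSearch _bulk API. The sketch below assumes the same certificate path as the commands above. OS_HOST, OS_USER, and OS_PASSWORD are hypothetical variables for the DATA host address and the source cluster credentials, and bulk_body and load_people are illustrative names.

```shell
# Build an NDJSON body for the _bulk API: one action line plus one
# document line per person. Arguments come in name/age pairs.
# Limitation: names are not JSON-escaped, so they must not contain
# double quotes or backslashes.
bulk_body() {
  while [ "$#" -ge 2 ]; do
    printf '{"index":{}}\n{"name":"%s","age":%s}\n' "$1" "$2"
    shift 2
  done
}

# Send the generated body to the people index.
# --data-binary keeps the newlines that the _bulk API requires.
load_people() {
  bulk_body "$@" |
    curl --cacert ~/.opensearch/root.crt \
         --user "${OS_USER}:${OS_PASSWORD}" \
         --header 'Content-Type: application/x-ndjson' \
         --request POST "https://${OS_HOST}:9200/people/_bulk?pretty" \
         --data-binary @-
}

# Example: load_people Alice 30 Robert 32
```

Unlike the single-document requests above, this sketch sends age as a JSON number, which matches the integer type declared in the index schema.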
Prepare and activate your transfer
- Create a source endpoint for the Managed Service for OpenSearch cluster you created earlier with the following settings:
  - Database type: OpenSearch.
  - Connection settings:
    - Connection type: Managed Service for OpenSearch cluster.
    - Managed Service for OpenSearch cluster: Select a Managed Service for OpenSearch cluster from the list.
    - User: admin.
    - Password: admin user password.
- Create a target endpoint for the Managed Service for Greenplum® cluster you created earlier with the following settings:
  - Database type: Greenplum.
  - Endpoint parameters:
    - Connection type: Select Managed Service for Greenplum cluster.
    - Managed Service for Greenplum cluster: Select a Managed Service for Greenplum® cluster from the list.
    - Database: postgres.
    - User: Enter the name of the Managed Service for Greenplum® cluster user.
    - Password: Enter the password of the Managed Service for Greenplum® cluster user.
- Create a transfer:

  Manually

  - Create a transfer of the Snapshot type that will use the created endpoints.
  - Activate the transfer.

  Terraform

  - In the opensearch-to-greenplum.tf file, specify the following settings:
    - source_endpoint_id: Source endpoint ID.
    - target_endpoint_id: Target endpoint ID.
    - transfer_enabled: 1 to create a transfer.
  - Make sure the Terraform configuration files are correct using this command:

        terraform validate

    If there are any errors in the configuration files, Terraform will point them out.
  - Create the required infrastructure:
    - Run this command to view the planned changes:

          terraform plan

      If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
    - If everything looks correct, apply the changes:
      - Run this command:

            terraform apply

      - Confirm updating the resources.
      - Wait for the operation to complete.

    Once created, your transfer will be activated automatically.
Test the transfer
- Wait for the transfer status to change to Completed.
- Make sure the data from the source Managed Service for OpenSearch cluster has been migrated to the Managed Service for Greenplum® cluster:
  - Get an SSL certificate to connect to the Managed Service for Greenplum® cluster.
  - Install the dependencies:

        sudo apt update && sudo apt install --yes postgresql-client

  - Connect to the database in the Managed Service for Greenplum® cluster.
  - Make sure the database contains the people table with test data:

        SELECT * FROM people;
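The final check can also be run non-interactively with psql. This is a sketch, assuming the SSL certificate was saved to ~/.postgresql/root.crt, which is where libpq looks for a root certificate by default. GP_HOST, GP_PORT, and GP_USER are hypothetical variables: take their values from the connection instructions for your cluster. The function names are illustrative.

```shell
# Build a libpq connection string for the target cluster.
# Arguments: host, port, user (all placeholders you supply).
# sslmode=verify-full makes psql verify the server certificate.
gp_conninfo() {
  printf 'host=%s port=%s dbname=postgres user=%s sslmode=verify-full' \
    "$1" "$2" "$3"
}

# Print the number of rows in the people table, without headers.
# psql prompts for the password unless PGPASSWORD or ~/.pgpass is set.
count_people() {
  psql "$(gp_conninfo "$GP_HOST" "$GP_PORT" "$GP_USER")" \
       --tuples-only --no-align \
       --command 'SELECT count(*) FROM people;'
}

# Example, after setting GP_HOST, GP_PORT and GP_USER:
# count_people
```

If only the two test documents from this tutorial were loaded into the source index, the count is expected to be 2.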
Delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- In the terminal window, go to the directory containing the infrastructure plan.

  Warning

  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
- Delete resources:
  - Run this command:

        terraform destroy

  - Confirm deleting the resources and wait for the operation to complete.

    All the resources described in the Terraform manifests will be deleted.