Migrating data from a third-party Elasticsearch cluster using the Reindex API
Warning
Yandex Managed Service for Elasticsearch is unavailable as of April 11, 2024.
You can create an OpenSearch cluster in Yandex Cloud as an alternative to Elasticsearch.
Managed Service for Elasticsearch clusters support reindexing via the Reindex API.
To migrate data from the source Elasticsearch cluster to the target Managed Service for Elasticsearch cluster:
- Configure the target cluster.
- Start reindexing.
If you no longer need the resources you created, delete them.
Getting started
- Create a Managed Service for Elasticsearch target cluster with any suitable configuration. Enable cluster host access via public IPs.
Warning
The Elasticsearch version on the target cluster must be the same as or higher than on the source cluster.
- Make sure you can connect to the Managed Service for Elasticsearch target cluster using the Elasticsearch API and Kibana.
- Make sure the Elasticsearch source cluster can access the internet.
- If the source cluster uses a self-signed certificate to encrypt the connection, add it as an extension to the target cluster.
- Create a user with the monitoring_user and viewer roles in the target cluster.
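The version requirement above can be checked before you start. The sketch below assumes that both clusters answer GET / with the standard Elasticsearch JSON document containing version.number and that sort -V (GNU coreutils) is available. The function names and the placeholders in the usage comments are illustrative and are not part of the service.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names) for comparing cluster versions.

# Read the JSON returned by GET / on stdin and print the value of "number".
extract_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed if version $1 is the same as or higher than version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Usage (replace the placeholders; requires network access to both clusters):
#   source_version=$(curl --silent --user "<user>:<password>" "https://<source_host>:9200/" | extract_version)
#   target_version=$(curl --silent --user "<user>:<password>" "https://<target_host>:9200/" | extract_version)
#   version_ge "$target_version" "$source_version" || echo "Target cluster version is too old"
```

If the check fails, create the target cluster with a newer Elasticsearch version before going further.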
Configure the target cluster
- Create a role with the create_index and write privileges for all indexes (*).
- Create a user and assign this role to them.
Tip
In Managed Service for Elasticsearch clusters, you can use the Reindex API as the admin user with the superuser role; however, it is more secure to create separate users with limited privileges for each job. For more information, see Managing Elasticsearch users.
- Set the following parameters in the target cluster settings:
  - Reindex remote whitelist: Specify the source cluster IP address or FQDN with the port, for example: 192.168.0.1:9200, example.com:9200
  - (Optional) Reindex SSL CA path: Specify the local path to the imported certificate in the format /etc/elasticsearch/extensions/<extension_name>/<certificate_name>.
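One possible way to create the role and user described above is the Elasticsearch Security API (PUT /_security/role/<name> and PUT /_security/user/<name>). The sketch below assumes this API is reachable on the target cluster for your account. The role name reindex_writer, the user name reindex_user, and the helper function names are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names) that build Security API request bodies.

# Role body: create_index and write privileges for all indexes (*).
role_body() {
  printf '%s' '{"indices":[{"names":["*"],"privileges":["create_index","write"]}]}'
}

# User body: $1 is the password, $2 is the role name.
# No JSON escaping is done, so the values must not contain quotes or backslashes.
user_body() {
  printf '{"password":"%s","roles":["%s"]}' "$1" "$2"
}

# Usage (replace the placeholders; requires network access to the target cluster):
#   curl --user "admin:<admin_password>" \
#        --request PUT "https://<target_host>:9200/_security/role/reindex_writer" \
#        --header 'Content-Type: application/json' \
#        --data "$(role_body)"
#   curl --user "admin:<admin_password>" \
#        --request PUT "https://<target_host>:9200/_security/user/reindex_user" \
#        --header 'Content-Type: application/json' \
#        --data "$(user_body '<new_user_password>' reindex_writer)"
```

Keeping the bodies in functions makes it easier to review the granted privileges before anything is sent to the cluster.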
Start reindexing
- Retrieve the list of hosts in the target cluster.
- To start reindexing, run a request against the target cluster's host with the Master node role:
    curl --user <username_in_the_target_cluster>:<user_password_in_the_target_cluster> \
         --request POST "https://<IP_or_FQDN_of_target_cluster_Master_Node_host>:9200/_reindex?pretty" \
         --header 'Content-Type: application/json' \
         --data '{
           "source": {
             "remote": {
               "host": "https://<IP_or_FQDN_of_source_cluster_Master_Node_host>:9200",
               "username": "<username_in_the_source_cluster>",
               "password": "<user_password_in_the_source_cluster>"
             },
             "index": "<source_cluster_index_alias_or_data_stream_name>"
           },
           "dest": {
             "index": "<target_cluster_index_alias_or_data_stream_name>"
           }
         }'
To transfer several indexes, use a for loop:
    for index in <space-separated_list_of_index_alias_or_data_stream_names>; do
      curl --user <username_in_the_target_cluster>:<user_password_in_the_target_cluster> \
           --request POST "https://<target_cluster_Master_Node_IP_or_FQDN>:9200/_reindex?pretty" \
           --header 'Content-Type: application/json' \
           --data '{
             "source": {
               "remote": {
                 "host": "https://<source_cluster_Master_Node_IP_or_FQDN>:9200",
                 "username": "<username_in_the_source_cluster>",
                 "password": "<user_password_in_the_source_cluster>"
               },
               "index": "'"$index"'"
             },
             "dest": {
               "index": "'"$index"'"
             }
           }'
    done
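After reindexing, you may want to confirm that the data arrived. A minimal sketch, assuming the standard GET /<index>/_count endpoint is available on both clusters, compares document counts for one index. The helper names and the placeholders in the usage comments are illustrative. Recently indexed documents may not be counted until the index is refreshed, so a short-lived mismatch right after reindexing is not necessarily an error.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names) for comparing document counts.

# Read the JSON returned by GET /<index>/_count on stdin and print the "count" value.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only if both counts are non-empty and equal.
counts_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (replace the placeholders; requires network access to both clusters):
#   source_count=$(curl --silent --user "<user>:<password>" "https://<source_host>:9200/<index>/_count" | extract_count)
#   target_count=$(curl --silent --user "<user>:<password>" "https://<target_host>:9200/<index>/_count" | extract_count)
#   counts_match "$source_count" "$target_count" || echo "Document counts differ for <index>"
```

Matching counts do not prove that every document is identical, but a mismatch is a quick signal that the reindex response should be inspected for failures.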
Delete the resources you no longer need
Delete the resources you no longer need to avoid paying for them:
- If you reserved public static IPs for cluster access, release and delete them.
- If you used a Yandex Object Storage bucket to import a self-signed certificate, clear and delete it.