Migrating to Managed Service for Elasticsearch using snapshots
Warning
Yandex Managed Service for Elasticsearch is unavailable as of April 11, 2024.
You can create an OpenSearch cluster in Yandex Cloud as an alternative to Elasticsearch.
Managed Service for Elasticsearch clusters support snapshots, which lets you migrate data to them from another Elasticsearch cluster. For more information about snapshots, see the Elasticsearch documentation.
To migrate data from the Elasticsearch source cluster to the Managed Service for Elasticsearch target cluster:
- Set up a working environment.
- Create a snapshot in the source cluster.
- Restore the snapshot in the target cluster.
- Complete your migration.
If you no longer need the resources you are using, delete them.
Warning
You cannot use a snapshot if the Elasticsearch version in the source cluster is higher than that in the target cluster. For example, you will not be able to restore a snapshot of an Elasticsearch 7.13 cluster in a Managed Service for Elasticsearch 7.11 cluster.
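The version rule above can be checked before you start. The following is a minimal sketch, not part of the original procedure: the function names `version_le` and `can_restore` are hypothetical, and it relies on `sort -V` from GNU coreutils. It only tests the "source version must not be higher" rule and does not replace the compatibility check in the Elasticsearch documentation.

```shell
# Succeeds (exit 0) when version $1 is lower than or equal to version $2.
# Assumption: `sort -V` (version sort from GNU coreutils) is available.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Reports whether a snapshot taken on the source version may be restored
# on the target version, using only the rule stated in the warning above.
can_restore() {
  local source_version="$1" target_version="$2"
  if version_le "$source_version" "$target_version"; then
    echo "ok: ${source_version} -> ${target_version}"
  else
    echo "incompatible: ${source_version} -> ${target_version}"
    return 1
  fi
}

# Usage (the versions are the ones from the example above):
#   can_restore 7.13 7.11   # reports "incompatible" and returns 1
#   can_restore 7.11 7.13   # reports "ok"
```

You can read the source cluster version from the `version.number` field returned by a GET request to the cluster root URL.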
Set up a working environment
Create the required resources
Manually
- Create an Object Storage bucket with restricted access. This bucket will be used as a snapshot repository.
- Create a service account and assign it the storage.editor role. A service account is required to access the bucket from the source and target clusters.
- Create a static access key for the service account.
  Warning
  Save the key ID and secret key. You will need them in the next steps.
- Create a Managed Service for Elasticsearch target cluster in the desired configuration with the following settings:
  - Public access to hosts.
  - An Elasticsearch version that is the same as or higher than the version in the source cluster.
  Before creating the target cluster, check that the source cluster is compatible with the selected target cluster version.
- Install the repository-s3 plugin in the target cluster.
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually, you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Download the mes-migration.tf
configuration file to the same working directory. The file describes:- Network.
- Subnet.
- Security group and rules required to access the Managed Service for Elasticsearch target cluster.
sa-mes-cluster
service account required to create a Managed Service for Elasticsearch cluster.sa-bucket
service account to work with the Object Storage bucket.- Target cluster with the repository-s3
plugin installed.
- Specify the following under locals in the mes-migration.tf configuration file:
  - Folder ID
  - admin user password
  - Target cluster edition
  - Target cluster version
  - Object Storage bucket name
- Check that the Terraform configuration files are correct using this command:
    terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run the command to view the planned changes:
      terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to be modified and their parameters. This is a test step: no resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run the command:
        terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
  All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
- Create a static access key for the sa-bucket service account.
  Warning
  Save the key ID and secret key. You will need them in the next steps.
Complete the configuration and check access to resources
- Grant the service account access to the bucket:
  - In the Select a user drop-down list, specify the created service account.
  - Select the READ and WRITE permissions for the selected service account.
  - Click Add.
  - Click Save.
- Install the repository-s3 plugin on all source cluster hosts.
- For the repository-s3 plugin to work, restart Elasticsearch and Kibana on all source cluster hosts.
- Make sure you can connect to the Managed Service for Elasticsearch target cluster using the Elasticsearch API and Kibana.
- Make sure the Elasticsearch source cluster can access the internet.
Create a snapshot on the source cluster
- Connect the bucket as a snapshot repository on the source cluster:
  - Add the static access key data to the Elasticsearch keystore. Run the procedure on all hosts of the source cluster.
    Add the following:
    - Key ID:
        $ES_PATH/bin/elasticsearch-keystore add s3.client.default.access_key
    - Secret key:
        $ES_PATH/bin/elasticsearch-keystore add s3.client.default.secret_key
    Note
    The path to Elasticsearch ($ES_PATH) depends on the selected installation method. To find out the path to the installed Elasticsearch, see the installation documentation (for example, for DEB and RPM).
  - Reload the secure settings from the keystore:
      curl -X POST "https://<source_cluster_FQDN>:9200/_nodes/reload_secure_settings"
  - Register the repository:
      curl "https://<source_cluster_FQDN>:9200/_snapshot/<repository_name>" \
        -X PUT \
        -H 'Content-Type: application/json' -d '
        {
          "type": "s3",
          "settings": {
            "bucket": "<bucket_name>",
            "endpoint": "storage.yandexcloud.net"
          }
        }'
    For more information about connecting the repository, see the plugin documentation.
    Alert
    If a bucket is registered in an Elasticsearch cluster as a snapshot repository, do not edit the bucket contents manually, as this will disrupt the Elasticsearch snapshot mechanism.
- Start creating a snapshot on the source cluster. You can create a snapshot of the entire cluster or some of the data. Place the snapshot in the repository created in the previous step. For more information, see the Elasticsearch documentation.
  Example of creating a snapshot named snapshot_1 for the entire cluster:
    curl -X PUT \
      "https://<source_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1?wait_for_completion=true&pretty"
  Creating a snapshot may take a long time. Track the progress of the operation using Elasticsearch tools, such as:
    curl -X GET \
      "https://<source_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1/_status?pretty"
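If you prefer not to re-run the status request by hand, the check can be wrapped in a small polling loop. This is a rough sketch under stated assumptions: the helper names `snapshot_state`, `is_running`, and `wait_for_snapshot` are hypothetical, the state is taken from the first `"state"` field of the response with `grep` and `sed` rather than a JSON parser, and the set of "still running" values (`INIT`, `STARTED`, `IN_PROGRESS`) is an assumption you should check against the Elasticsearch documentation for your version.

```shell
# Reads a snapshot API response on stdin and prints the first "state" value.
# Works with both compact and ?pretty output; not a full JSON parser.
snapshot_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Succeeds while the snapshot is assumed to be still in progress.
is_running() {
  case "$1" in
    INIT|STARTED|IN_PROGRESS) return 0 ;;
    *) return 1 ;;
  esac
}

# Polls the given status URL until the snapshot leaves the running states.
# Succeeds only if the final state is SUCCESS.
wait_for_snapshot() {
  local status_url="$1" interval="${2:-30}" state
  while :; do
    state="$(curl -s "$status_url" | snapshot_state)"
    echo "snapshot state: ${state:-unknown}"
    is_running "$state" || break
    sleep "$interval"
  done
  [ "$state" = "SUCCESS" ]
}

# Usage:
#   wait_for_snapshot \
#     "https://<source_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1/_status" 60
```

The loop also stops if the request fails or returns no state, so a non-zero exit status means "not confirmed as successful" and is worth checking manually.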
Restore a snapshot on the target cluster
- Configure access to the bucket with snapshots for the target cluster. Use the service account you previously created.
- Connect the Object Storage bucket to the target cluster as a snapshot repository:
    curl "https://admin:<admin_user_password>@<host_FQDN>:9200/_snapshot/<repository_name>" \
      -X PUT \
      -H 'Content-Type: application/json' -d '
      {
        "type": "s3",
        "settings": {
          "bucket": "<bucket_name>",
          "endpoint": "storage.yandexcloud.net"
        }
      }'
- Select how to restore an index on the target cluster.
  With the default settings, an attempt to restore an index will fail if an index with the same name is already open in the cluster. Even Managed Service for Elasticsearch clusters without user data have open system indexes (such as .apm-custom-link or .kibana_*), which may interfere with the restore operation. To avoid this, use one of the restore policies:
  - Existing system indexes are not migrated. Only indexes created on the source cluster by the user are involved in the import process.
  - Delete and restore: Existing indexes are closed and deleted, then new empty indexes are created with the same names and populated with data from the snapshot.
  - Rename on restore: Existing indexes are not affected. New indexes are created with other names, and snapshot data is restored to them.
  Warning
  Closing all indexes will make Kibana temporarily unavailable. Once the system indexes are opened, Kibana will be available again.
  For example, the following command closes all indexes in the target cluster:
    curl -X POST \
      "https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_all/_close?pretty"
  Example of restoring the entire snapshot:
    curl -X POST \
      "https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1/_restore"
- Start restoring data from the snapshot on the target cluster. You can restore the entire snapshot or individual indexes. For more information, see the Elasticsearch documentation.
  Example of restoring a snapshot, specifying the user indexes to restore on the target cluster:
    curl -X POST \
      -H 'Content-Type: application/json' \
      'https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1/_restore' \
      -d '
      {
        "indices": "<list_of_indexes>"
      }'
  Where <list_of_indexes> is a comma-separated list of indexes to restore, e.g., my_index*, my_index_2.*. Transferring only the user indexes helps you avoid errors when restoring the snapshot. System indexes are not affected.
  Restoring a snapshot may take a long time. Track the progress of the operation using Elasticsearch tools, such as:
    curl -X GET \
      "https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1/_status?pretty"
- If necessary, after the restore operation is completed, open all closed indexes.
  For example, the following command opens all indexes in the target cluster:
    curl -X POST \
      "https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_all/_open?pretty"
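The commands above cover closing indexes and restoring them under their original names, but not the "Rename on restore" policy. As an illustration only, the sketch below builds a restore request body that uses the `rename_pattern` and `rename_replacement` parameters of the Elasticsearch restore API. The helper name `restore_body` and the `restored_` prefix are hypothetical, and the function assumes index names that need no JSON escaping.

```shell
# Prints a restore request body that restores the given indexes under new
# names formed by adding a prefix, so existing indexes are left untouched.
# Assumption: neither argument contains quotes or backslashes.
restore_body() {
  local indices="$1" prefix="$2"
  # "(.+)" captures the whole original index name; "$1" in the replacement
  # refers to that captured group (it is not a shell variable here).
  printf '{"indices":"%s","rename_pattern":"(.+)","rename_replacement":"%s$1"}\n' \
    "$indices" "$prefix"
}

# Usage:
#   curl -X POST \
#     -H 'Content-Type: application/json' \
#     "https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_snapshot/<repository_name>/snapshot_1/_restore" \
#     -d "$(restore_body 'my_index*' 'restored_')"
```

With this body, an index named my_index_1 in the snapshot would be restored as restored_my_index_1, so no index needs to be closed or deleted first.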
Complete your migration
- Make sure that all required data has been transferred to the Managed Service for Elasticsearch target cluster.
  You can check this, for example, using Kibana.
- If necessary, disable the snapshot repository on the source and target clusters.
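Besides Kibana, one way to check the transfer from the command line is to compare per-index document counts on both clusters. The sketch below is an assumption-laden example rather than an official procedure: the helper name `diff_doc_counts` and the file names are hypothetical, and the listings are expected to come from the `_cat/indices` API with the `h=index,docs.count` columns. Indexes that exist only on the target (for example, system indexes) are ignored.

```shell
# Compares two listings with lines of the form "<index> <docs.count>" and
# prints every source index that is missing on the target or has a
# different document count. Prints nothing when everything matches.
# Assumption: the source listing is not empty.
diff_doc_counts() {
  local source_file="$1" target_file="$2"
  awk '
    # First file: remember the document count of each source index.
    NR == FNR { src[$1] = $2; next }
    # Second file: compare counts for indexes that exist on the source.
    ($1 in src) {
      seen[$1] = 1
      if (src[$1] != $2) print $1 ": source=" src[$1] " target=" $2
    }
    # Finally, report source indexes that never appeared on the target.
    END { for (i in src) if (!(i in seen)) print i ": missing on target" }
  ' "$source_file" "$target_file"
}

# Usage:
#   curl -s "https://<source_cluster_FQDN>:9200/_cat/indices/<list_of_indexes>?h=index,docs.count" > source.txt
#   curl -s "https://admin:<admin_user_password>@<target_cluster_FQDN>:9200/_cat/indices/<list_of_indexes>?h=index,docs.count" > target.txt
#   diff_doc_counts source.txt target.txt
```

If you restored indexes under new names, the names will not match and this comparison needs adjusting. Matching counts are a quick sanity check, not proof that the documents are identical.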
Delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
Manually
- Delete the service account.
- Delete snapshots from the bucket and then delete the entire bucket.
- Delete the Managed Service for Elasticsearch cluster.
Using Terraform
- In the terminal window, go to the directory containing the infrastructure plan.
  Warning
  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
- Delete resources:
  - Run this command:
      terraform destroy
  - Confirm deleting the resources and wait for the operation to complete.
  All the resources described in the Terraform manifests will be deleted.