Migrating a database from a third-party Valkey™ cluster to Yandex Managed Service for Valkey™
For data migration, Valkey™ uses a logical dump: a file containing a sequence of commands that restore the state of the databases in the cluster. There are several ways to create a dump. The following example uses redis-dump-go.
Note
A binary RDB dump cannot be used for migration, because Yandex Managed Service for Valkey™ does not permit accessing file systems on cluster hosts.
To migrate Valkey™ databases from the source cluster to the target cluster:
-
Connect to the source cluster and create a logical dump.
-
(Optional) Upload the dump to the intermediate virtual machine.
You must transfer data to Yandex Compute Cloud using a virtual machine if:
- Your Yandex Managed Service for Valkey™ cluster is not accessible from the internet.
- Your hardware or connection to the cluster in Yandex Cloud is unreliable.
-
Restore the dump on the target cluster.
-
Make sure that the dump is restored completely.
If you no longer need the resources you created, delete them.
Getting started
Prepare the infrastructure
-
Create a Yandex Managed Service for Valkey™ cluster with any suitable configuration. To connect to a cluster from a user's local machine rather than a Yandex Cloud cloud network, enable TLS support and public host access when creating your cluster.
-
(Optional) Create an intermediate Linux virtual machine in Yandex Compute Cloud on the same network as the Yandex Managed Service for Valkey™ cluster using the following configuration:
-
Under Boot disk image, select the Ubuntu 20.04 image.
-
Under Network settings:
- Public IP: Auto.
- Internal IPv4 address: Auto.
- Security groups: Select the same security group as for the Yandex Managed Service for Valkey™ cluster.
-
-
If you use Virtual Private Cloud security groups, configure them.
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually: you can download it.
-
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Download the configuration file for the appropriate cluster type to the same working directory:
- redis-cluster-non-sharded.tf: For a non-sharded cluster.
- redis-cluster-sharded.tf: For a sharded cluster.
Each file describes:
- Network.
- Subnet.
- Default security group and rules required to connect to the cluster and the virtual machine.
- Yandex Managed Service for Valkey™ cluster with public internet access.
- (Optional) Virtual machine with public internet access.
-
Specify the following in the configuration file:
-
Password to access the Yandex Managed Service for Valkey™ cluster.
-
(Optional) VM parameters:
- Public virtual machine image ID, e.g., for Ubuntu 20.04 LTS.
- Login and absolute path to the public SSH key for accessing the virtual machine. By default, the specified username is ignored in the Ubuntu 20.04 LTS image. A user with the ubuntu username is created instead. Use it to connect to the instance.
-
-
Check that the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
-
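The validate, plan, and apply steps above can be chained into a single helper. The following is a minimal sketch, not part of the original instructions: the function name `deploy_infra`, the `TERRAFORM_BIN` override variable, and the `tfplan` file name are all assumptions made for illustration. It saves the plan to a file so that the apply step executes exactly the plan that was displayed.

```shell
# Hypothetical helper: run validate -> plan -> apply as one sequence.
# TERRAFORM_BIN is an assumed override hook; by default the real CLI is used.
TERRAFORM_BIN="${TERRAFORM_BIN:-terraform}"

deploy_infra() {
  # Stop at the first failing step so nothing is applied from a broken config.
  "$TERRAFORM_BIN" validate || return 1
  # Save the plan to a file (the name "tfplan" is arbitrary).
  "$TERRAFORM_BIN" plan -out=tfplan || return 1
  # Apply the saved plan.
  "$TERRAFORM_BIN" apply tfplan
}

# Usage (run from the working directory with the configuration files):
#   deploy_infra
```

Note that applying a saved plan file does not ask for interactive confirmation, unlike a plain `terraform apply`. If you want to review the plan before anything changes, run the steps one at a time as described above.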
Install additional software
-
(Optional) Install utilities on the local machine for downloading and uploading files over SSH, such as:
-
Make sure that GNU Screen is installed on the source cluster.
It might take a long time to create and restore a dump. To keep these processes alive when your SSH session times out, start them using this utility. If your SSH connection breaks while creating or restoring the dump, reconnect and restore the session state using the command:
screen -R
Connect to the source cluster and create a logical dump
-
Connect to the source cluster's master host via SSH.
-
Download the archive with redis-dump-go from the project page. The examples below use version 0.5.1.
wget https://github.com/yannh/redis-dump-go/releases/download/v0.5.1/redis-dump-go_0.5.1_linux_amd64.tar.gz
-
Unpack the archive to the current directory:
tar xf redis-dump-go_0.5.1_linux_amd64.tar.gz
-
Get familiar with the utility launch parameters:
./redis-dump-go -h
-
If connecting to the Valkey™ cluster requires a password, enter it in the REDISDUMPGO_AUTH environment variable:
export REDISDUMPGO_AUTH="<Valkey™_password>"
-
Start an interactive screen session:
screen
-
Launch the creation of a logical dump:
./redis-dump-go \
  -host <master_host_IP_address_or_FQDN_in_Valkey™_cluster> \
  -port <Valkey™_port> > <dump_file>
Tip
As the dump is created, the number of processed keys is shown on the screen. Remember or write down the last output value: you will need it to check whether the dump has been restored completely in the target cluster.
-
When the dump has been created, download it to your computer.
-
Complete the interactive screen session:
exit
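To avoid having to remember the final key count from the screen, the progress output can be saved to a file. The sketch below assumes that redis-dump-go writes its progress messages to standard error, which is consistent with the counter appearing on screen while standard output is redirected to the dump file. The function name `dump_with_log` and the `DUMP_BIN` variable are hypothetical.

```shell
# Hypothetical wrapper around redis-dump-go: dump to a file, progress to a log.
# DUMP_BIN is an assumed override hook pointing at the unpacked binary.
DUMP_BIN="${DUMP_BIN:-./redis-dump-go}"

# Usage: dump_with_log <host> <port> <dump_file> <log_file>
dump_with_log() {
  status=0
  # stdout carries the dump itself; stderr is assumed to carry progress messages.
  "$DUMP_BIN" -host "$1" -port "$2" > "$3" 2> "$4" || status=$?
  # Print the last progress line, which should hold the final key count.
  tail -n 1 "$4"
  return "$status"
}

# Example (placeholders as in the command above):
#   dump_with_log <master_host_FQDN> <Valkey™_port> dump.resp dump.log
```

With this approach the progress is not shown live on the screen. To watch it while the dump runs, open a second terminal and run `tail -f` on the log file.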
(Optional) Upload the dump to the intermediate virtual machine
-
Upload the dump from your computer to the intermediate virtual machine using any convenient method.
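After each copy it is worth checking that the dump arrived intact, since a truncated file would restore only part of the data. The sketch below is one possible approach, assuming the `sha256sum` utility from GNU coreutils is available on each machine. The helper names `file_sha256` and `digest_matches` are hypothetical, and the `scp` lines in the comments use placeholder user names and addresses.

```shell
# Hypothetical helper: print the SHA-256 digest of a file.
file_sha256() {
  [ -r "$1" ] || return 1
  sha256sum < "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: succeed only if the file's digest equals the expected one.
# Usage: digest_matches <expected_digest> <file>
digest_matches() {
  actual=$(file_sha256 "$2") || return 1
  [ -n "$actual" ] && [ "$actual" = "$1" ]
}

# Example workflow (all names in angle brackets are placeholders):
#   scp <dump_file> <user>@<VM_public_IP>:~/     # copy the dump to the VM
#   file_sha256 <dump_file>                      # run locally, note the digest
#   digest_matches <digest> <dump_file>          # run on the VM after the copy
```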
Restore the dump on the target cluster
-
Connect to the cluster and run an interactive screen session:
screen
-
Start the dump recovery process:
The commands depend on whether you connect without TLS or via TLS.
Connecting without TLS
Before connecting, install the dependencies:
sudo apt update && sudo apt install -y redis-tools
Connecting via Sentinel
host=$(redis-cli \
  -h <FQDN_of_any_Valkey™_host> \
  -p 26379 \
  sentinel \
  get-master-addr-by-name \
  no-shards-no-tls | head -n 1)
redis-cli \
  -h ${host} \
  -p 6379 \
  -a <target_cluster_password> \
  --pipe < <dump_file>
Connecting directly to the master host
redis-cli \
  -h <master_host_FQDN> \
  -p 6379 \
  -a <target_cluster_password> \
  --pipe < <dump_file>
When connecting to a non-sharded cluster, instead of the master host's FQDN, you can use special FQDNs.
Connecting to a sharded cluster
-
Create a script containing the dump-loading commands:
load-dump.sh
shards=('<FQDN_of_master_host_in_shard_1>' \
        ...
        '<FQDN_of_master_host_in_shard_N>')
for shard in "${shards[@]}" ; do
  redis-cli -h "${shard}" \
    -p 6379 \
    -a "<target_cluster_password>" \
    --pipe < <dump_file>
done
-
Run the script:
bash ./load-dump.sh
As you run the script, you will see messages about data insertion errors. This is normal behavior for the redis-cli command, because in a sharded cluster, each shard stores only a certain part of the data. For more information, see Sharding in Yandex Managed Service for Valkey™.
Connecting via TLS
Before connecting, install the dependencies:
Build the redis-tools utility with TLS support in one of two ways:
-
From a repository
-
Connect a repository:
sudo apt-add-repository ppa:redislabs/redis
Packages in this repository have already been built with the BUILD_TLS=yes flag.
-
Install the utility:
sudo apt update && sudo apt install -y redis-tools
-
-
Manually
Go to the directory you want to download the distribution to. Download the stable version of the utility, then build and install it:
wget https://download.redis.io/redis-stable.tar.gz && \
tar -xzvf redis-stable.tar.gz && \
cd redis-stable && \
make BUILD_TLS=yes && \
sudo make install && \
sudo cp ./src/redis-cli /usr/bin/
Connecting via Sentinel
host=$(redis-cli \
  -h <FQDN_of_any_Valkey™_host> \
  -p 26379 \
  sentinel \
  get-master-addr-by-name \
  no-shards-tls | head -n 1)
redis-cli \
  -h ${host} \
  -p 6380 \
  -a <target_cluster_password> \
  --tls \
  --cacert ~/.redis/YandexInternalRootCA.crt \
  --pipe < <dump_file>
Connecting directly to the master host
redis-cli \
  -h c-<cluster_ID>.rw.mdb.yandexcloud.net \
  -p 6380 \
  -a <target_cluster_password> \
  --tls \
  --cacert ~/.redis/YandexInternalRootCA.crt \
  --pipe < <dump_file>
When connecting to a non-sharded cluster, instead of the master host's FQDN, you can use special FQDNs.
Connecting to a sharded cluster
-
Create a script containing the dump-loading commands:
load-dump.sh
shards=('<FQDN_of_master_host_in_shard_1>' \
        ...
        '<FQDN_of_master_host_in_shard_N>')
for shard in "${shards[@]}" ; do
  redis-cli -h "${shard}" \
    -p 6380 \
    -a "<target_cluster_password>" \
    --tls \
    --cacert ~/.redis/YandexInternalRootCA.crt \
    --pipe < <dump_file>
done
-
Run the script:
bash ./load-dump.sh
As you run the script, you will see messages about data insertion errors. This is normal behavior for the redis-cli command, because in a sharded cluster, each shard stores only a certain part of the data. For more information, see Sharding in Yandex Managed Service for Valkey™.
-
-
Complete the interactive screen session:
exit
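A quick first check of each restore run is the summary that `redis-cli --pipe` prints when it finishes. The sketch below assumes the summary line has the form `errors: <N>, replies: <M>`, which is what redis-cli typically prints in pipe mode. The function name `pipe_summary` is hypothetical.

```shell
# Hypothetical helper: read `redis-cli --pipe` output on stdin and print
# "<errors> <replies>" taken from its summary line (assumed format:
# "errors: N, replies: M"). Prints nothing if no such line is found.
pipe_summary() {
  sed -n 's/^errors: \([0-9][0-9]*\), replies: \([0-9][0-9]*\)$/\1 \2/p' | tail -n 1
}

# Example (placeholders as in the commands above):
#   redis-cli -h <master_host_FQDN> -p 6379 -a <target_cluster_password> \
#     --pipe < <dump_file> | pipe_summary
```

For a non-sharded cluster, a non-zero error count suggests that part of the dump was not loaded. For a sharded cluster, errors are expected on every shard, as noted above, so the key count check in the next section is the more reliable indicator.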
Make sure that the dump is restored completely
- In the management console, go to the folder containing the cluster you restored the dump to.
- In the list of services, select Yandex Managed Service for Valkey™.
- Click the cluster name and open the Monitoring tab.
Pay attention to the DB Keys chart showing the number of keys stored in the cluster. If the cluster is sharded, the chart will show the number of keys in each shard. In this case, the number of keys in the cluster is equal to the total number of keys in the shards.
The total number of keys in the cluster must be equal to the number of keys processed by redis-dump-go when creating the dump.
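For a sharded cluster, the comparison means adding up the per-shard values first. The arithmetic can be sketched as follows; the helper names `sum_counts` and `counts_match` are hypothetical. The per-shard numbers can be read from the DB Keys chart or, as an assumption not covered by the steps above, obtained with the `DBSIZE` command on each shard's master host.

```shell
# Hypothetical helper: sum key counts given one number per line on stdin.
sum_counts() {
  total=0
  while read -r n; do
    # Skip blank lines so empty input yields 0.
    [ -n "$n" ] || continue
    total=$((total + n))
  done
  echo "$total"
}

# Hypothetical helper: succeed if the per-shard counts add up to the expected total.
# Usage: counts_match <expected_total> <count_1> [<count_2> ...]
counts_match() {
  expected=$1
  shift
  actual=$(printf '%s\n' "$@" | sum_counts)
  [ "$actual" = "$expected" ]
}

# Example: redis-dump-go reported 12 keys, and three shards hold 3, 4, and 5 keys.
#   counts_match 12 3 4 5 && echo "key counts match"
```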
Delete the resources you created
Delete the resources you no longer need to avoid paying for them:
- Delete the Yandex Managed Service for Valkey™ cluster.
- If you created an intermediate virtual machine, delete it.
- If you reserved public static IP addresses, release and delete them.
-
In the terminal window, go to the directory containing the infrastructure plan.
Warning
Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
-
Delete resources:
-
Run this command:
terraform destroy
-
Confirm deleting the resources and wait for the operation to complete.
All the resources described in the Terraform manifests will be deleted.
-