Migrating data to Yandex StoreDoc
To migrate your data to Yandex StoreDoc, transfer the data, switch the legacy database to read-only mode, and then move the workload to the target cluster in Yandex Cloud.
You can migrate data from a third-party source cluster to a Yandex StoreDoc target cluster using the following two methods:
- Migrating data via Yandex Data Transfer.
  This migration method allows you to:
  - Migrate your database with zero downtime.
  - Migrate from older to newer versions of MongoDB.
  - Eliminate the need for an intermediate VM or for exposing your Yandex StoreDoc target cluster to the internet.
  To use this method, enable public access to the source cluster.
- Migrating data via a database dump.
  A dump is a collection of files allowing you to restore a database to a specific state. To migrate data to a Yandex StoreDoc cluster, create a database dump using mongodump and restore it on the target cluster using mongorestore. To ensure dump integrity, switch the source cluster to read-only mode before creating the dump.
  Use this method only if data migration via Yandex Data Transfer is impossible.
Migrating data using Yandex Data Transfer
To transfer data:
- Create a target cluster.
- Prepare source and target clusters.
- Set up the endpoints and transfer.
- Transfer data.
If you no longer need the resources you created, delete them.
Getting started
Sign up for Yandex Cloud and create a billing account:
- Navigate to the management console and log in to Yandex Cloud or create a new account.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.
If you have an active billing account, you can create or select a folder for your infrastructure on the cloud page.
Learn more about clouds and folders here.
Required paid resources
- Yandex StoreDoc cluster, which includes the use of computing resources allocated to hosts, storage and backup size (see Yandex StoreDoc pricing).
- Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
- Each transfer: use of computing resources and the number of transferred data rows (see Data Transfer pricing).
Create a target cluster
Create a Yandex StoreDoc target cluster with computing capacity and storage size matching the source database’s environment.
The source and target database names must be the same.
Prepare source and target clusters
- Prepare the source cluster.
- Make sure the source cluster’s network settings allow cluster connections from the internet.
- Prepare the target cluster.
Set up the endpoints and transfer
- Create a source endpoint with the following settings:
  - Database type: MongoDB.
  - Endpoint parameters → Connection settings: Custom installation.
    Configure the source cluster connection settings.
  Note: Transferring Time Series collections is not supported, so exclude such collections in the endpoint settings.
- Create a target endpoint with the following settings:
  - Database type: MongoDB.
  - Endpoint parameters → Connection settings: Yandex StoreDoc cluster.
    Specify the target cluster ID.
- Create a transfer of the Snapshot and increment type that will use the new endpoints.
  To make large collections (over 1 GB) copy more quickly, enable parallel copy in the transfer settings. Specify two or more workers. The collection will be split into the specified number of parts that will be copied concurrently.
  For parallel copy to work, the _id fields in all documents within the collection must have the same data type. If a transfer detects a type mismatch, the collection will not be split but transferred in a single thread instead. If needed, remove documents with mismatched data types from the collection prior to transfer.
  Note: If a document with a different data type is added to the collection after the transfer starts, the transfer will migrate it at the replication stage after parallel copying. However, when reactivated, the transfer will not be able to split the collection into parts, since the requirement for the same _id field type in all documents of the collection will not be met.
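To see in advance whether a collection meets the same-type requirement for _id, you can group its documents by the BSON type of _id. The sketch below is one possible way to do this, assuming the legacy mongo shell is installed on a machine that can reach the source cluster. The function names id_type_query and check_id_types are illustrative and not part of any Yandex Cloud or MongoDB tooling.

```shell
# Sketch only: report which BSON types the _id field takes in one collection.
# Assumption: the legacy `mongo` shell is available; function names are hypothetical.

# Print the JavaScript that groups documents by the BSON type of _id.
id_type_query() {
  local db="$1" coll="$2"
  printf 'db.getSiblingDB("%s").getCollection("%s").aggregate([{ $group: { _id: { $type: "$_id" }, count: { $sum: 1 } } }]).forEach(printjson)' "$db" "$coll"
}

# Run the query against the source cluster.
# The password is not passed on the command line, so the shell prompts for it.
check_id_types() {
  local host="$1" port="$2" user="$3" db="$4" coll="$5"
  mongo --host "$host" --port "$port" \
        --username "$user" --authenticationDatabase "$db" \
        --quiet --eval "$(id_type_query "$db" "$coll")"
}

# Usage (placeholders as elsewhere in this guide):
# check_id_types <DBMS_server_address> <port> <username> <DB_name> <collection_name>
```

If the output lists more than one type, the collection would be copied in a single thread unless the mismatched documents are removed first.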
Transfer the data
- Activate the transfer.
- Wait for the transfer status to change to Replicating.
- Switch the source cluster to read-only mode and transfer the workload over to the target cluster.
- On the transfer monitoring page, wait until the Maximum data transfer delay value drops to zero. This means that all changes made in the source cluster after the initial data copy have been transferred to the target cluster.
Delete the resources you created
Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them:
- Deactivate the transfer and wait for its status to change to Stopped.
  For more information about transfer statuses, see Transfer lifecycle.
- Delete the source and target endpoints.
Migration via database dump
To transfer data via database dump:
- Create a target cluster.
- Create a dump of the source database using mongodump.
- If necessary, create a VM in Compute Cloud to restore the database from the dump within the Yandex Cloud infrastructure.
- Restore the data from the dump to the cluster using mongorestore.
If you no longer need the resources you created, delete them.
Getting started
Sign up for Yandex Cloud and create a billing account:
- Navigate to the management console and log in to Yandex Cloud or create a new account.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.
If you have an active billing account, you can create or select a folder for your infrastructure on the cloud page.
Learn more about clouds and folders here.
Required paid resources
- Yandex StoreDoc cluster, which includes the use of computing resources allocated to hosts, storage and backup size (see Yandex StoreDoc pricing).
- Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
- VM instance: use of computing resources, storage, public IP address, and OS (see Compute Cloud pricing).
Create a target cluster
Create a Yandex StoreDoc target cluster with computing capacity and storage size matching the source database’s environment.
The source and target database names must be the same.
Create a dump
Use mongodump to create a database dump.
- Install mongodump and other MongoDB tools. Example for Ubuntu 20.04 LTS:

      wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add -
      echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
      sudo apt update
      sudo apt install mongodb-org-shell mongodb-org-tools

- Before creating a dump, we recommend switching your database to read-only mode to avoid losing any data that might be written during the dump process.
- Create a database dump:

      mongodump --host <DBMS_server_address> \
                --port <port> \
                --username <username> \
                --password "<password>" \
                --db <DB_name> \
                --out ~/db_dump

  If you can use multiple CPU cores for the dump, specify the -j flag with the number of available cores:

      mongodump --host <DBMS_server_address> \
                --port <port> \
                --username <username> \
                --password "<password>" \
                -j <number_of_cores> \
                --db <DB_name> \
                --out ~/db_dump

- Archive the dump. The -C option makes tar store the db_dump directory under a relative path, and the archive is written to the home directory:

      tar -cvzf ~/db_dump.tar.gz -C ~ db_dump
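Before moving the archive, it can help to confirm what the dump contains and how paths are stored inside the archive. The sketch below relies on the layout mongodump writes to its output directory, which is one subdirectory per database with a .bson file per collection. The function names list_dumped_collections and archive_dump are illustrative helpers, not part of the MongoDB tools.

```shell
# Sketch only: inspect a mongodump output directory and archive it with relative paths.
# Function names are hypothetical helpers.

# List namespaces found in a dump laid out as <dump_dir>/<db>/<collection>.bson.
list_dumped_collections() {
  local dump_dir="$1" f
  for f in "$dump_dir"/*/*.bson; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern if nothing matches
    printf '%s.%s\n' "$(basename "$(dirname "$f")")" "$(basename "$f" .bson)"
  done
}

# Archive the dump so that paths inside the archive start at the dump directory
# name rather than at the full path on the source server.
archive_dump() {
  local dump_dir="$1" archive="$2"
  tar -czf "$archive" -C "$(dirname "$dump_dir")" "$(basename "$dump_dir")"
}

# Usage:
# list_dumped_collections ~/db_dump
# archive_dump ~/db_dump ~/db_dump.tar.gz
# tar -tzf ~/db_dump.tar.gz   # entries should start with db_dump/
```

With relative paths in the archive, extracting it with -C into a directory of your choice places db_dump directly in that directory.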
Optionally, create a VM for dump upload
You will need an intermediate VM in Yandex Compute Cloud in either of the following cases:
- Your Yandex StoreDoc cluster is not reachable from the internet.
- Your hardware or connection to the cluster in Yandex Cloud is not very reliable.
To prepare your virtual machine for dump recovery:
- In the management console, create a new VM from an Ubuntu 20.04 LTS image. The necessary amount of RAM and the number of CPU cores depend on the volume of data transferred and the required transfer speed.
  The minimum configuration (1 core, 2 GB RAM, 10 GB disk space) should be sufficient for migrating a database of up to 1 GB. The larger the database being migrated, the more disk space and RAM are required, with the available disk space at least twice the database size.
  The VM must reside in the same network and availability zone as the Yandex StoreDoc cluster’s master host. The VM must have an external IP address, which will allow you to upload the dump from outside Yandex Cloud.
- Install the MongoDB client and additional database utilities:

      wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add -
      echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
      sudo apt update
      sudo apt install mongodb-org-shell mongodb-org-tools

- Move the database dump from your server to the VM with the help of scp or a similar tool:

      scp ~/db_dump.tar.gz <VM_username>@<VM_public_address>:/tmp/db_dump.tar.gz

- Extract the dump into /tmp on the virtual machine:

      tar -xzf /tmp/db_dump.tar.gz -C /tmp
Now you have a VM with a database dump, ready to be restored to the Yandex StoreDoc cluster.
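The sizing rule above (free disk space of at least twice the database size) can be checked with a few lines of shell. This is a rough sketch: it assumes the size of the uncompressed dump directory is an acceptable stand-in for the database size, which may not hold for every dataset, and the function names are illustrative.

```shell
# Sketch only: check that free space is at least twice the data size.
# Assumption: the dump directory size approximates the database size.

# Succeeds if free_kb >= 2 * data_kb (both in kilobytes).
has_enough_space() {
  local data_kb="$1" free_kb="$2"
  [ "$free_kb" -ge $(( data_kb * 2 )) ]
}

# Size of a file or directory in KB.
dir_size_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Free space in KB on the filesystem that holds the given path.
# In POSIX output format (df -P), the fourth column is the available space.
free_space_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Usage on the source server, before copying the dump:
# data_kb="$(dir_size_kb ~/db_dump)"
# Usage on the VM, with the value obtained above:
# has_enough_space "$data_kb" "$(free_space_kb /tmp)" || echo "Not enough free space in /tmp"
```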
Restore the data
Restore your database from the dump via mongorestore.
- If you are restoring a dump from a VM located in Yandex Cloud:

      mongorestore --host <DBMS_server_address> \
                   --port <port> \
                   --username <username> \
                   --password "<password>" \
                   -j <number_of_streams> \
                   --authenticationDatabase <DB_name> \
                   --nsInclude '*.*' /tmp/db_dump

- If you are restoring a dump from a server outside Yandex Cloud, you must explicitly specify the SSL settings for mongorestore:

      mongorestore --host <DBMS_server_address> \
                   --port <port> \
                   --ssl \
                   --sslCAFile <path_to_certificate_file> \
                   --username <username> \
                   --password "<password>" \
                   -j <number_of_streams> \
                   --authenticationDatabase <DB_name> \
                   --nsInclude '*.*' ~/db_dump

- To transfer only specific collections, use the --nsInclude and --nsExclude flags to specify the namespaces to include in or exclude from the restore.
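As an illustration of namespace filtering, the sketch below restores every collection of one database except those matching an exclusion pattern. The variable names MONGO_HOST, MONGO_PORT, MONGO_USER, AUTH_DB, and RUNNER, the function name restore_selected, and the shop database are all assumptions made for this example. The password is left off the command line so that mongorestore prompts for it.

```shell
# Sketch only: restore selected namespaces from a dump.
# MONGO_HOST, MONGO_PORT, MONGO_USER, AUTH_DB, and RUNNER are hypothetical variables.

restore_selected() {
  local include="$1" exclude="$2" dump_dir="$3"
  # With RUNNER=echo the command is printed instead of executed (dry run).
  ${RUNNER:-} mongorestore \
    --host "$MONGO_HOST" --port "$MONGO_PORT" \
    --username "$MONGO_USER" --authenticationDatabase "$AUTH_DB" \
    --nsInclude "$include" --nsExclude "$exclude" \
    "$dump_dir"
}

# Usage: restore all collections of the hypothetical "shop" database
# except those whose names start with "tmp_".
# Quote the patterns so the shell does not expand the asterisks.
# restore_selected 'shop.*' 'shop.tmp_*' /tmp/db_dump
```

Running the function once with RUNNER=echo shows the exact command before any data is written to the target cluster.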
Delete the resources you created
Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them:
- Delete the Yandex StoreDoc cluster.
- If you created a virtual machine to upload the dump to, delete it.
- If you reserved a public static IP address, release and delete it.