Migrating data to Managed Service for MongoDB
To migrate your data to Managed Service for MongoDB, transfer the data, write-lock the old database, and transfer the load to the target cluster in Yandex Cloud.
There are two ways to migrate data from a third-party source cluster to a Managed Service for MongoDB target cluster:
- Transferring data using Yandex Data Transfer.
This migration method allows you to:
- Migrate the database without interrupting user service.
- Migrate from older MongoDB versions to newer versions.
- Avoid creating an intermediate VM or opening internet access to your Managed Service for MongoDB target cluster.
To use this migration method, allow connecting to the source cluster from the internet.
For more information, see What tasks Yandex Data Transfer is used for.
- Migrating a database using a dump.
A dump is a set of files from which you can restore the state of a database. To migrate data to a Managed Service for MongoDB cluster, create a dump of the source database using `mongodump` and restore it in the target cluster using `mongorestore`. To get a complete dump, switch the source cluster to read-only mode before creating it.
Getting started
Create a Managed Service for MongoDB target cluster with the computing capacity and storage size appropriate for the environment where the migrated database is deployed.
The database name in the target cluster must be the same as the source database name.
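Before you start, it can help to note the exact source database name and the MongoDB version. A minimal check, assuming you have mongo shell access to the source cluster (host, port, credentials, and the authentication database below are placeholders; adjust them for your setup):

```bash
# Print the server version and list databases on the source cluster.
mongo --host <DBMS_server_address> --port <port> \
      --username <username> --password "<password>" \
      --authenticationDatabase admin \
      --eval 'print("MongoDB version: " + db.version()); db.adminCommand("listDatabases").databases.forEach(function (d) { print(d.name); })'
```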
Migrating data using Yandex Data Transfer
- Create a source endpoint with the following parameters:
  - Database type: `MongoDB`.
  - Endpoint parameters → Connection settings: `Custom installation`.
    Specify the parameters for connecting to the source cluster.
  Note
  The service does not support transferring Time Series collections, so you should exclude these collections in the endpoint settings.
- Create a target endpoint with the following parameters:
  - Database type: `MongoDB`.
  - Endpoint parameters → Connection settings: `Managed Service for MongoDB cluster`.
    Specify the ID of the target cluster.
- Create a transfer of the Snapshot and increment type that will use the created endpoints.
  To make large collections (over 1 GB) copy faster, enable parallel copy in the transfer settings and specify two or more workers. The collection will be split into the specified number of parts, which are copied concurrently.
  For parallel copy to work, the data type of the `_id` field must be the same for all documents in a collection. If the transfer detects a type mismatch, the collection is not partitioned and is transferred in a single thread instead. If needed, remove documents with mismatched data types from the collection before starting the transfer; you can check for mismatches with the aggregation shown below.
  Note
  If a document with a different `_id` data type is added to a collection after the transfer starts, the transfer will move it at the replication stage, after the parallel copy operation is completed. However, when re-enabled, the transfer will not be able to partition the collection, because the `_id` type requirement will no longer be met for some of the documents in the collection.
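  A possible check, run in the mongo shell against the source database (the collection name and connection parameters are placeholders):

  ```bash
  # Group documents by the BSON type of their _id field.
  mongo --host <DBMS_server_address> --port <port> \
        --username <username> --password "<password>" <DB_name> \
        --eval 'printjson(db.getCollection("<collection_name>").aggregate([{ $group: { _id: { $type: "$_id" }, count: { $sum: 1 } } }]).toArray())'
  ```

  If the result lists more than one BSON type, some documents in the collection use a different `_id` type.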
- Wait for the transfer status to change to Replicating.
- Switch the source cluster to read-only mode and transfer the load to the target cluster.
- On the transfer monitoring page, wait for the Maximum data transfer delay metric to decrease to zero. This means that all changes that occurred in the source cluster after data copying was completed have been transferred to the target cluster.
- Deactivate the transfer and wait for its status to change to Stopped.
  For more information about transfer statuses, see Transfer lifecycle.
- Delete endpoints for both the source and target.
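Optionally, before deleting the endpoints, you can spot-check that the data arrived by comparing per-collection document counts in the source and target clusters. This is a sketch rather than part of the official procedure; hosts, ports, credentials, and TLS options depend on your setup (Managed Service for MongoDB clusters typically require an SSL connection):

```bash
# Print document counts per collection on the source...
mongo "mongodb://<username>:<password>@<source_host>:<port>/<DB_name>" --quiet \
      --eval 'db.getCollectionNames().forEach(function (c) { print(c + ": " + db.getCollection(c).count()); })'

# ...and on the target, then compare the output.
mongo "mongodb://<username>:<password>@<target_host>:<port>/<DB_name>" --quiet \
      --eval 'db.getCollectionNames().forEach(function (c) { print(c + ": " + db.getCollection(c).count()); })'
```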
Migrating a database using a dump
Sequence of actions:
- Create a dump of the database you want to migrate using `mongodump`.
- If necessary, create a VM in Compute Cloud to restore the database from the dump in the Yandex Cloud infrastructure.
- Restore data from the dump in the cluster using `mongorestore`.
Create a dump
You can create a database dump using `mongodump`. For more information about this utility, see the MongoDB documentation.
- Install `mongodump` and other utilities for working with MongoDB. Example for Ubuntu 20.04 LTS:

  ```bash
  wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add -
  echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
  sudo apt update
  sudo apt install mongodb-org-shell mongodb-org-tools
  ```

  Instructions for other platforms, as well as more information about installing the utilities, can be found on the Install MongoDB page.
- Before creating the dump, we recommend switching the DBMS to read-only mode to avoid losing data that might be written while the dump is being created. One possible way to do this is shown below.
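  A sketch using `db.fsyncLock()` in the mongo shell, assuming the source is a standalone instance or a single replica set member and your user has the required privileges:

  ```bash
  # Flush pending writes to disk and lock the instance against new writes.
  mongo --host <DBMS_server_address> --port <port> \
        --username <username> --password "<password>" admin \
        --eval 'db.fsyncLock()'

  # ...create and archive the dump (next steps)...

  # Unlock the instance once the dump is complete.
  mongo --host <DBMS_server_address> --port <port> \
        --username <username> --password "<password>" admin \
        --eval 'db.fsyncUnlock()'
  ```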
- Create a database dump:

  ```bash
  mongodump --host <DBMS_server_address> \
            --port <port> \
            --username <username> \
            --password "<password>" \
            --db <DB_name> \
            --out ~/db_dump
  ```

  If you have multiple processor cores available for creating the dump, specify their number with the `-j` flag:

  ```bash
  mongodump --host <DBMS_server_address> \
            --port <port> \
            --username <username> \
            --password "<password>" \
            -j <number_of_cores> \
            --db <DB_name> \
            --out ~/db_dump
  ```
- Archive the dump:

  ```bash
  tar -cvzf db_dump.tar.gz ~/db_dump
  ```
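Optionally, you can make sure the archive was created correctly by listing its contents (a quick sanity check, not part of the original procedure):

```bash
# The listing should include the .bson and .metadata.json files of your collections.
tar -tzf db_dump.tar.gz | head
```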
(Optional) Create a VM to download a dump
You will need an intermediate VM in Yandex Compute Cloud if:
- Your Managed Service for MongoDB cluster is not accessible from the internet.
- Your hardware or connection to the cluster in Yandex Cloud is not very reliable.
To prepare the virtual machine to restore the dump:
- In the management console, create a new VM from an Ubuntu 20.04 LTS image. The required amount of RAM and processor cores depends on the amount of data to migrate and the required migration speed.
  The minimum configuration (1 core, 2 GB RAM, 10 GB disk space) should be sufficient to migrate a database up to 1 GB in size. The larger the database, the more RAM and disk space (at least twice the size of the database) you need for migration.
  The VM must be in the same network and availability zone as the Managed Service for MongoDB cluster's master host. It must also be assigned a public IP address so that you can upload the dump file from outside Yandex Cloud.
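  If you prefer the CLI to the management console, a suitable VM can also be created with the yc CLI. This is a sketch only: the name, zone, subnet, sizes, and SSH key path are placeholders, and flag syntax may differ slightly between yc CLI versions, so check `yc compute instance create --help` before running it:

  ```bash
  yc compute instance create \
    --name mongo-migration-vm \
    --zone <availability_zone> \
    --cores 2 \
    --memory 4GB \
    --create-boot-disk image-folder-id=standard-images,image-family=ubuntu-2004-lts,size=20 \
    --network-interface subnet-name=<subnet_name>,nat-ip-version=ipv4 \
    --ssh-key ~/.ssh/id_rsa.pub
  ```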
- Install the MongoDB client and additional utilities for working with the DBMS:

  ```bash
  wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add -
  echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
  sudo apt update
  sudo apt install mongodb-org-shell mongodb-org-tools
  ```
- Move the DB dump from your server to the VM. For example, you can use `scp`:

  ```bash
  scp ~/db_dump.tar.gz <VM_user_name>@<VM_public_address>:/tmp/db_dump.tar.gz
  ```
- Unpack the dump on the virtual machine:

  ```bash
  tar -xzf /tmp/db_dump.tar.gz
  ```
As a result, you will get a VM with a database dump that is ready to be restored to the Managed Service for MongoDB cluster.
Recover data
Use the `mongorestore` utility to restore the dump to the Managed Service for MongoDB cluster:
- If you restore the dump from the VM in Yandex Cloud:

  ```bash
  mongorestore --host <DBMS_server_address> \
               --port <port> \
               --username <username> \
               --password "<password>" \
               -j <number_of_streams> \
               --authenticationDatabase <DB_name> \
               --nsInclude '*.*' \
               /tmp/db_dump
  ```
- If you restore from a dump stored on a server outside Yandex Cloud, explicitly set the SSL parameters for `mongorestore`:

  ```bash
  mongorestore --host <DBMS_server_address> \
               --port <port> \
               --ssl \
               --sslCAFile <certificate_file_path> \
               --username <username> \
               --password "<password>" \
               -j <number_of_streams> \
               --authenticationDatabase <DB_name> \
               --nsInclude '*.*' \
               ~/db_dump
  ```
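  The `--sslCAFile` parameter expects the Yandex Cloud CA certificate. If you do not have it yet, the snippet below is a sketch for downloading it; the URL and target path follow common Yandex Cloud documentation, so verify them for your case:

  ```bash
  mkdir -p ~/.mongodb
  wget "https://storage.yandexcloud.net/cloud-certs/CA.pem" \
       --output-document ~/.mongodb/root.crt
  chmod 0644 ~/.mongodb/root.crt
  ```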
- If you want to transfer only specific collections, use the `--nsInclude` and `--nsExclude` flags to specify the namespaces of the collections to include in or exclude from the restore, as in the example below.
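  For example, to restore only two collections from `<DB_name>` (the collection names here are hypothetical), repeat `--nsInclude` for each namespace:

  ```bash
  mongorestore --host <DBMS_server_address> \
               --port <port> \
               --username <username> \
               --password "<password>" \
               --authenticationDatabase <DB_name> \
               --nsInclude '<DB_name>.orders' \
               --nsInclude '<DB_name>.payments' \
               ~/db_dump
  ```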