© 2025 Direct Cursus Technology L.L.C.

Migrating data to Yandex StoreDoc

Written by
Yandex Cloud
Updated at October 30, 2025
  • Required paid resources
  • Getting started
  • Migrating data using Yandex Data Transfer
  • Migrating a database using a dump
    • Create a dump
    • Optionally, create a VM to download a dump
    • Recover data

To migrate your data to Yandex StoreDoc, transfer the data, write-lock the old database, and transfer the load to the target cluster in Yandex Cloud.

There are two ways to migrate data from a third-party source cluster to a Yandex StoreDoc target cluster:

  • Transferring data using Yandex Data Transfer.

    This migration method allows you to:

    • Migrate the database without interrupting user service.
    • Migrate from older MongoDB versions to newer versions.
    • Avoid creating an intermediate VM or opening internet access to your Yandex StoreDoc target cluster.

    To use this migration method, allow connections to the source cluster from the internet.

    For more information, see Problems addressed by Yandex Data Transfer.

  • Migrating a database using a dump.

    A dump is a set of files from which you can restore the state of a database. To migrate data to a Yandex StoreDoc cluster, create a database dump using mongodump and restore it in the target cluster using mongorestore. To get a complete dump, switch the source cluster to read-only before you create it.

Required paid resources

The cost of transferring data with Yandex Data Transfer includes:

  • Yandex StoreDoc target cluster fee: use of computing resources allocated to hosts and disk space (see Yandex StoreDoc pricing).
  • Fee for public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
  • Per-transfer fee: use of computing resources and number of transferred data rows (see Data Transfer pricing).

The cost of transferring data using a database dump includes:

  • Yandex StoreDoc target cluster fee: use of computing resources allocated to hosts and disk space (see Yandex StoreDoc pricing).
  • Fee for public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
  • When creating a VM to download a dump: fee for the use of computing resources, storage, OS (for specific operating systems), and, optionally, public IP address (see Compute Cloud pricing).

Getting started

Create a Yandex StoreDoc target cluster with the computing capacity and storage size appropriate for the environment where the migrated database is deployed.

The source and target database names must be the same.

Migrating data using Yandex Data Transfer

  1. Prepare the source cluster.

  2. Prepare the target cluster.

  3. Create a source endpoint with the following parameters:

    • Database type: MongoDB.

    • Endpoint parameters → Connection settings: Custom installation.

      Configure the source cluster connection settings.

    Note

    Transferring Time Series collections is not supported. Exclude such collections in the endpoint settings.

  4. Create a target endpoint with the following parameters:

    • Database type: MongoDB.

    • Endpoint parameters → Connection settings: Yandex StoreDoc cluster.

      Specify the ID of the target cluster.

  5. Create a Snapshot and increment-type transfer and configure it to use the previously created endpoints.

    To speed up copying of large collections (over 1 GB), enable parallel copy in the transfer settings and specify two or more workers. Each collection will be split into the specified number of parts, which are copied concurrently.

    For parallel copy to work, the _id field data type must be the same for all documents in the same collection. If a transfer discovers a type mismatch, the collection will not be partitioned but transferred in a single thread instead. If needed, remove documents with mismatched data types from the collection before starting a transfer.

    Note

    If a document with a different data type is added to a collection after a transfer starts, the transfer will move it at the replication stage after the parallel copy operation is completed. However, when re-activated, the transfer will not be able to partition a collection because the _id field type requirement will not be met for some of the documents in the collection.

  6. Activate the transfer.

  7. Wait for the transfer status to change to Replicating.

  8. Switch the source cluster to "read-only" mode and transfer the load to the target cluster.

  9. On the transfer monitoring page, wait for the Maximum data transfer delay metric to decrease to zero. This means that all changes that occurred in the source cluster after data copying was completed are transferred to the target cluster.

  10. Deactivate the transfer and wait for its status to change to Stopped.

    For more information about transfer statuses, see Transfer lifecycle.

  11. Delete the stopped transfer.

  12. Delete the source and target endpoints.
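
For the parallel copy requirement in step 5, one way to find out whether a collection mixes _id types before you activate the transfer is to group its documents by the BSON type of _id. The sketch below is not part of Data Transfer. It assumes the legacy mongo shell from the mongodb-org-shell package can reach the source cluster with password authentication; the function name id_types and its argument order are hypothetical. The aggregation scans the whole collection, so it may be slow on large collections.

```shell
# Hypothetical helper (not a Data Transfer feature): print how many documents
# in a collection have each BSON type of _id.
# Arguments: host, port, username, password, database, collection.
id_types() (
    host="$1"; port="$2"; user="$3"; password="$4"; db="$5"; coll="$6"

    # Group documents by the type name of _id ("objectId", "string", ...).
    js="db.getSiblingDB('$db').getCollection('$coll').aggregate([{ \$group: { _id: { \$type: '\$_id' }, count: { \$sum: 1 } } }]).forEach(printjson)"

    mongo --quiet \
          --host "$host" \
          --port "$port" \
          --username "$user" \
          --password "$password" \
          --authenticationDatabase "$db" \
          --eval "$js"
)

# Example call (all values are placeholders to replace):
#   id_types source.example.com 27017 user1 'secret' db1 orders
```

If the output lists more than one type, the collection will be transferred in a single thread unless the documents with mismatched _id types are removed first.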

Migrating a database using a dump

Sequence of actions:

  1. Create a dump of the database you want to migrate using mongodump.
  2. If necessary, create a VM in Compute Cloud to restore the database from a dump in the Yandex Cloud infrastructure.
  3. Restore data from the dump in the cluster using mongorestore.

Create a dump

You can create a database dump using mongodump.

  1. Install mongodump and other utilities for working with MongoDB. Example for Ubuntu 20.04 LTS:

    wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add -
    echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
    sudo apt update
    sudo apt install mongodb-org-shell mongodb-org-tools
    
  2. Before creating a dump, we recommend switching the DBMS to "read-only" mode so that no data written while the dump is being created is lost.

  3. Create a database dump:

    mongodump --host <DBMS_server_address> \
              --port <port> \
              --username <username> \
              --password "<password>" \
              --db <DB_name> \
              --out ~/db_dump
    

    If you have access to multiple processor cores to create a dump, specify the -j flag with the number of cores available to you:

    mongodump --host <DBMS_server_address> \
              --port <port> \
              --username <username> \
              --password "<password>" \
              -j <number_of_cores> \
              --db <DB_name> \
              --out ~/db_dump
    
  4. Archive the dump:

    tar -cvzf ~/db_dump.tar.gz -C ~ db_dump
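
Before moving the archive anywhere, it can be worth confirming that the dump holds the collections you expect and that the archive is readable. The following is a minimal sketch, assuming GNU coreutils (sha256sum) and the default mongodump layout, where each collection is written as <out>/<DB_name>/<collection>.bson. The function names list_dump_collections and check_archive are hypothetical.

```shell
# Hypothetical helper: print "<db>.<collection>" for every collection
# found in a mongodump output directory.
list_dump_collections() (
    dump_dir="$1"
    find "$dump_dir" -type f -name '*.bson' | sort | while IFS= read -r f; do
        db=$(basename "$(dirname "$f")")
        coll=$(basename "$f" .bson)
        printf '%s.%s\n' "$db" "$coll"
    done
)

# Hypothetical helper: check that the archive can be read end to end and
# write its SHA-256 checksum to <archive>.sha256 in the same directory.
check_archive() (
    archive="$1"
    tar -tzf "$archive" > /dev/null || exit 1
    cd "$(dirname "$archive")" || exit 1
    name=$(basename "$archive")
    # The checksum file stores a relative name so it can be verified
    # after both files are copied to another directory or host.
    sha256sum "$name" > "$name.sha256"
)

# Example calls:
#   list_dump_collections ~/db_dump
#   check_archive ~/db_dump.tar.gz
```

If you copy the .sha256 file along with the archive, running sha256sum -c db_dump.tar.gz.sha256 in the directory that contains both files checks that the archive arrived intact.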
    

Optionally, create a VM to download a dump

You will need an intermediate VM in Yandex Compute Cloud if:

  • Your Yandex StoreDoc cluster is not accessible from the internet.
  • Your hardware or connection to the cluster in Yandex Cloud is not very reliable.

To prepare the virtual machine to restore the dump:

  1. In the management console, create a new VM from an Ubuntu 20.04 LTS image. The required amount of RAM and processor cores depends on the amount of data to migrate and the required migration speed.

    The minimum configuration (1 core, 2 GB RAM, 10 GB disk space) should be sufficient to migrate a database up to 1 GB in size. The larger the database, the more disk storage (at least twice the size of the database) and RAM you need for migration.

    The VM instance must be in the same network and availability zone as the Yandex StoreDoc cluster master host. The VM must also be assigned an external IP address so that you can upload the dump file from outside Yandex Cloud.

  2. Install the MongoDB client and additional utilities for working with the DBMS:

    wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add -
    echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
    sudo apt update
    sudo apt install mongodb-org-shell mongodb-org-tools
    
  3. Move the DB dump from your server to the VM. For example, you can use scp:

    scp ~/db_dump.tar.gz <VM_user_name>@<VM_public_address>:/tmp/db_dump.tar.gz
    
  4. Unpack the dump on the virtual machine:

    tar -xzf /tmp/db_dump.tar.gz -C /tmp
    

As a result, you will get a VM with a database dump that is ready to be restored to the Yandex StoreDoc cluster.
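
The sizing guidance in step 1 (disk space of at least twice the database size) can be checked on the VM before a large dump is uploaded. This is a rough sketch, not an official sizing tool: it assumes you know the approximate database size in kilobytes, and the function name has_room_for_db is hypothetical.

```shell
# Hypothetical helper: succeed if the file system that holds target_dir
# has at least twice db_size_kb kilobytes available.
has_room_for_db() (
    db_size_kb="$1"; target_dir="$2"
    need_kb=$(( db_size_kb * 2 ))
    # df -P prints one header line and one data line;
    # column 4 of the data line is the available space in KB.
    free_kb=$(df -Pk "$target_dir" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge "$need_kb" ]
)

# Example: check /tmp for a database of about 1 GB (1048576 KB).
#   has_room_for_db 1048576 /tmp && echo "enough space" || echo "add disk space"
```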

Recover data

Use the mongorestore utility to restore your DB dump.

  • If you restore a dump from the VM in Yandex Cloud:

    mongorestore --host <DBMS_server_address> \
                 --port <port> \
                 --username <username> \
                 --password "<password>" \
                 -j <number_of_streams> \
                 --authenticationDatabase <DB_name> \
                 --nsInclude '*.*' /tmp/db_dump
    
  • If you restore from a dump stored on a server outside Yandex Cloud, SSL parameters must be explicitly set for mongorestore:

    mongorestore --host <DBMS_server_address> \
                 --port <port> \
                 --ssl \
                 --sslCAFile <path_to_certificate_file> \
                 --username <username> \
                 --password "<password>" \
                 -j <number_of_streams> \
                 --authenticationDatabase <DB_name> \
                 --nsInclude '*.*' ~/db_dump
    
  • If you want to transfer specific collections, set the --nsInclude and --nsExclude flags, specifying the namespaces to include or exclude for the collections to restore.
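
As an illustration of the --nsInclude and --nsExclude flags, the sketch below restores every collection of one database except those whose names start with a given prefix. It follows the VM variant of the command (no SSL parameters). The database name db1, the tmp_ prefix, and the wrapper function restore_selected are hypothetical and only serve the example.

```shell
# Hypothetical wrapper: restore only selected collections from a dump.
# Arguments: host, port, username, password, authentication DB, dump directory.
restore_selected() (
    host="$1"; port="$2"; user="$3"; password="$4"; auth_db="$5"; dump_dir="$6"

    # Include all collections of db1, then exclude the ones named tmp_*.
    # Namespace patterns are quoted so the shell does not expand the asterisk.
    mongorestore --host "$host" \
                 --port "$port" \
                 --username "$user" \
                 --password "$password" \
                 --authenticationDatabase "$auth_db" \
                 --nsInclude 'db1.*' \
                 --nsExclude 'db1.tmp_*' \
                 "$dump_dir"
)

# Example call (all values are placeholders to replace):
#   restore_selected target.example.net 27018 user1 'secret' db1 /tmp/db_dump
```

When restoring from a server outside Yandex Cloud, the same flags can be combined with the --ssl and --sslCAFile parameters shown above.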
