Yandex project
© 2025 Yandex.Cloud LLC
Yandex Managed Service for Apache Kafka®

In this article:

  • Data migration using Yandex Managed Service for Apache Kafka® Connector
  • Required paid resources
  • Create a cluster and a connector
  • Check the target cluster topic for data
  • Migrating data using MirrorMaker
  • Required paid resources
  • Getting started
  • Configure MirrorMaker
  • Start replication
  • Check the target cluster topic for data
  • Delete the resources you created

Migrating a database from a third-party Apache Kafka® cluster

Written by
Yandex Cloud
Updated at May 5, 2025

There are two ways to migrate topics from an Apache Kafka® source cluster to a Managed Service for Apache Kafka® target cluster:

  • Using the built-in Yandex Managed Service for Apache Kafka® MirrorMaker connector.

    This method is easy to configure and does not require you to create an intermediate VM.

  • Using the MirrorMaker 2.0 utility.

    To use this method, first install and configure the utility on an intermediate VM. Use it only if you cannot migrate data with the built-in MirrorMaker connector.

Both methods are also suitable for migrating a Managed Service for Apache Kafka® cluster with one host to a different availability zone.

Data migration using Yandex Managed Service for Apache Kafka® Connector

  1. Create a connector.
  2. Check the target cluster topic for data.

Required paid resources

The cost of supporting this setup includes:

  • Managed Service for Apache Kafka® cluster fee: Using computing resources allocated to hosts (including ZooKeeper hosts) and disk space (see Apache Kafka® pricing).
  • Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).

Create a cluster and a connector

Follow either the manual steps or the Terraform steps.

Manually
  1. Prepare the target cluster:

    • Create an admin user named admin-cloud.
    • Enable the Auto create topics enable setting.
    • Configure security groups if required to connect to the target cluster.
  2. In the source cluster, create a user named admin-source that is authorized to manage topics via the Admin API.

  3. Make sure that the network hosting the source cluster is configured to allow source cluster connections from the internet.

  4. For the target cluster, create a connector of the MirrorMaker type, configured as follows:

    • Topics: List of topics to migrate. You can also specify a regular expression for selecting topics. To migrate all topics, put .*.

    • Under Source cluster, specify the parameters for connecting to the source cluster:

      • Alias: Source cluster prefix in the connector settings. The default value is source. Topics in the target cluster will be created with the specified prefix.

      • Bootstrap servers: Comma-separated list of source cluster broker host FQDNs with port numbers, for example:

        FQDN1:9091,FQDN2:9091,...,FQDN:9091
        
      • SASL username, SASL password: Username and password of the previously created admin-source user.

      • SASL mechanism: Username and password encryption mechanism, SCRAM-SHA-512.

      • Security protocol: Select a connector connection protocol:

        • SASL_PLAINTEXT: For connecting to the source cluster without SSL.
        • SASL_SSL: For SSL connections to the source cluster.
    • Under Target cluster, select Use this cluster.
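
The Bootstrap servers value described above is a comma-separated list of host:port pairs. As a minimal sketch, it can be assembled from a list of broker FQDNs like this. The bootstrap_servers function name and the example.net hostnames are hypothetical and only serve as an illustration.

```shell
# Hypothetical helper: join broker FQDNs into a bootstrap servers string.
# Usage: bootstrap_servers PORT FQDN [FQDN...]
bootstrap_servers() {
  port="$1"
  shift
  result=""
  for fqdn in "$@"; do
    # Append "host:port"; put a comma before every entry except the first.
    result="${result:+${result},}${fqdn}:${port}"
  done
  printf '%s\n' "$result"
}
```

For example, bootstrap_servers 9091 broker1.example.net broker2.example.net prints broker1.example.net:9091,broker2.example.net:9091, which can be pasted into the Bootstrap servers field.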

Terraform

  1. If you do not have Terraform yet, install it.

  2. Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.

  3. Configure and initialize a provider. There is no need to create a provider configuration file manually: you can download it.

  4. Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.

  5. Download the kafka-mirrormaker-connector.tf configuration file to the same working directory.

    This file describes:

    • Network.
    • Subnet.
    • Default security group and rules required to connect to the cluster from the internet.
    • Managed Service for Apache Kafka® target cluster with the Auto create topics enable setting on.
    • admin-cloud admin user for the target cluster.
    • MirrorMaker connector for the target cluster.
  6. Specify the following in the kafka-mirrormaker-connector.tf file:

    • Source cluster username and passwords for the source and target cluster users.
    • FQDNs of the source cluster broker hosts.
    • Source and target cluster aliases.
    • Filter template for the topics to be transferred.
    • Apache Kafka® version.
  7. Make sure the Terraform configuration files are correct using this command:

    terraform validate
    

    If there are any errors in the configuration files, Terraform will point them out.

  8. Create the required infrastructure:

    1. Run this command to view the planned changes:

      terraform plan
      

      If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.

    2. If everything looks correct, apply the changes:

      1. Run this command:

        terraform apply
        
      2. Confirm updating the resources.

      3. Wait for the operation to complete.

    All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.

Check the target cluster topic for data

  1. Connect to the target cluster topic using kafkacat. Add the source prefix to the source cluster topic name: for example, the mytopic topic is migrated to the target cluster as source.mytopic.
  2. Make sure the console displays messages from the source cluster topic.
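
The check above can be sketched with kafkacat as follows. This is an illustration under assumptions, not the only way to run the check: it assumes the target cluster accepts SASL_SSL connections on port 9091 with SCRAM-SHA-512, as in the settings used elsewhere in this tutorial. The replicated_topic and consume_replicated function names are hypothetical, and the broker FQDN, user password, and CA certificate path are placeholders you supply.

```shell
# Build the replicated topic name: <source alias>.<source topic>.
replicated_topic() {
  printf '%s.%s\n' "$1" "$2"
}

# Read the replicated topic from the target cluster and exit at the end of the partition.
# Usage: consume_replicated BROKER_FQDN SOURCE_ALIAS TOPIC USER PASSWORD CA_CERT_PATH
consume_replicated() {
  broker="$1"; src_alias="$2"; topic="$3"
  user="$4"; password="$5"; ca_cert="$6"
  kafkacat -C \
    -b "${broker}:9091" \
    -t "$(replicated_topic "$src_alias" "$topic")" \
    -X security.protocol=SASL_SSL \
    -X sasl.mechanisms=SCRAM-SHA-512 \
    -X sasl.username="$user" \
    -X sasl.password="$password" \
    -X ssl.ca.location="$ca_cert" \
    -o beginning -e
}
```

Calling consume_replicated with the source alias and the mytopic topic name reads from source.mytopic, which matches the naming described in step 1.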

Migrating data using MirrorMaker

  1. Configure MirrorMaker.
  2. Start replication.
  3. Check the target cluster topic for data.

If you no longer need the resources you created, delete them.

Required paid resources

The cost of supporting this setup includes:

  • Managed Service for Apache Kafka® cluster fee: using computing resources allocated to hosts (including ZooKeeper hosts) and disk space (see Apache Kafka® pricing).
  • Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
  • VM fee: using computing resources, storage, and, optionally, public IP address (see Compute Cloud pricing).

Getting started

Prepare the infrastructure

Follow either the manual steps or the Terraform steps.

Manually
  1. Create a Managed Service for Apache Kafka® target cluster:

    • With the admin-cloud admin user.
    • With Auto create topics enable activated.
  2. Create a new Linux VM for MirrorMaker in the same network as the target cluster. To connect to the cluster from your local machine rather than from the Yandex Cloud network, enable public access when creating it.

Terraform

  1. If you do not have Terraform yet, install it.

  2. Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.

  3. Configure and initialize a provider. There is no need to create a provider configuration file manually: you can download it.

  4. Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.

  5. Download the kafka-mirror-maker.tf configuration file to the same working directory.

    This file describes:

    • Network.
    • Subnet.
    • Default security group and rules required to connect to the cluster and VM from the internet.
    • Managed Service for Apache Kafka® cluster with the Auto create topics enable setting on.
    • admin-cloud Apache Kafka® admin user.
    • Virtual machine with public internet access.
  6. Specify the following in the kafka-mirror-maker.tf file:

    • Apache Kafka® version.
    • Apache Kafka® admin user password.
    • ID of the public image with Ubuntu and no GPU, e.g., for Ubuntu 20.04 LTS.
    • Username and path to the public key file for accessing the virtual machine. By default, the specified username is ignored in the image that is currently used. A user with the ubuntu username is created instead. Use it to connect to the VM.
  7. Make sure the Terraform configuration files are correct using this command:

    terraform validate
    

    If there are any errors in the configuration files, Terraform will point them out.

  8. Create the required infrastructure:

    1. Run this command to view the planned changes:

      terraform plan
      

      If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.

    2. If everything looks correct, apply the changes:

      1. Run this command:

        terraform apply
        
      2. Confirm updating the resources.

      3. Wait for the operation to complete.

    All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.

Configure additional settings

  1. In the source cluster, create a user named admin-source that is authorized to manage topics via the Admin API.

  2. Connect to the virtual machine over SSH.

    1. Install the JDK:

      sudo apt update && sudo apt install --yes default-jdk
      
    2. Download and unpack the Apache Kafka® archive with the same version number as the version installed in the target cluster. For example, for version 2.8:

      wget https://archive.apache.org/dist/kafka/2.8.0/kafka_2.12-2.8.0.tgz && \
      tar -xvf kafka_2.12-2.8.0.tgz
      
    3. Install the kafkacat utility:

      sudo apt update && sudo apt install --yes kafkacat
      

      Make sure that you can use it to connect to the source and target clusters via SSL.

  3. Configure a firewall and security groups if required to connect MirrorMaker to the target and source clusters.
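
Because the downloaded Apache Kafka® archive must match the version installed in the target cluster, it can help to derive the download URL from the version number instead of editing it by hand. The sketch below infers the URL pattern from the single 2.8.0 example above. The kafka_archive_url function name is hypothetical, the Scala version default of 2.12 is taken from that example, and not every version combination is guaranteed to exist in the archive.

```shell
# Hypothetical helper: build the archive URL for a given Kafka version.
# Usage: kafka_archive_url KAFKA_VERSION [SCALA_VERSION]
kafka_archive_url() {
  kafka_version="$1"
  scala_version="${2:-2.12}"   # default taken from the 2.8.0 example
  printf 'https://archive.apache.org/dist/kafka/%s/kafka_%s-%s.tgz\n' \
    "$kafka_version" "$scala_version" "$kafka_version"
}
```

For example, wget "$(kafka_archive_url 2.8.0)" downloads the same archive as the command in the step above.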

Configure MirrorMaker

  1. Connect to the MirrorMaker VM over SSH.

  2. Download an SSL certificate for connecting to the Managed Service for Apache Kafka® cluster.

  3. In the home directory, create a folder named mirror-maker to store Java Keystore certificates and MirrorMaker configuration files:

    mkdir --parents /home/<home_directory>/mirror-maker
    
  4. Choose a password of at least 6 characters for the certificate store, create the store, and add to it the SSL certificate for connecting to the cluster:

    sudo keytool --noprompt -importcert -alias YandexCA \
       -file /usr/local/share/ca-certificates/Yandex/YandexInternalRootCA.crt \
       -keystore /home/<home_directory>/mirror-maker/keystore \
       -storepass <certificate_store_password>
    
  5. Create a MirrorMaker configuration file named mm2.properties in the mirror-maker folder:

    # Kafka clusters
    clusters=cloud, source
    source.bootstrap.servers=<source_cluster_broker_FQDN>:9092
    cloud.bootstrap.servers=<target_cluster_broker_1_FQDN>:9091, ..., <target_cluster_broker_N_FQDN>:9091
    
    # Source and target cluster settings
    source->cloud.enabled=true
    cloud->source.enabled=false
    source.cluster.alias=source
    cloud.cluster.alias=cloud
    
    # Internal topics settings
    source.config.storage.replication.factor=<R>
    source.status.storage.replication.factor=<R>
    source.offset.storage.replication.factor=<R>
    source.offsets.topic.replication.factor=<R>
    source.errors.deadletterqueue.topic.replication.factor=<R>
    source.offset-syncs.topic.replication.factor=<R>
    source.heartbeats.topic.replication.factor=<R>
    source.checkpoints.topic.replication.factor=<R>
    source.transaction.state.log.replication.factor=<R>
    cloud.config.storage.replication.factor=<R>
    cloud.status.storage.replication.factor=<R>
    cloud.offset.storage.replication.factor=<R>
    cloud.offsets.topic.replication.factor=<R>
    cloud.errors.deadletterqueue.topic.replication.factor=<R>
    cloud.offset-syncs.topic.replication.factor=<R>
    cloud.heartbeats.topic.replication.factor=<R>
    cloud.checkpoints.topic.replication.factor=<R>
    cloud.transaction.state.log.replication.factor=<R>
    
    # Topics
    topics=.*
    groups=.*
    topics.blacklist=.*[\-\.]internal, .*\.replica, __consumer_offsets
    groups.blacklist=console-consumer-.*, connect-.*, __.*
    replication.factor=<M>
    refresh.topics.enabled=true
    sync.topic.configs.enabled=true
    refresh.topics.interval.seconds=10
    
    # Tasks
    tasks.max=<T>
    
    # Source cluster authentication parameters. Comment out if no authentication required
    source.client.id=mm2_consumer_test
    source.group.id=mm2_consumer_group
    source.security.protocol=SASL_PLAINTEXT
    source.sasl.mechanism=SCRAM-SHA-512
    source.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="admin-source" password="<password>";
    
    # Target cluster authentication parameters
    cloud.client.id=mm2_producer_test
    cloud.group.id=mm2_producer_group
    cloud.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
    cloud.ssl.truststore.location=/home/<home_directory>/mirror-maker/keystore
    cloud.ssl.truststore.password=<certificate_store_password>
    cloud.ssl.protocol=TLS
    cloud.security.protocol=SASL_SSL
    cloud.sasl.mechanism=SCRAM-SHA-512
    cloud.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="admin-cloud" password="<password>";
    
    # Enable heartbeats and checkpoints
    source->cloud.emit.heartbeats.enabled=true
    source->cloud.emit.checkpoints.enabled=true
    

    Notes on MirrorMaker configuration:

    • It performs one-way replication (source->cloud.enabled = true, cloud->source.enabled = false).
    • In the topics parameter, list the topics you want to migrate. You can also specify a regular expression for selecting topics. With topics=.*, as in this configuration, all topics are replicated.
    • Topic names in the target cluster are the source topic names prefixed with the source cluster alias: for example, the mytopic topic is migrated as source.mytopic.
    • <R> is the parameter that sets the replication factor for MirrorMaker service topics. The value of this parameter should not exceed the smaller of the number of brokers in the source cluster or the number of brokers in the target cluster.
    • <M> is the default replication factor defined for topics in the target cluster.
    • <T> is the number of concurrent MirrorMaker processes. To distribute replication load evenly, we recommend a value of at least 2. For more information, see the relevant Apache Kafka® documentation.

    You can get the Managed Service for Apache Kafka® broker FQDNs along with the list of hosts in the cluster.
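
The upper bound for <R> described in the notes above is the smaller of the two broker counts. A minimal sketch of that calculation follows. The max_service_topic_rf function name is hypothetical, and the broker counts are values you look up yourself for each cluster.

```shell
# Hypothetical helper: largest allowed replication factor for MirrorMaker service topics.
# Usage: max_service_topic_rf SOURCE_BROKER_COUNT TARGET_BROKER_COUNT
max_service_topic_rf() {
  source_brokers="$1"
  target_brokers="$2"
  # <R> must not exceed the broker count of the smaller cluster.
  if [ "$source_brokers" -lt "$target_brokers" ]; then
    printf '%s\n' "$source_brokers"
  else
    printf '%s\n' "$target_brokers"
  fi
}
```

For example, with three brokers in the source cluster and one broker in the target cluster, max_service_topic_rf 3 1 prints 1, so <R> has to be 1.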

Start replication

Launch MirrorMaker on the VM as follows:

<Apache_Kafka_installation_path>/bin/connect-mirror-maker.sh /home/<home_directory>/mirror-maker/mm2.properties
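
Before launching MirrorMaker, it may be worth checking that mm2.properties defines the cluster connection keys and that no angle-bracket placeholders such as <R> or <password> were left unfilled. The following is a rough sketch of such a check, not part of MirrorMaker itself. The check_mm2_config function name is hypothetical, and the check only looks at the three keys listed in the loop.

```shell
# Hypothetical pre-flight check for the MirrorMaker configuration file.
# Usage: check_mm2_config PATH_TO_MM2_PROPERTIES
check_mm2_config() {
  config="$1"
  [ -f "$config" ] || { echo "missing file: $config" >&2; return 1; }
  # Each of these keys must appear at the start of some line.
  for key in clusters source.bootstrap.servers cloud.bootstrap.servers; do
    grep -q "^${key}=" "$config" || { echo "missing key: $key" >&2; return 1; }
  done
  # Fail if a <placeholder> remains on a line with no '#' before it.
  if grep -q '^[^#]*<[^>]*>' "$config"; then
    echo "unfilled placeholder in $config" >&2
    return 1
  fi
}
```

If the check succeeds for /home/<home_directory>/mirror-maker/mm2.properties, run the launch command shown above.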

Check the target cluster topic for data

  1. Connect to the target cluster topic using kafkacat. Add the source prefix to the source cluster topic name: for example, the mytopic topic is migrated to the target cluster as source.mytopic.
  2. Make sure the console displays messages from the source cluster topic.

To learn more about MirrorMaker 2.0, see the Apache Kafka® documentation.

Delete the resources you created

Delete the resources you no longer need to avoid paying for them:

Follow either the manual steps or the Terraform steps.

Manually
  • Delete the Yandex Managed Service for Apache Kafka® cluster.
  • Delete the virtual machine.
  • If you reserved public static IP addresses, release and delete them.
Terraform

  1. In the terminal window, go to the directory containing the infrastructure plan.

    Warning

    Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.

  2. Delete resources:

    1. Run this command:

      terraform destroy
      
    2. Confirm deleting the resources and wait for the operation to complete.

    All the resources described in the Terraform manifests will be deleted.
