Migrating a database from a third-party PostgreSQL cluster to Managed Service for PostgreSQL

Updated at July 24, 2025

Transferring data using Yandex Data Transfer
- Required paid resources
- Transfer the data
Migrating data using logical replication
Transferring data by creating and restoring a logical dump

There are three ways to migrate data from a third-party source cluster to a Managed Service for PostgreSQL target cluster:

1. Transferring data using Yandex Data Transfer.

  This method enables you to:
  - Avoid creating an intermediate VM or opening internet access to your Managed Service for PostgreSQL target cluster.
  - Migrate the database completely without interrupting user service.
  - Migrate from older PostgreSQL versions to newer ones, including upgrading your cluster from PostgreSQL version 15 to 16.

  To use this method, allow connecting to the source cluster from the internet.

  For more information, see Problems addressed by Yandex Data Transfer.
2. Migrating data using logical replication.

  Logical replication uses the subscriptions mechanism. It allows you to migrate data to the target cluster with minimal downtime.

  Use this method only if, for some reason, it is not possible to migrate data using Yandex Data Transfer.
3. Transferring data by creating and restoring a logical dump.

  A logical dump is a file containing a set of commands that, when run in sequence, restore the state of a database. It is created using the pg_dump utility. To get a complete logical dump, switch the source cluster to read-only before creating it.

  Use this method only if, for some reason, it is not possible to transfer data using any of the above methods.

Warning

Users are not transferred automatically to a Managed Service for PostgreSQL cluster. You need to create them again in the new cluster.

Transferring data using Yandex Data Transfer

Required paid resources

The cost of the resources used in this scenario includes:

Managed Service for PostgreSQL cluster fee: Using DB hosts and disk space (see Managed Service for PostgreSQL pricing).
Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
Transfer fee: Using computing resources and the number of transferred data rows (see Data Transfer pricing).

Transfer the data

Prepare the source cluster.
Set up your infrastructure:
Manually

1. Create a Managed Service for PostgreSQL target cluster in any suitable configuration. The following requirements apply:
  - The PostgreSQL version must be the same as or higher than in the source cluster. Migrating to an older PostgreSQL version is not supported.
  - When creating a cluster, specify the same database name as in the source cluster.
  - Enable the same PostgreSQL extensions as in the source cluster.
2. Prepare the target cluster.
3. Create a source endpoint with the following parameters:
  - Database type: PostgreSQL
  - Endpoint parameters → Connection settings: Custom installation
  Specify the parameters for connecting to the source cluster.
4. Create a target endpoint with the following parameters:
  - Database type: PostgreSQL
  - Endpoint parameters → Connection settings: Managed Service for PostgreSQL cluster
  Specify the ID of the target cluster.
5. Create a transfer of the Snapshot and increment type that will use the created endpoints.
6. Activate the transfer.
  
  Warning
  
  Do not change the data schema in the source or target cluster while the transfer is running. To learn more, see Working with databases during transfer.

Terraform

1. If you do not have Terraform yet, install it.
2. Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
3. Configure and initialize a provider. You do not need to create the provider configuration file manually: you can download it.
4. Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
5. Download the data-transfer-pgsql-mpg.tf configuration file to the same working directory.
  
  This file describes:
  - Network.
  - Subnet.
  - Security group and the rule required to connect to a cluster.
  - Managed Service for PostgreSQL target cluster.
  - Source endpoint.
  - Target endpoint.
  - Transfer.
6. Specify the following in the data-transfer-pgsql-mpg.tf file:
  - Source endpoint parameters.
  - pg-extensions: List of PostgreSQL extensions in the source cluster.
  - Target cluster parameters also used as target endpoint parameters:
    
    target_pgsql_version: PostgreSQL version. Must be the same or higher than in the source cluster.
    
    target_user and target_password: Username and password of the database owner.
7. Make sure the Terraform configuration files are correct using this command:
  terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
8. Create the required infrastructure:
  1. Run this command to view the planned changes:
    
    terraform plan
    
    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  2. If everything looks correct, apply the changes:
    
    Run this command:
    
    terraform apply
    
    Confirm updating the resources.
    
    Wait for the operation to complete.
  All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
  
  Once created, your transfer will be activated automatically.
Wait for the transfer status to change to Replicating.
Switch the source cluster to read-only.
On the transfer monitoring page, wait for the Maximum data transfer delay metric to decrease to zero. This means that all changes that occurred in the source cluster after data copying was completed are transferred to the target cluster.
Deactivate the transfer and wait for its status to change to Stopped.

For more information about transfer statuses, see Transfer lifecycle.
Transfer the load to the target cluster.
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
Manually created resources
- Delete the Managed Service for PostgreSQL cluster.
- Delete the stopped transfer.
- Delete the endpoints for both the source and target.

Resources created with Terraform
1. In the terminal window, go to the directory containing the infrastructure plan.
  
  Warning
  
  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
2. Delete resources:
  1. Run this command:
    
    terraform destroy
  2. Confirm deleting the resources and wait for the operation to complete.
  All the resources described in the Terraform manifests will be deleted.

Migrating data using logical replication

Logical replication is supported as of PostgreSQL version 10. Besides migrating data between the same PostgreSQL versions, logical replication allows you to migrate to newer PostgreSQL versions.

In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner (a user created together with the cluster) and users with the mdb_admin role for the cluster.

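Migration stages:

1. Set up the source cluster.
2. Export the database schema in the source cluster.
3. Restore the database schema in the target cluster.
4. Create a publication and subscription.
5. Transfer PostgreSQL sequences after replication.
6. Delete the subscription and switch over the load.

If you no longer need the resources you created, delete them.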

Required paid resources

The cost of the resources used in this scenario includes:

Managed Service for PostgreSQL cluster fee: Using computing resources allocated to hosts and disk space (see Managed Service for PostgreSQL pricing).
Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).

Getting started

Create the required resources:

Manually

Create a Managed Service for PostgreSQL target cluster in any suitable configuration. The following requirements apply:

- The PostgreSQL version must be the same as or higher than in the source cluster. Migrating to an older PostgreSQL version is not supported.
- When creating a cluster, specify the same database name as in the source cluster.
- Enable the same PostgreSQL extensions as in the source cluster.

Terraform

If you do not have Terraform yet, install it.
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
Configure and initialize a provider. You do not need to create the provider configuration file manually: you can download it.
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
Download the data-migration-pgsql-mpg.tf configuration file to the same working directory.

This file describes:
- Network.
- Subnet.
- Security group and the rule required to connect to a cluster.
- Managed Service for PostgreSQL cluster with public internet access.
Specify the following in the data-migration-pgsql-mpg.tf file:
- target_db_name: Database name.
- pg-extensions: List of PostgreSQL extensions in the source cluster.
- Target cluster parameters:
  - target_pgsql_version: PostgreSQL version. Must be the same or higher than in the source cluster.
  - target_user and target_password: Username and password of the database owner.
Make sure the Terraform configuration files are correct using this command:
```
terraform validate
```
Terraform will show any errors found in your configuration files.
Create the required infrastructure:
1. Run this command to view the planned changes:
```
terraform plan
```
  If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
2. If everything looks correct, apply the changes:
  1. Run this command:
```
terraform apply
```
  2. Confirm updating the resources.
  3. Wait for the operation to complete.
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.

Set up the source cluster

Make changes to the source cluster configuration and authentication settings. To do this, edit the postgresql.conf and pg_hba.conf files (on Debian and Ubuntu, they reside in the /etc/postgresql/<PostgreSQL_version>/main/ directory by default):
1. Set the maximum number of user connections. To do this, edit the max_connections parameter in postgresql.conf:
```
max_connections = <number_of_connections>
```
  Where <number_of_connections> is the maximum number of connections. The value of this parameter must be no less than N + 1 where N is the number of all possible connections to your PostgreSQL installation.
  
  The 1 in N + 1 provides an extra connection for the subscription to use for logical replication. If you plan to use multiple subscriptions, specify the relevant value.
  
  You can see the current number of connections in the pg_stat_activity system view:
```
SELECT count(*) FROM pg_stat_activity;
```
2. Set the logging level for the Write Ahead Log (WAL). To do this, set the wal_level value to logical in postgresql.conf:
```
wal_level = logical
```
3. Optionally, enable SSL to encrypt data in transit. Older PostgreSQL versions could also compress data over SSL, but recent versions ignore the compression setting. To enable SSL, set the appropriate value in postgresql.conf:
```
ssl = on
```
4. Enable connections to the cluster. To do this, edit the listen_addresses parameter in postgresql.conf. For example, you can enable the source cluster to accept connection requests from all IP addresses:
```
listen_addresses = '*'
```
5. Set up authentication in the pg_hba.conf file:
  With SSL:
  
  hostssl all all <connection_IP_address> md5
  hostssl replication all <connection_IP_address> md5
  
  Without SSL:
  
  host all all <connection_IP_address> md5
  host replication all <connection_IP_address> md5
  
  Where <connection_IP_address> can be either an exact IP address or a range of IP addresses. For example, to allow access from the Yandex Cloud network, you can specify all public IP addresses in Yandex Cloud.
If a firewall is enabled in the source cluster, allow incoming connections from the relevant addresses.
To apply the settings, restart PostgreSQL:
```
sudo systemctl restart postgresql
```
Check the PostgreSQL status after restarting:
```
sudo systemctl status postgresql
```
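
The settings from this section can be gathered into a small helper before you edit the files. The following Python sketch is only an illustration: the function names `min_max_connections` and `render_source_settings` are hypothetical, the address in the example is a placeholder from a documentation range, and the sketch assumes one extra connection per subscription. It prints fragments that you would still review and merge into postgresql.conf and pg_hba.conf by hand.

```python
def min_max_connections(possible_connections: int, subscriptions: int = 1) -> int:
    """Return the smallest acceptable max_connections value.

    Applies the N + 1 rule from this section: N possible connections
    plus one extra connection per logical replication subscription
    (the per-subscription count is an assumption of this sketch).
    """
    if possible_connections < 0 or subscriptions < 1:
        raise ValueError("expected N >= 0 and at least one subscription")
    return possible_connections + subscriptions


def render_source_settings(possible_connections: int, client_address: str,
                           use_ssl: bool = True, subscriptions: int = 1) -> dict:
    """Build postgresql.conf and pg_hba.conf fragments as plain text."""
    conf_lines = [
        f"max_connections = {min_max_connections(possible_connections, subscriptions)}",
        "wal_level = logical",
        "listen_addresses = '*'",
    ]
    if use_ssl:
        conf_lines.append("ssl = on")

    # hostssl entries match only SSL connections; host entries match
    # TCP connections with or without SSL.
    entry_type = "hostssl" if use_ssl else "host"
    hba_lines = [
        f"{entry_type} all all {client_address} md5",
        f"{entry_type} replication all {client_address} md5",
    ]
    return {
        "postgresql.conf": "\n".join(conf_lines),
        "pg_hba.conf": "\n".join(hba_lines),
    }


if __name__ == "__main__":
    # 203.0.113.0/24 is a placeholder range, not a real client address.
    for file_name, fragment in render_source_settings(100, "203.0.113.0/24").items():
        print(f"# {file_name}\n{fragment}\n")
```

Keeping the N + 1 calculation in its own function makes it easy to rerun when the number of planned subscriptions changes.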

Export the database schema in the source cluster

Use the pg_dump utility to create a file with the database schema to apply in the target cluster.

pg_dump -h <IP_address_or_FQDN_for_master_host_of_source_cluster> \
        -U <user_name> \
        -p <port> \
        --schema-only \
        --no-privileges \
        --no-subscriptions \
        -d <DB_name> \
        -Fd -f /tmp/db_dump

This export command skips all data associated with privileges and roles to avoid conflicts with the database settings in Yandex Cloud. If your database requires additional users, create them.

Restore the database schema in the target cluster

Use the pg_restore utility to restore the database schema in the target cluster:

pg_restore -h <IP_address_or_FQDN_for_master_host_of_target_cluster> \
           -U <user_name> \
           -p 6432 \
           -Fd -v \
           --single-transaction \
           -s --no-privileges \
           -d <DB_name> /tmp/db_dump

Create a publication and subscription

For logical replication to work, create a publication (a group of logically replicated tables) in the source cluster and a subscription (a description of connection to another database) on the target cluster.

On the source cluster, create a publication for all the database tables. When migrating multiple databases, you need to create a separate publication for each of them.

Note

Creating a publication for all tables requires superuser privileges. Creating a publication for selected tables does not. For more information about creating publications, see the PostgreSQL documentation.

Query:
```
CREATE PUBLICATION p_data_migration FOR ALL TABLES;
```
On the Managed Service for PostgreSQL cluster host, create a subscription with the publication connection string. For more information about creating subscriptions, see the PostgreSQL documentation.

Request with SSL enabled:
```
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source_cluster_address> port=<port> user=<username> sslmode=verify-full dbname=<DB_name>' PUBLICATION p_data_migration;
```
Without SSL:
```
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source_cluster_address> port=<port> user=<username> sslmode=disable dbname=<DB_name>' PUBLICATION p_data_migration;
```
Tip

By default, CREATE SUBSCRIPTION also creates a replication slot. To link a subscription with an existing replication slot without creating a new one, add the create_slot = false parameter to the request.
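
If you generate these statements from a script, a builder such as the one below may help keep the SSL and slot options consistent. This is a minimal sketch, not part of the service tooling: the function name `build_create_subscription` is hypothetical, the default subscription and publication names are taken from the examples above, and identifiers are not validated.

```python
def build_create_subscription(host: str, port: int, user: str, dbname: str,
                              use_ssl: bool = True, create_slot: bool = True,
                              subscription: str = "s_data_migration",
                              publication: str = "p_data_migration") -> str:
    """Return a CREATE SUBSCRIPTION statement for the given source cluster."""
    sslmode = "verify-full" if use_ssl else "disable"
    connection = (f"host={host} port={port} user={user} "
                  f"sslmode={sslmode} dbname={dbname}")
    # Single quotes inside an SQL string literal are escaped by doubling them.
    connection = connection.replace("'", "''")
    statement = (f"CREATE SUBSCRIPTION {subscription} "
                 f"CONNECTION '{connection}' PUBLICATION {publication}")
    if not create_slot:
        # Link the subscription to an existing replication slot
        # instead of creating a new one.
        statement += " WITH (create_slot = false)"
    return statement + ";"


if __name__ == "__main__":
    # src.example.com and the user name are placeholders.
    print(build_create_subscription("src.example.com", 5432, "repl_user", "db1"))
```

The function only returns text. You would still run the resulting statement on the target cluster yourself, for example in a psql session.
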
To get the replication status, query the pg_subscription_rel system catalog. You can get the general replication status from pg_stat_subscription on the target cluster and from pg_stat_replication on the source cluster.
```
SELECT * FROM pg_subscription_rel;
```
First of all, check the srsubstate field. There, r means the synchronization is complete and the databases are ready for replication.
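
To interpret the srsubstate values in bulk, you could map the codes to their meanings. The Python sketch below assumes you have already fetched rows as (table name, state) pairs, for example with a query such as `SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel;`. The function names are hypothetical, and the code meanings follow the PostgreSQL documentation for this catalog.

```python
# Codes stored in pg_subscription_rel.srsubstate, per the PostgreSQL documentation.
SRSUBSTATE_MEANING = {
    "i": "initialize",
    "d": "data is being copied",
    "f": "finished table copy",
    "s": "synchronized",
    "r": "ready (normal replication)",
}


def pending_tables(rows):
    """Return (table, description) pairs for tables that are not ready yet.

    rows is an iterable of (table_name, srsubstate) tuples.
    """
    pending = []
    for table_name, state in rows:
        if state != "r":
            description = SRSUBSTATE_MEANING.get(state, f"unknown state {state!r}")
            pending.append((table_name, description))
    return pending


def replication_ready(rows) -> bool:
    """Return True only if there is at least one table and all are in state r."""
    rows = list(rows)
    # An empty result means the subscription has no tables yet,
    # so it is not treated as ready.
    return bool(rows) and not pending_tables(rows)


if __name__ == "__main__":
    sample = [("public.orders", "r"), ("public.items", "d")]  # made-up rows
    print(pending_tables(sample))
    print(replication_ready(sample))
```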

Transfer PostgreSQL sequences after replication

To complete synchronization of the source cluster and the target cluster:

Switch the source cluster to read-only.
Create a dump of the PostgreSQL sequences in the source cluster:
```
pg_dump -h <IP_address_or_FQDN_for_master_host_of_source_cluster> \
        -U <user_name> \
        -p <port> \
        -d <DB_name> \
        --data-only -t '*.*_seq' > /tmp/seq-data.sql
```
Pay attention to the *.*_seq pattern used. If the database you are migrating has sequences that do not match this pattern, enter a different pattern to export them.

For more information about patterns, see the PostgreSQL documentation.
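
To see which sequences the pattern would miss, you could check your sequence names against it before creating the dump. The Python sketch below is only an approximation of pg_dump pattern matching: it uses `fnmatch` wildcards, assumes lowercase, unquoted, schema-qualified names, and the function names and sample sequence names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_table_pattern(qualified_name: str, pattern: str = "*.*_seq") -> bool:
    """Approximate a schema.name pattern match with shell-style wildcards."""
    schema, _, name = qualified_name.partition(".")
    schema_pattern, _, name_pattern = pattern.partition(".")
    # The schema part and the name part are matched separately,
    # because the dot separates them in the pattern.
    return fnmatchcase(schema, schema_pattern) and fnmatchcase(name, name_pattern)


def sequences_missed(sequence_names, pattern: str = "*.*_seq"):
    """Return the sequences that the pattern would not export."""
    return [n for n in sequence_names if not matches_table_pattern(n, pattern)]


if __name__ == "__main__":
    names = ["public.orders_id_seq", "public.invoice_counter", "billing.items_id_seq"]
    print(sequences_missed(names))
```

Any name this check reports needs its own pattern in the pg_dump command, for example an additional -t option.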

Restore the dump with sequences in the target cluster:

psql -h <IP_address_or_FQDN_for_master_host_of_target_cluster> \
     -U <user_name> \
     -p 6432 \
     -d <DB_name> \
     < /tmp/seq-data.sql

Delete the subscription and switch over the load

Delete the subscription in the target cluster:
```
DROP SUBSCRIPTION s_data_migration;
```
Transfer the load to the target cluster.

Delete the resources you created

Delete the resources you no longer need to avoid paying for them:

Manually

Delete the Managed Service for PostgreSQL cluster.

Terraform

In the terminal window, go to the directory containing the infrastructure plan.

Warning

Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
Delete resources:
1. Run this command:
```
terraform destroy
```
2. Confirm deleting the resources and wait for the operation to complete.
All the resources described in the Terraform manifests will be deleted.

Transferring data by creating and restoring a logical dump

Use pg_dump to create a dump of the database in the source cluster. To restore the dump in the target cluster, use pg_restore.

Note

This may require the pg_repack database extension.

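Migration stages:

1. Create a database dump.
2. (Optional) Upload the dump to a virtual machine in Yandex Cloud.
3. Restore data from the dump to the target cluster.

If you no longer need the resources you created, delete them.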

Required paid resources

The cost of the resources used in this scenario includes:

Managed Service for PostgreSQL cluster fee: Using computing resources allocated to hosts and disk space (see Managed Service for PostgreSQL pricing).
Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
VM fee: Using computing resources, OS, and storage (see Compute Cloud pricing).
Fee for using a public IP address for a VM (see Virtual Private Cloud pricing).

Getting started

Create the required resources:

Manually

Create a Managed Service for PostgreSQL target cluster in any suitable configuration. The following parameters must be the same as in the source cluster:
- PostgreSQL version.
- Username.
  
  Note
  
  You may use different usernames for the source and the target. This, however, may result in an error when restoring the dump. For more information, see Moving and restoring a PostgreSQL cluster.
- PostgreSQL extensions.

(Optional) Create a VM based on Ubuntu 20.04 LTS with the following parameters:
- Disks and file storages → Size: Sufficient to store both archived and unarchived dumps.
  
  The recommended size is at least twice the combined size of the dump and its archive.
- Network settings:
  - Subnet: Select a subnet on the cloud network hosting the target cluster.
  - Public IP address: Select Auto or one address from a list of reserved IPs.

If you use security groups for the intermediate VM and the Managed Service for PostgreSQL cluster, configure them.

Terraform

If you do not have Terraform yet, install it.
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
Configure and initialize a provider. You do not need to create the provider configuration file manually: you can download it.
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
Download the data-restore-pgsql-mpg.tf configuration file to the same working directory.

This file describes:
- Network.
- Subnet.
- Security group and the rule required to connect to a cluster.
- Managed Service for PostgreSQL cluster with public internet access.
- (Optional) Virtual machine with public internet access.
Specify the following in the data-restore-pgsql-mpg.tf file:
- pg-extensions: List of PostgreSQL extensions in the source cluster.
- Target cluster parameters:
  - target_pgsql_version: PostgreSQL version. Must be the same or higher than in the source cluster.
  - target_db_name: Database name.
  - target_user: Username of the database owner. It must be the same as the username in the source cluster.
    
    Note
    
    You may use different usernames for the source and the target. This, however, may result in an error when restoring the dump. For more information, see Moving and restoring a PostgreSQL cluster.
  - target_password: Password of the database owner.
- (Optional) Virtual machine parameters:
  - vm_image_id: ID of the public image with Ubuntu without GPU, e.g., for Ubuntu 20.04 LTS.
  - vm_username and vm_public_key: Username and absolute path to the public key, for access to the VM. By default, the specified username is ignored in the Ubuntu 20.04 LTS image. A user with the ubuntu username is created instead. Use it to connect to the VM.
Make sure the Terraform configuration files are correct using this command:
```
terraform validate
```
Terraform will show any errors found in your configuration files.
Create the required infrastructure:
1. Run this command to view the planned changes:
```
terraform plan
```
  If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
2. If everything looks correct, apply the changes:
  1. Run this command:
```
terraform apply
```
  2. Confirm updating the resources.
  3. Wait for the operation to complete.
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.

Create a database dump

Switch the database to read-only.

Create a dump using the pg_dump utility. To speed it up, run the utility in parallel mode by passing the number of available CPU cores in the --jobs argument:

pg_dump --host=<IP_address_or_FQDN_for_master_host_of_source_cluster> \
        --port=<port> \
        --username=<user_name> \
        --jobs=<number_of_CPU_cores> \
        --format=d \
        --dbname=<DB_name> \
        --file=db_dump
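
If you start the dump from a script, the argument list can be assembled programmatically. The Python sketch below is an illustration under stated assumptions: the function name `build_pg_dump_command` is hypothetical, it uses only the options shown above, and it falls back to the CPU count of the machine running the script, which may differ from what the source cluster can handle.

```python
import os


def build_pg_dump_command(host: str, port: int, user: str, dbname: str,
                          jobs=None, output_dir: str = "db_dump") -> list:
    """Return the pg_dump argument list for a parallel directory-format dump."""
    if jobs is None:
        # os.cpu_count() may return None, so fall back to a single job.
        jobs = os.cpu_count() or 1
    return [
        "pg_dump",
        f"--host={host}",
        f"--port={port}",
        f"--username={user}",
        f"--jobs={jobs}",
        "--format=d",  # the directory format is required for parallel dumps
        f"--dbname={dbname}",
        f"--file={output_dir}",
    ]


if __name__ == "__main__":
    # Placeholder connection values; the command is printed, not executed.
    print(" ".join(build_pg_dump_command("src.example.com", 5432, "user1", "db1")))
```

To actually run the command, you could pass the returned list to `subprocess.run(..., check=True)`. Building a list rather than a single string avoids shell quoting issues in host or database names.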

(Optional) Upload the dump to a virtual machine in Yandex Cloud

Transfer your data to an intermediate VM in Compute Cloud if:

Your Managed Service for PostgreSQL cluster is not accessible from the internet.
Your hardware or connection to the cluster in Yandex Cloud is not very reliable.

The required amount of RAM and processor cores depends on the amount of data to migrate and the required migration speed.

To prepare the virtual machine to restore the dump:

In the management console, create a new VM from an Ubuntu 20.04 image on Marketplace. The VM parameters depend on the size of the database you want to migrate. The minimum configuration (1 core, 2 GB RAM, 10 GB disk space) should be sufficient to migrate a database up to 1 GB in size. The larger the database, the more RAM and storage space you need for migration (at least twice the size of the database).

The VM must be in the same network and availability zone as the PostgreSQL cluster. Additionally, the VM must be assigned a public IP address so that you can load the dump from outside Yandex Cloud.

Set up the PostgreSQL apt repository.

Install the PostgreSQL client and additional utilities for working with the DBMS:

sudo apt install postgresql-client-common && \
sudo apt install postgresql-client-<PostgreSQL_version>

Archive the dump:
```
tar -cvzf db_dump.tar.gz db_dump
```
Move the archive containing the dump to the VM, e.g., using the scp utility:
```
scp db_dump.tar.gz <VM_user_name>@<VM_public_address>:db_dump.tar.gz
```
Connect to the VM.
Unpack the archive with the dump:
```
tar -xzf db_dump.tar.gz
```

Restore data from the dump to the target cluster

Restore the database dump using the pg_restore utility.

The pg_restore version must match that of pg_dump, and its major version must be at least as high as that of the database the dump will be restored to.

That is, to restore a dump of PostgreSQL 10, PostgreSQL 11, PostgreSQL 12, PostgreSQL 13, or PostgreSQL 14 use pg_restore 10, pg_restore 11, pg_restore 12, pg_restore 13, or pg_restore 14, respectively.
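
The version rule can be checked before you start the restore. The following Python sketch is a hedged illustration: the function names are hypothetical, it assumes PostgreSQL 10 or later (where the major version is the first number), and it assumes the utilities report their version in the form `pg_restore (PostgreSQL) 14.9`.

```python
import re


def parse_major(version_output: str) -> int:
    """Extract the major version from the output of `<utility> --version`."""
    match = re.search(r"\(PostgreSQL\)\s+(\d+)", version_output)
    if match is None:
        raise ValueError(f"cannot find a PostgreSQL version in {version_output!r}")
    return int(match.group(1))


def restore_versions_ok(pg_dump_major: int, pg_restore_major: int,
                        target_db_major: int) -> bool:
    """Apply the rule from this section.

    pg_restore must match pg_dump, and its major version must be at least
    as high as that of the target database.
    """
    return pg_restore_major == pg_dump_major and pg_restore_major >= target_db_major


if __name__ == "__main__":
    dump_major = parse_major("pg_dump (PostgreSQL) 14.9")
    restore_major = parse_major("pg_restore (PostgreSQL) 14.9")
    print(restore_versions_ok(dump_major, restore_major, target_db_major=14))
```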

pg_restore --host=<IP_address_or_FQDN_for_master_host_of_target_cluster> \
           --username=<user_name> \
           --dbname=<DB_name> \
           --port=6432 \
           --format=d \
           --verbose \
           db_dump \
           --single-transaction \
           --no-privileges

If you only need to restore a single schema, add the --schema=<schema_name> parameter. Without this parameter, the command can only be run as the database owner.

If the restoration fails due to errors related to lack of required permissions for creating and updating extensions, remove the --single-transaction parameter from the command. The errors will be ignored in this case:

pg_restore: warning: errors ignored on restore: 3

Make sure the errors only apply to the extensions and check the integrity of your restored data.
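
If you capture the pg_restore output in a script, you could extract the number of ignored errors from the warning line shown above. This Python sketch assumes the message format matches that line exactly. The function name `ignored_error_count` is hypothetical, and the sketch does not decide whether the errors relate to extensions, so you still need to review the messages yourself.

```python
import re


def ignored_error_count(restore_output: str) -> int:
    """Return the number reported in the 'errors ignored on restore' warning.

    Returns 0 when the warning is absent from the output.
    """
    match = re.search(r"errors ignored on restore: (\d+)", restore_output)
    return int(match.group(1)) if match else 0


if __name__ == "__main__":
    sample = "pg_restore: warning: errors ignored on restore: 3"
    print(ignored_error_count(sample))
```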

Delete the resources you created

Delete the resources you no longer need to avoid paying for them:

Manually

- Delete the Yandex Managed Service for PostgreSQL cluster.
- If you created an intermediate virtual machine, delete it.
- If you reserved public static IP addresses, release and delete them.

Terraform

In the terminal window, go to the directory containing the infrastructure plan.

Warning

Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
Delete resources:
1. Run this command:
```
terraform destroy
```
2. Confirm deleting the resources and wait for the operation to complete.
All the resources described in the Terraform manifests will be deleted.
