Creating a logical replica of Amazon RDS for PostgreSQL in Managed Service for PostgreSQL
You can use logical replication to migrate a database from an Amazon RDS for PostgreSQL source cluster to a Managed Service for PostgreSQL target cluster.
Logical replication
Use logical replication if data migration using Yandex Data Transfer is not possible for any reason.
To migrate a database from an Amazon RDS source cluster for PostgreSQL to a Managed Service for PostgreSQL target cluster:
- Configure Amazon RDS.
- Configure the target cluster and create a subscription.
- Migrate sequences.
- Delete the subscription and transfer the load to the target cluster.
Features of using logical replication
-
Changes to the database schema and DDL are not replicated.
First apply the new schema changes on the subscription
side and then on the publication side. -
Sequences (
SEQUENCES
) are not replicated.When a table is replicated, the data in the
serial
columns or ID columns generated by sequences is also replicated. However, the sequence on the subscriber side will preserve its starting value.If you switch to the subscriber database, restart the sequence using the latest value:
ALTER SEQUENCE serial RESTART WITH <new_value>;
-
When you create a subscription, data from the source tables is fully copied.
To accelerate the copying process, create only the primary key (
PRIMARY KEY
). Create all the other indexes after the copying is complete. -
If a table does not have a primary key, errors will occur during replication:
ERROR: 55000: cannot update table "<table_name>" because it does not have a replica identity and publishes updates HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
To run the
UPDATE
andDELETE
replications in tables without the primary key, changeREPLICA IDENTITY
:ALTER TABLE <table_name> REPLICA IDENTITY FULL;
-
External tables are not replicated.
-
If you need to recreate the subscription, clear the tables in the target cluster to prevent primary key constraint errors.
-
To view errors relating to logical replication, see the Managed Service for PostgreSQL logs.
Getting started
Create the required resources:
Create a Managed Service for PostgreSQL cluster with public host access. In which case:
- The PostgreSQL version must be the same or higher than in the source cluster. You cannot perform migration while downgrading PostgreSQL version.
- The name of the database must be the same as in the source cluster.
- Enable the same PostgreSQL extensions as in the source database.
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually, you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Download the logical-replica-amazon-rds-to-postgresql.tf
configuration file to the same working directory.This file describes:
- Network.
- Subnet.
- Security group and rule enabling cluster connections.
- Managed Service for PostgreSQL cluster with public internet access.
-
Specify the infrastructure parameters in the
logical-replica-amazon-rds-to-postgresql.tf
configuration file underlocals
:pg_version
: PostgreSQL version. It must be not lower than the Amazon RDS version.db_name
: Target cluster database name. It must be the same as the source database name.username
andpassword
: Name and user password of the database owner.- Names and versions of PostgreSQL extensions used in Amazon RDS. Uncomment and multiply the
extension
section.
-
Check that the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console
. -
Configure Amazon RDS
Warning
The DB instance must have public access enabled: Public accessibility = yes
.
-
Set up logical replication.
-
In the
parameter group
of your DB instance, set the parameter as follows:rds.logical_replication = 1
-
Restart the cluster to apply the changes.
-
-
Create a separate user with the
rds_replication
role. To do this, execute the following on behalf of the user with therds_superuser
role:CREATE ROLE <username> WITH LOGIN PASSWORD <password>; GRANT rds_replication TO <username>;
-
Grant the
SELECT
privilege for all the tables involved in the replication:GRANT SELECT ON <table_1>, <table_2>, ..., <table_n> TO <username>;
-
Create a publication:
CREATE PUBLICATION pub FOR TABLE <table_1>, <table_2>, ..., <table_n>;
Note
We do not recommend using the
FOR ALL TABLES
publications as you will not be able to edit the table list later. -
Add a rule for incoming traffic in VPC security groups
. For example:protocol: tcp, port: 5432, source: 84.201.175.90/32
Where
84.201.175.90
is a public IP address.
Configure the target cluster and create a subscription
In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner (a user created together with the cluster) and users with the mdb_admin
or mdb_superuser
role for the cluster.
-
Optionally, assign the
mdb_admin
ormdb_superuser
role to the Managed Service for PostgreSQL cluster user. -
Create a subscription with the source cluster connection string:
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source_cluster_address> port=<port> user=<username> sslmode=prefer dbname=<DB_name>' PUBLICATION pub;
To learn more about creating subscriptions, see the PostgreSQL documentation
. -
To get the replication status, check the
pg_subscription_rel
folders.SELECT * FROM pg_subscription_rel;
r
in thesrsubstate
field means that the replication is over.
Migrate sequences
To complete synchronization of the source cluster and the target cluster:
-
Switch the source cluster to
read-only
. -
Create a dump with sequences:
pg_dump --host=<source_cluster_address> \ --username=<username> \ --port=<port> \ --dbname=<DB_name> \ --data-only \ --table='*.*_seq' > /tmp/seq-data.sql
Pay attention to the used pattern,
*.*_seq
. If the database you are migrating has sequences that do not match this pattern, enter a different pattern to export them.For more information about patterns, see the PostgreSQL documentation
. -
Restore the dump with sequences in the target cluster:
psql \ --host=<master_host_FQDN_of_target_cluster> \ --username=<username> \ --port=6432 \ --dbname=<DB_name> < /tmp/seq-data.sql
Delete the subscription and transfer the load to the target cluster
-
Delete the subscription in the target cluster:
DROP SUBSCRIPTION s_data_migration;
-
Transfer the load to the target cluster.