Creating a logical replica of Amazon RDS for PostgreSQL in Managed Service for PostgreSQL
You can use logical replication to migrate a database from an Amazon RDS for PostgreSQL source cluster to a Managed Service for PostgreSQL target cluster.
Logical replication
Use logical replication if data migration using Yandex Data Transfer is not possible for any reason.
To migrate a database from an Amazon RDS source cluster for PostgreSQL to a Managed Service for PostgreSQL target cluster:
- Configure Amazon RDS.
- Configure the target cluster and create a subscription.
- Migrate sequences.
- Delete the subscription and transfer the load to the target cluster.
Using logical replication
-
Changes to the database schema and DDL are not replicated.
First apply the new schema changes on the subscription
side and then on the publication side. -
Sequences (
SEQUENCES
) are not replicated.When the table is replicated, the data in the
serial
columns or ID columns that is generated using sequences are also replicated. However, the sequence on the subscriber side will preserve its starting value.If you switch to the subscriber database, restart the sequence using the latest value:
ALTER SEQUENCE serial RESTART WITH <new value>;
-
When you create a subscription, data from the source tables is fully copied.
To accelerate the copy process, only create the primary key (
PRIMARY KEY
). Create all the other indexes after the copying is complete. -
If a table does not have a primary key, errors will occur during replication:
ERROR: 55000: cannot update table "<table_name>" because it does not have a replica identity and publishes updates HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
To run
UPDATE
andDELETE
replications on tables without the primary key, change theREPLICA IDENTITY
:ALTER TABLE <table_name> REPLICA IDENTITY FULL;
-
In PostgreSQL version 10, the
TRUNCATE
command is not replicated.You can use
DELETE
instead. -
External tables are not replicated.
-
If you need to recreate the subscription, clear the tables in the target cluster to prevent primary key constraint errors.
-
To view errors relating to logical replication, see the Managed Service for PostgreSQL logs.
Getting started
Create the required resources:
Create a Managed Service for PostgreSQL cluster with public host access. Note that:
- The PostgreSQL version must be the same or higher than the version in the source cluster. You cannot perform migration while downgrading PostgreSQL version.
- The name of the database must be the same as in the source cluster.
- Enable the same PostgreSQL extensions as in the source database.
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually, you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Download the logical-replica-amazon-rds-to-postgresql.tf
configuration file to the same working directory.This file describes:
- Network.
- Subnet.
- Security group and rule enabling cluster connections.
- Managed Service for PostgreSQL cluster with public internet access.
-
Specify the infrastructure parameters in the
logical-replica-amazon-rds-to-postgresql.tf
configuration file underlocals
:pg_version
: PostgreSQL version. It must be not lower than the Amazon RDS version.db_name
: Target cluster database name. It must be the same as the source database name.username
andpassword
: Database owner username and password.- Names and versions of PostgreSQL extensions used in Amazon RDS. Uncomment and multiply the
extension
section.
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console
. -
Configure Amazon RDS
Warning
The DB instance must have public access: Public accessibility = yes
.
-
Set up logical replication.
-
Set the parameter to
parameter group
of your DB instance:rds.logical_replication = 1
-
Restart the cluster to apply the changes.
-
-
Create a separate user with the role
rds_replication
. To do this, execute the following statements on behalf of the user with therds_superuser
role:CREATE ROLE <username> WITH LOGIN PASSWORD <passport>; GRANT rds_replication TO <username>;
-
Grant the
SELECT
privilege to all the replicated tables:GRANT SELECT ON <table_1>, <table_2>, ..., <table_N> TO <username>;
-
Create a publication:
CREATE PUBLICATION pub FOR TABLE <table_1>, <table_2>, ..., <table_n>;
Note
We do not recommend using
FOR ALL TABLES
publications as you will not be able to edit the table list later. -
Add a rule for incoming traffic in VPC security groups
, e.g.:protocol: tcp, port: 5432, source: 84.201.175.90/32
Where
84.201.175.90
is a public IP address.
Configure the target cluster and create a subscription
In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner (a user created together with the cluster) and users with the mdb_admin
role for the cluster.
-
(Optional) Assign the Managed Service for PostgreSQL cluster user the
mdb_admin
role. -
Create a subscription with the source cluster connection string:
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source_cluster_address> port=<port> user=<username> sslmode=prefer dbname=<DB_name>' PUBLICATION pub;
To learn more about creating subscriptions, see the PostgreSQL documentation
. -
To get the replication status, check the
pg_subscription_rel
directories.SELECT * FROM pg_subscription_rel;
The
r
value in thesrsubstate
field means that replication was completed.
Migrate sequences
To complete synchronization of the source cluster and the target cluster:
-
Switch the source cluster to
read-only
mode. -
Create a dump with sequences:
pg_dump --host=<source_cluster_address> \ --username=<username> \ --port=<port> \ --dbname=<DB_name> \ --data-only \ --table='*.*_seq' > /tmp/seq-data.sql
Pay attention to the used
*.*_seq
pattern. If the database you are migrating has sequences that do not match this pattern, enter a different pattern to export them.For more information about patterns, see the PostgreSQL documentation
. -
Restore the dump with sequences in the target cluster:
psql \ --host=<target_cluster_master_host_FQDN> \ --username=<username> \ --port=6432 \ --dbname=<DB_name> < /tmp/seq-data.sql
Delete the subscription and transfer the load to the target cluster
-
Delete the subscription in the target cluster:
DROP SUBSCRIPTION s_data_migration;
-
Transfer the load to the target cluster.