Creating a logical replica of Amazon RDS for PostgreSQL in Managed Service for PostgreSQL
You can use logical replication to migrate a database from an Amazon RDS for PostgreSQL source cluster to a Managed Service for PostgreSQL target cluster.
Logical replication
Use logical replication if data migration using Yandex Data Transfer is not possible for any reason.
To migrate a database from an Amazon RDS source cluster for PostgreSQL to a Managed Service for PostgreSQL target cluster:
- Configure Amazon RDS.
- Configure the target cluster and create a subscription.
- Migrate sequences.
- Delete the subscription and transfer the load to the target cluster.
Required paid resources
The support cost for this solution includes:
- Managed Service for PostgreSQL cluster fee: Using computing resources allocated to hosts and disk space (see Managed Service for PostgreSQL pricing).
- Fee for using public IP addresses for cluster hosts (see Virtual Private Cloud pricing).
Features of using logical replication
-
Changes to the database schema and DDL are not replicated.
First apply the new schema changes on the subscription
side and then on the publication side. -
Sequences (
SEQUENCES) are not replicated.When a table is replicated, the data in the
serialcolumns or ID columns generated by sequences is also replicated. However, the sequence on the subscriber side will preserve its starting value.If you switch to the subscriber database, restart the sequence using the latest value:
ALTER SEQUENCE serial RESTART WITH <new_value>; -
When you create a subscription, data from the source tables is fully copied.
To accelerate the copying process, create only the primary key (
PRIMARY KEY). Create all the other indexes after the copying is complete. -
If a table does not have a primary key, errors will occur during replication:
ERROR: 55000: cannot update table "<table_name>" because it does not have a replica identity and publishes updates HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.To run the
UPDATEandDELETEreplications in tables without the primary key, changeREPLICA IDENTITY:ALTER TABLE <table_name> REPLICA IDENTITY FULL; -
External tables are not replicated.
-
If you need to recreate the subscription, clear the tables in the target cluster to prevent primary key constraint errors.
-
To view errors relating to logical replication, see the Managed Service for PostgreSQL logs.
Getting started
Create the required resources:
Create a Managed Service for PostgreSQL cluster with public host access. In this case, the following applies:
- The PostgreSQL version must be the same or higher than in the source cluster. You cannot perform migration while downgrading PostgreSQL version.
- The name of the database must be the same as in the source cluster.
- Enable the same PostgreSQL extensions as in the source database.
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually, you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Download the logical-replica-amazon-rds-to-postgresql.tf
configuration file to the same working directory.This file describes:
- Network.
- Subnet.
- Security group and rule enabling cluster connections.
- Managed Service for PostgreSQL cluster with public internet access.
-
Specify the infrastructure parameters in the
logical-replica-amazon-rds-to-postgresql.tfconfiguration file underlocals:pg_version: PostgreSQL version. It must be not lower than the Amazon RDS version.db_name: Target cluster database name. It must be the same as the source database name.usernameandpassword: Database owner username and password.- Names and versions of PostgreSQL extensions used in Amazon RDS. Uncomment and multiply the
extensionsection.
-
Make sure the Terraform configuration files are correct using this command:
terraform validateTerraform will show any errors found in your configuration files.
-
Create the required infrastructure:
-
Run this command to view the planned changes:
terraform planIf you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
-
If everything looks correct, apply the changes:
-
Run this command:
terraform apply -
Confirm updating the resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console
. -
Configure Amazon RDS
Warning
The DB instance must have public access enabled: Public accessibility = yes.
-
Set up logical replication.
-
In the
parameter groupof your DB instance, set the parameter as follows:rds.logical_replication = 1 -
Restart the cluster to apply the changes.
-
-
Create a separate user with the
rds_replicationrole. To do this, execute the following on behalf of the user with therds_superuserrole:CREATE ROLE <username> WITH LOGIN PASSWORD <password>; GRANT rds_replication TO <username>; -
Grant the
SELECTprivilege for all the tables involved in the replication:GRANT SELECT ON <table_1>, <table_2>, ..., <table_n> TO <username>; -
Create a publication:
CREATE PUBLICATION pub FOR TABLE <table_1>, <table_2>, ..., <table_n>;Note
We do not recommend using the
FOR ALL TABLESpublications as you will not be able to edit the table list later. -
Add a rule for incoming traffic in VPC security groups
. Here is an example:protocol: tcp, port: 5432, source: 84.201.175.90/32Where
84.201.175.90is a public IP address.
Configure the target cluster and create a subscription
In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner (a user created together with the cluster) and users with the mdb_admin role for the cluster.
-
Optionally, assign the
mdb_adminrole to the Managed Service for PostgreSQL cluster user. -
Create a subscription with the source cluster connection string:
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source_cluster_address> port=<port> user=<username> sslmode=prefer dbname=<DB_name>' PUBLICATION pub;Tip
If you cannot create a subscription using this request, add the
sslrootcert=noneparameter to it.To learn more about creating subscriptions, see the PostgreSQL documentation
. -
To get the replication status, check the
pg_subscription_relfolders:SELECT * FROM pg_subscription_rel;rin thesrsubstatefield means that the replication is over.
Migrate sequences
To complete synchronization of the source cluster and the target cluster:
-
Switch the source cluster to
read-only
. -
Create a dump with sequences:
pg_dump --host=<source_cluster_address> \ --username=<username> \ --port=<port> \ --dbname=<DB_name> \ --data-only \ --table='*.*_seq' > /tmp/seq-data.sqlPay attention to the used pattern,
*.*_seq. If the database you are migrating has sequences that do not match this pattern, enter a different pattern to export them.For more information about patterns, see the PostgreSQL documentation
. -
Restore the dump with sequences in the target cluster:
psql \ --host=<master_host_FQDN_of_target_cluster> \ --username=<username> \ --port=6432 \ --dbname=<DB_name> < /tmp/seq-data.sql
Delete the subscription and transfer the load to the target cluster
-
Delete the subscription in the target cluster:
DROP SUBSCRIPTION s_data_migration; -
Transfer the load to the target cluster.