Creating an Amazon RDS logical replica for PostgreSQL in Managed Service for PostgreSQL
You can use logical replication to migrate a database from an Amazon RDS for PostgreSQL source cluster to a Managed Service for PostgreSQL target cluster.
Logical replication
Use logical replication if data migration via Yandex Data Transfer is not possible for any reason.
To migrate a database from an Amazon RDS for PostgreSQL source cluster to a Managed Service for PostgreSQL target cluster:
- Configure Amazon RDS.
- Configure the target cluster and create a subscription.
- Migrate sequences.
- Delete the subscription and switch the workload to the target cluster.
Required paid resources
- Managed Service for PostgreSQL cluster: computing resources allocated to hosts, storage and backup size (see Managed Service for PostgreSQL pricing).
- Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
Logical replication specifics
-
Database schema and DDL changes are not replicated.
Always apply schema changes on the subscription
side first, and only then on the publication side. -
SEQUENCESare not replicated.Data replication for a table includes
serialcolumns and identity columns generated by sequences. However, the sequence on the subscriber side will retain its original starting value.Therefore, when switching to the subscriber database, update the sequence to the latest value:
ALTER SEQUENCE serial RESTART WITH <new_value>; -
By default, when you create a subscription, the system makes a full copy of data from the source tables.
To speed up the copy process, create only the
PRIMARY KEYfirst, postponing the creation of all other indexes until the copying is complete. -
If a table has no primary key, replication will produce errors:
ERROR: 55000: cannot update table "<table_name>" because it does not have a replica identity and publishes updates HINT: To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.To enable the
UPDATEandDELETEreplication on tables with no primary key, change theREPLICA IDENTITY:ALTER TABLE <table_name> REPLICA IDENTITY FULL; -
External tables are not replicated.
-
To prevent primary key violations when recreating a subscription, clear the tables in the target cluster first.
-
Check Managed Service for PostgreSQL logs for logical replication errors.
Getting started
Create the required resources:
Create a Managed Service for PostgreSQL cluster with public access to its hosts. For this operation, the following requirements apply:
- The PostgreSQL version must be the same or higher than the version in the source cluster. Migration to an earlier PostgreSQL version is not supported.
- The database name must be the same as in the source cluster.
- Enable the same PostgreSQL extensions as in the source database.
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually, you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Download the logical-replica-amazon-rds-to-postgresql.tf
configuration file to your current working directory.This file describes:
- Network.
- Subnet.
- Security group and a rule allowing cluster connections.
- Managed Service for PostgreSQL cluster with public internet access.
-
Specify the infrastructure settings in the
logical-replica-amazon-rds-to-postgresql.tfconfiguration file underlocals:pg_version: PostgreSQL version. This version must be the same or higher than the Amazon RDS version.db_name: Database name in the target cluster. It must be the same as the source database name.usernameandpassword: Database owner username and password.- Names and versions of PostgreSQL extensions used in Amazon RDS. To list all required extensions, uncomment and copy the
extensionsection.
-
Validate your Terraform configuration files using this command:
terraform validateTerraform will display any configuration errors detected in your files.
-
Create the required infrastructure:
-
Run this command to view the planned changes:
terraform planIf you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
-
If everything looks correct, apply the changes:
-
Run this command:
terraform apply -
Confirm updating the resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console
. -
Configure Amazon RDS
Warning
The database instance must be publicly accessible: Public accessibility = yes.
-
Set up logical replication.
-
In the
parameter groupsection of your database instance, specify the following setting:rds.logical_replication = 1 -
Restart the cluster for the changes to take effect.
-
-
Create a user with the
rds_replicationrole. To do this, execute the following as the user with therds_superuserrole:CREATE ROLE <username> WITH LOGIN PASSWORD <password>; GRANT rds_replication TO <username>; -
Grant the
SELECTprivilege on all the tables included in the replication:GRANT SELECT ON <table_1>, <table_2>, ..., <table_n> TO <username>; -
Create a publication:
CREATE PUBLICATION pub FOR TABLE <table_1>, <table_2>, ..., <table_n>;Note
We do not recommend using the
FOR ALL TABLESpublications as it prevents editing the table list in the future. -
Add an ingress rule to the VPC security groups
. For example:protocol: tcp, port: 5432, source: 84.201.175.90/32Where
84.201.175.90is a public IP address.
Configure the target cluster and create a subscription
In Managed Service for PostgreSQL clusters, subscriptions can be used by the database owner, i.e., a user created alongside the cluster, and users with the mdb_admin role for that cluster.
-
Optionally, assign the
mdb_adminrole to the Managed Service for PostgreSQL cluster user. -
Create a subscription with the following source cluster’s connection string:
CREATE SUBSCRIPTION s_data_migration CONNECTION 'host=<source_cluster_address> port=<port> user=<username> sslmode=prefer dbname=<DB_name>' PUBLICATION pub;Tip
If you cannot create a subscription using this query, add
sslrootcert=noneto it.For details on creating subscriptions, see this PostgreSQL article
. -
To get the replication status, check the
pg_subscription_relfolders:SELECT * FROM pg_subscription_rel;rin thesrsubstatefield indicates that the replication is over.
Migrate sequences
To complete synchronization between the source and target clusters:
-
Set the source cluster to
read-only
mode. -
Create a database dump including sequences:
pg_dump --host=<source_cluster_address> \ --username=<username> \ --port=<port> \ --dbname=<DB_name> \ --data-only \ --table='*.*_seq' > /tmp/seq-data.sqlNote the
*.*_seqpattern we use. If your source database has sequences that do not match this pattern, use a different pattern to export them.To learn more about patterns, see this PostgreSQL article
. -
Restore the dump containing database sequences into your target cluster:
psql \ --host=<master_host_FQDN_of_target_cluster> \ --username=<username> \ --port=6432 \ --dbname=<DB_name> < /tmp/seq-data.sql
Delete the subscription and switch the workload to the target cluster.
-
Delete the subscription in the target cluster:
DROP SUBSCRIPTION s_data_migration; -
Switch the workload to the target cluster.