Migrating data from AWS RDS for PostgreSQL to Yandex Managed Service for PostgreSQL
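To set up a data transfer from Amazon RDS for PostgreSQL to Managed Service for PostgreSQL:
- Prepare your test data.
- Set up and activate the transfer.
- Test your transfer.
If you no longer need the resources you created, delete them.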
Data transfers are supported for PostgreSQL starting with version 9.4. Make sure the PostgreSQL version in Managed Service for PostgreSQL is not older than the PostgreSQL version in Amazon RDS.
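The version requirement above can be checked with a small shell helper. This is a minimal sketch, not part of the official procedure: the function names version_ge and can_transfer are hypothetical, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when the source version ($1) is at least 9.4
# and the target version ($2) is not older than the source.
can_transfer() {
  version_ge "$1" 9.4 && version_ge "$2" "$1"
}
```

For example, `can_transfer 13 14` succeeds, while `can_transfer 14 13` fails because the target would be older than the source.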
Note
Use of Amazon services is not part of the Yandex Cloud Terms of Use.
Required paid resources
- Managed Service for PostgreSQL cluster: Computing resources allocated to hosts, storage and backup size (see Managed Service for PostgreSQL pricing).
- Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
- Each transfer: Use of computing resources and number of transferred data rows (see Data Transfer pricing).
- NAT gateway: Hourly use and outgoing traffic (see Virtual Private Cloud pricing).
Getting started
Set up the infrastructure:
Manually
- If you do not have an AWS account, create one.
- In Amazon RDS, create a parameter group and set its rds.logical_replication parameter to 1. You can leave the other parameters at their defaults.
- Create an Amazon RDS for PostgreSQL instance (source cluster). When creating the instance, configure it as follows:
  - Enable public access for the instance.
  - In the instance's security group, add a rule that allows incoming TCP traffic from any IP address to the PostgreSQL instance port (5432 by default).
  - Assign the instance the parameter group you created earlier.
  Note
  If you changed the parameter group of an existing instance, restart the instance for the changes to take effect. The instance will be unavailable while it restarts.
- Create a Managed Service for PostgreSQL target cluster of any suitable configuration with publicly available hosts and the following settings:
  - DB name: mpg_db.
  - Username: mpg_user.
  - Password: <target_password>.
- Make sure the security group of the Managed Service for PostgreSQL cluster is set up correctly and allows connections to the cluster from the internet.
- Set up an egress NAT gateway for the subnet that hosts the target cluster.
- Download an AWS certificate for the region where the Amazon RDS for PostgreSQL instance resides.
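The parameter group step can also be scripted with the AWS CLI instead of the console. The sketch below is an illustration under assumptions: the function name, the group name, and the postgres14 family in the usage comment are hypothetical, and the AWS variable exists only so that the binary can be swapped out for a dry run. Because rds.logical_replication is a static parameter, the change is submitted with the pending-reboot apply method, which is why an existing instance must be restarted.

```shell
# Command used to call the AWS CLI; override it (e.g. AWS=echo) for a dry run.
AWS="${AWS:-aws}"

# Create a parameter group and enable logical replication in it.
# Arguments: parameter group name, parameter group family.
create_replication_parameter_group() {
  "$AWS" rds create-db-parameter-group \
    --db-parameter-group-name "$1" \
    --db-parameter-group-family "$2" \
    --description "Logical replication for Data Transfer" &&
  "$AWS" rds modify-db-parameter-group \
    --db-parameter-group-name "$1" \
    --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"
}

# Usage (hypothetical names):
#   create_replication_parameter_group mpg-source-params postgres14
```

Choose the family that matches the PostgreSQL version of your Amazon RDS instance.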
Terraform
- If you do not have Terraform yet, install it.
- Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
- Set up the AWS CLI. The AWS provider for Terraform uses the AWS CLI configuration to access the service.
- Configure the Terraform provider. You do not need to create the provider configuration file manually: you can download it and save it to a separate working directory.
- Edit the provider.tf file:
  - Specify the parameter values for the yandex provider. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
  - Add the aws provider to required_providers:

        required_providers {
          ...
          aws = {
            source  = "hashicorp/aws"
            version = ">= 3.70"
          }
        }

  - Add a description for the aws provider. In its parameters, specify the region where the Amazon RDS for PostgreSQL instance will reside (eu-north-1 in this example):

        provider "aws" {
          region = "eu-north-1"
        }
- Download the rds-pg-mpg.tf configuration file to the same working directory. This file describes:
  - Infrastructure required for the Amazon RDS for PostgreSQL instance to run:
    - Subnet group
    - Security group rule
    - Parameter group
    The instance will use the default network, subnets, and security group.
  - Amazon RDS for PostgreSQL instance (source cluster).
  - Infrastructure required for the Managed Service for PostgreSQL target cluster to run:
    - Network and subnet
    - Egress NAT gateway for the cluster
    - Security group
  - Managed Service for PostgreSQL target cluster.
  - Source and target endpoints.
  - Transfer.
- Download an AWS certificate for the region where the Amazon RDS for PostgreSQL instance will reside.
- In the rds-pg-mpg.tf file, specify the following:
  - PostgreSQL versions for Amazon RDS for PostgreSQL and Managed Service for PostgreSQL.
  - Parameter family for the Amazon RDS parameter group.
  - Path to the previously downloaded AWS certificate.
  - Amazon RDS for PostgreSQL and Managed Service for PostgreSQL user passwords.
- Run the terraform init command in the directory with the configuration file. This command initializes the providers specified in the configuration files and enables you to use their resources and data sources.
- Validate your Terraform configuration files using this command:

      terraform validate

  Terraform will display any configuration errors detected in your files.
- Create the required infrastructure:
  - Run this command to view the planned changes:

        terraform plan

    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:

          terraform apply

    - Confirm updating the resources.
    - Wait for the operation to complete.
  All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
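The init, validate, plan, and apply steps above can be wrapped in two small shell functions so that the plan you review is exactly the plan that gets applied. This is a sketch under assumptions: the function names and the tfplan file name are hypothetical, and the TERRAFORM variable exists only so that the binary can be replaced for a dry run.

```shell
# Command used to call Terraform; override it (e.g. TERRAFORM=echo) for a dry run.
TERRAFORM="${TERRAFORM:-terraform}"

# Initialize the working directory, validate the configuration,
# and save the planned changes to a file for review.
tf_plan() {
  "$TERRAFORM" init -input=false &&
  "$TERRAFORM" validate &&
  "$TERRAFORM" plan -input=false -out=tfplan
}

# Apply the previously saved plan. Terraform does not ask for
# confirmation when it applies a saved plan file.
tf_apply() {
  "$TERRAFORM" apply -input=false tfplan
}
```

Run tf_plan, review the listed changes, and only then run tf_apply. Because a saved plan is applied without a confirmation prompt, the review between the two calls takes the place of the interactive confirmation.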
Prepare your test data
- Install psql:

      sudo apt update && sudo apt install --yes postgresql-client

- Connect to the database in the Amazon RDS for PostgreSQL source cluster:

      psql "host=<host_URL> \
            port=<PostgreSQL_port> \
            sslmode=verify-full \
            sslrootcert=<path_to_certificate_file> \
            dbname=<DB_name> \
            user=<username>"

  The default PostgreSQL port is 5432.
  Note
  It may take up to an hour after the instance is created before it becomes available for connections over the internet.
- Populate the database with test data. In this example, we will use a simple table with car sensor information.
  Create a table:

      CREATE TABLE measurements (
          device_id varchar(200) NOT NULL,
          datetime timestamp NOT NULL,
          latitude real NOT NULL,
          longitude real NOT NULL,
          altitude real NOT NULL,
          speed real NOT NULL,
          battery_voltage real,
          cabin_temperature real NOT NULL,
          fuel_level real,
          PRIMARY KEY (device_id)
      );

  Populate the table with data:

      INSERT INTO measurements VALUES
          ('iv9a94th6rztooxh5ur2', '2022-06-05 17:27:00', 55.70329032, 37.65472196, 427.5, 0, 23.5, 17, NULL),
          ('rhibbh3y08qmz3sdbrbu', '2022-06-06 09:49:54', 55.71294467, 37.66542005, 429.13, 55.5, NULL, 18, 32);
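The same connection string is needed again when testing replication, so it can be convenient to build it in one place. The helper below is a minimal sketch: the function name pg_conninfo and the values in the usage comment are hypothetical, and it assumes that none of the values contain spaces or quotes, which would need extra escaping in a libpq connection string.

```shell
# Print a libpq connection string that enforces certificate verification.
# Arguments: host, port, database name, user name, path to the CA certificate file.
# Assumption: the values contain no spaces or quotes.
pg_conninfo() {
  printf 'host=%s port=%s sslmode=verify-full sslrootcert=%s dbname=%s user=%s' \
    "$1" "$2" "$5" "$3" "$4"
}

# Usage (hypothetical values):
#   psql "$(pg_conninfo "$RDS_HOST" 5432 postgres postgres ./aws-ca.pem)"
```

The password is deliberately left out of the string. psql prompts for it, or reads it from the PGPASSWORD environment variable or a ~/.pgpass file.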
Set up and activate the transfer
Manually
- Create a PostgreSQL-type source endpoint and configure it using the following settings:
  - Installation type: Custom installation.
  - Host: Host URL of the Amazon RDS for PostgreSQL instance.
  - Port: 5432.
  - CA certificate: Select the AWS certificate file.
  - Database: postgres.
  - User: postgres.
  - Password: <user_password>.
- Create a PostgreSQL-type target endpoint with these cluster connection settings:
  - Installation type: Managed Service for PostgreSQL cluster.
  - Managed Service for PostgreSQL cluster: <target_cluster_name> from the drop-down list.
  - Database: mpg_db.
  - User: mpg_user.
  - Password: <user_password>.
- Create a Snapshot and replication-type transfer configured to use the new endpoints.
- Activate the transfer and wait for its status to change to Replicating.
Terraform
- In the rds-pg-mpg.tf file, set transfer_enabled to 1.
- Validate your Terraform configuration files using this command:

      terraform validate

  Terraform will display any configuration errors detected in your files.
- Create the required infrastructure:
  - Run this command to view the planned changes:

        terraform plan

    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
  - If everything looks correct, apply the changes:
    - Run this command:

          terraform apply

    - Confirm updating the resources.
    - Wait for the operation to complete.
- The transfer will activate automatically upon creation. Wait for its status to change to Replicating.
Test your transfer
Make sure the transfer works correctly by testing copying and replication.
Test copying
- Connect to the Managed Service for PostgreSQL target cluster database.
- Run this query:

      SELECT * FROM measurements;

  If copying succeeded, the result contains the two rows you added to the source table.
Test replication
- Connect to the database in the Amazon RDS for PostgreSQL source cluster:

      psql "host=<host_URL> \
            port=<PostgreSQL_port> \
            sslmode=verify-full \
            sslrootcert=<path_to_certificate_file> \
            dbname=<DB_name> \
            user=<username>"

  The default PostgreSQL port is 5432.
- Populate the measurements table with data:

      INSERT INTO measurements VALUES
          ('iv7b74th678tooxdagrf', '2020-06-08 17:45:00', 53.70987913, 36.62549834, 378.0, 20.5, 5.3, 20, NULL);

- Check that the added row appears in the target database:
  - Connect to the Managed Service for PostgreSQL target cluster database.
  - Run this query:

        SELECT * FROM measurements;

  Note
  It may take a few minutes to replicate the data.
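Because replication can take a few minutes, the check above may need to be repeated. The sketch below polls the target until the new row is visible. It is an illustration only: the function names retry and row_replicated and the TARGET_CONNINFO variable (a connection string for the target cluster) are hypothetical, and the device ID is inserted into the query without escaping, which is acceptable only for trusted test values.

```shell
# Run a command until it succeeds or the attempt limit is reached.
# Arguments: maximum number of attempts, delay in seconds, then the command.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Succeeds when exactly one row with the given device_id exists in the target.
# Assumption: TARGET_CONNINFO holds the target cluster connection string.
row_replicated() {
  [ "$(psql "$TARGET_CONNINFO" -At -c \
    "SELECT count(*) FROM measurements WHERE device_id = '$1'")" = "1" ]
}
```

For example, `retry 30 10 row_replicated iv7b74th678tooxdagrf` checks up to 30 times at 10-second intervals and returns a non-zero status if the row never appears.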
Delete the resources you created
Note
Before deleting the resources, deactivate the transfer.
To reduce the consumption of resources you do not need, delete them:
- In the terminal window, go to the directory containing the infrastructure plan.
  Warning
  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
- Delete resources:
  - Run this command:

        terraform destroy

  - Confirm deleting the resources and wait for the operation to complete.
  All the resources described in the Terraform manifests will be deleted.