Migrating data from AWS RDS for PostgreSQL to Yandex Managed Service for PostgreSQL using Yandex Data Transfer
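To set up a data transfer from Amazon RDS for PostgreSQL to Managed Service for PostgreSQL:
- Prepare the test data.
- Prepare and activate the transfer.
- Test the transfer.
If you no longer need the resources you created, delete them.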
Data transfers are supported for PostgreSQL starting with version 9.4. Make sure the PostgreSQL version in Managed Service for PostgreSQL is not older than the PostgreSQL version in Amazon RDS.
Note
Using Amazon services is not part of the Yandex Cloud Terms of Use.
Getting started
Prepare the infrastructure:
To prepare the infrastructure manually:
- If you do not have an AWS account, create one.
- In Amazon RDS, create a parameter group and set the rds.logical_replication parameter to 1 in it. You can leave the other parameters at their defaults.
- Create an Amazon RDS for PostgreSQL instance (source cluster). When creating the instance, configure it as follows:
  - Enable public access for the instance.
  - In the instance's security group, add a rule that allows incoming TCP traffic from any IP address to the PostgreSQL instance port (5432 by default).
  - Assign the parameter group you created earlier to the instance.
  Note
  If you change the parameter group of an existing instance, restart the instance for the changes to take effect. The instance is unavailable while it restarts.
- Create a Managed Service for PostgreSQL target cluster in any suitable configuration with publicly available hosts and the following settings:
  - DB name: mpg_db
  - Username: mpg_user
  - Password: <target_password>
- Make sure that the Managed Service for PostgreSQL cluster's security group has been set up correctly and allows connecting to the cluster from the internet.
- Set up an egress NAT gateway for the subnet that hosts the target cluster.
- Download an AWS certificate for the region where the Amazon RDS for PostgreSQL instance resides.
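The parameter group step above can also be scripted with the AWS CLI. The following is a minimal sketch, not the tutorial's own method: the group name `rds-pg-logical` and the `postgres14` family are assumptions and must be replaced with values that match your instance version. By default the script only prints the commands; set `DRY_RUN=0` to run them.

```shell
# Hypothetical names: adjust them to your environment.
PARAM_GROUP="${PARAM_GROUP:-rds-pg-logical}"
PARAM_FAMILY="${PARAM_FAMILY:-postgres14}"  # assumed; must match the instance's major version

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_parameter_group() {
  # Create an empty parameter group for the chosen PostgreSQL family.
  run aws rds create-db-parameter-group \
    --db-parameter-group-name "$PARAM_GROUP" \
    --db-parameter-group-family "$PARAM_FAMILY" \
    --description "Logical replication for Data Transfer"
  # Enable logical replication; the change is applied on the next reboot.
  run aws rds modify-db-parameter-group \
    --db-parameter-group-name "$PARAM_GROUP" \
    --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"
}

create_parameter_group
```

Running the script as is lets you review both commands before anything is created in your AWS account.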
To prepare the infrastructure using Terraform:
- If you do not have Terraform yet, install it.
- Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
- Set up the AWS CLI. The AWS provider for Terraform uses the AWS CLI configuration to access the service.
- Configure the Terraform provider. You do not need to create the provider configuration file manually: you can download it and save it to a separate working directory.
- Edit the provider.tf file:
  - Set the parameter values for the yandex provider. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
  - Add the aws provider to the required_providers section:
    required_providers {
      ...
      aws = {
        source  = "hashicorp/aws"
        version = ">= 3.70"
      }
    }
  - Add a description of the aws provider, specifying the region where the Amazon RDS for PostgreSQL instance will reside (eu-north-1 in this example):
    provider "aws" {
      region = "eu-north-1"
    }
- Download the rds-pg-mpg.tf configuration file to the same working directory. This file describes:
  - Infrastructure required for the Amazon RDS for PostgreSQL instance to run:
    - Subnet group
    - Security group rule
    - Parameter group
    The instance will use the default network, subnets, and security group.
  - Amazon RDS for PostgreSQL instance (source cluster).
  - Infrastructure required for the Managed Service for PostgreSQL target cluster to run:
    - Network and subnet
    - Egress NAT gateway for the cluster
    - Security group
  - Managed Service for PostgreSQL target cluster.
  - Source and target endpoints.
  - Transfer.
- Download an AWS certificate for the region where the Amazon RDS for PostgreSQL instance will reside.
- Specify the following in the rds-pg-mpg.tf file:
  - PostgreSQL versions for Amazon RDS for PostgreSQL and Managed Service for PostgreSQL.
  - Parameter family for the Amazon RDS parameter group.
  - Path to the previously downloaded AWS certificate.
  - Amazon RDS for PostgreSQL and Managed Service for PostgreSQL user passwords.
- Run the terraform init command in the directory with the configuration file. This command initializes the providers specified in the configuration files and lets you use their resources and data sources.
- Make sure the Terraform configuration files are correct using this command:
  terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run the command to view the planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a verification step: no resources are changed.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
  All the required resources will be created in the specified folder. You can check that the resources exist and review their settings in the management console.
Prepare the test data
- Install the psql utility:
  sudo apt update && sudo apt install --yes postgresql-client
- Connect to the database in the Amazon RDS for PostgreSQL source cluster:
  psql "host=<host_URL> \
        port=<PostgreSQL_port> \
        sslmode=verify-full \
        sslrootcert=<path_to_certificate_file> \
        dbname=<DB_name> \
        user=<username>"
  The default PostgreSQL port is 5432.
  Note
  After you create the instance, it may take up to an hour before you can connect to it over the internet.
- Add test data to the database. As an example, we will use a simple table with information transmitted by car sensors.
  Create a table:
  CREATE TABLE measurements (
      device_id varchar(200) NOT NULL,
      datetime timestamp NOT NULL,
      latitude real NOT NULL,
      longitude real NOT NULL,
      altitude real NOT NULL,
      speed real NOT NULL,
      battery_voltage real,
      cabin_temperature real NOT NULL,
      fuel_level real,
      PRIMARY KEY (device_id)
  );
  Populate the table with data:
  INSERT INTO measurements VALUES
      ('iv9a94th6rztooxh5ur2', '2022-06-05 17:27:00', 55.70329032, 37.65472196, 427.5, 0, 23.5, 17, NULL),
      ('rhibbh3y08qmz3sdbrbu', '2022-06-06 09:49:54', 55.71294467, 37.66542005, 429.13, 55.5, NULL, 18, 32);
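Because the psql connection string above has several parts, it can help to assemble it from variables. The following is a minimal sketch: the `RDS_*` variable names, the `build_conninfo` helper, and the placeholder host and certificate path are all hypothetical and are not part of the tutorial. The script only prints the resulting psql command.

```shell
# Hypothetical variables: replace the placeholder values with your own.
RDS_HOST="${RDS_HOST:-example-host.rds.amazonaws.com}"   # placeholder host
RDS_PORT="${RDS_PORT:-5432}"                             # default PostgreSQL port
RDS_CA_CERT="${RDS_CA_CERT:-./aws-ca.pem}"               # placeholder path to the AWS certificate
RDS_DB="${RDS_DB:-postgres}"
RDS_USER="${RDS_USER:-postgres}"

# Assemble a libpq connection string that verifies the server certificate.
build_conninfo() {
  printf 'host=%s port=%s sslmode=verify-full sslrootcert=%s dbname=%s user=%s' \
    "$RDS_HOST" "$RDS_PORT" "$RDS_CA_CERT" "$RDS_DB" "$RDS_USER"
}

# Show the command instead of connecting, so the sketch runs without a database.
echo "psql \"$(build_conninfo)\""
```

To connect, run `psql "$(build_conninfo)"`. psql then prompts for the password, or reads it from the standard `PGPASSWORD` environment variable if it is set.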
Prepare and activate the transfer
To set up the transfer manually:
- Create a source endpoint of the PostgreSQL type and specify the cluster connection parameters in it:
  - Installation type: Custom installation
  - Host: Host URL
  - Port: 5432
  - CA certificate: Select the AWS certificate file.
  - Database: postgres
  - User: postgres
  - Password: <user_password>
- Create a target endpoint of the PostgreSQL type and specify the cluster connection parameters in it:
  - Installation type: Managed Service for PostgreSQL cluster
  - Managed Service for PostgreSQL cluster: <name_of_target_cluster> from the drop-down list
  - Database: mpg_db
  - User: mpg_user
  - Password: <user_password>
- Create a transfer of the Snapshot and replication type that will use the created endpoints.
- Activate the transfer and wait for its status to change to Replicating.
To set up the transfer using Terraform:
- In the rds-pg-mpg.tf file, set the transfer_enabled parameter to 1.
- Make sure the Terraform configuration files are correct using this command:
  terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run the command to view the planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a verification step: no resources are changed.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
  The transfer will be activated automatically. Wait for its status to change to Replicating.
Test the transfer
Check that the transfer works by testing the copy and replication processes.
Test the copy process
- Connect to the Managed Service for PostgreSQL target cluster database.
- Run the following query:
  SELECT * FROM measurements;
Test the replication process
- Connect to the database in the Amazon RDS for PostgreSQL source cluster:
  psql "host=<host_URL> \
        port=<PostgreSQL_port> \
        sslmode=verify-full \
        sslrootcert=<path_to_certificate_file> \
        dbname=<DB_name> \
        user=<username>"
  The default PostgreSQL port is 5432.
- Add data to the measurements table:
  INSERT INTO measurements VALUES
      ('iv7b74th678tooxdagrf', '2020-06-08 17:45:00', 53.70987913, 36.62549834, 378.0, 20.5, 5.3, 20, NULL);
- Make sure the new row has been added to the target database:
  - Connect to the Managed Service for PostgreSQL target cluster database.
  - Run the following query:
    SELECT * FROM measurements;
  Note
  It may take a few minutes to replicate the data.
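Since replication may take a few minutes, the check above can be repeated until the expected result appears. The following is a minimal sketch of such a polling loop: the `wait_for_value` helper and its arguments are hypothetical, and the psql call appears only as a commented usage example that assumes a `TARGET_CONNINFO` variable holding the target connection string.

```shell
# Poll a command until its output equals the expected value.
# Usage: wait_for_value <expected> <attempts> <delay_seconds> <command> [args...]
wait_for_value() {
  expected="$1"
  attempts="$2"
  delay="$3"
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Run the command; treat a failed run as an empty result and retry.
    actual="$("$@" 2>/dev/null || true)"
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (assumes TARGET_CONNINFO is set): wait up to 5 minutes for 3 rows.
# wait_for_value 3 30 10 psql "$TARGET_CONNINFO" -At -c "SELECT count(*) FROM measurements;"
```

The `-A` and `-t` psql options print the bare value without alignment or headers, which keeps the string comparison simple.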
Delete the resources you created
Note
Before deleting the created resources, deactivate the transfer.
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
If you created your resources using Terraform:
- In the terminal window, go to the directory containing the infrastructure plan.
- Delete the rds-pg-mpg.tf configuration file.
- Make sure the Terraform configuration files are correct using this command:
  terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
- Apply the changes:
  - Run the command to view the planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a verification step: no resources are changed.
  - If you are happy with the planned changes, apply them:
    - Run the command:
      terraform apply
    - Confirm the update of resources.
    - Wait for the operation to complete.
  All the resources described in the rds-pg-mpg.tf configuration file will be deleted.