Migrating a Yandex Managed Service for PostgreSQL cluster to a different version using Yandex Data Transfer
You can migrate a loaded production database deployed in a Managed Service for PostgreSQL cluster to a higher version cluster. This tutorial covers migration from version 13 directly to version 17, without going stepwise through multiple versions (13 → 14 → 15 → 16 → 17).
To transfer data:
- Prepare the source cluster.
- Prepare the target cluster.
- Prepare and activate the transfers.
- Switch to the new cluster.
- Check the data transfer.
If you no longer need the resources you created, delete them.
Required paid resources
- PostgreSQL cluster, which includes computing resources allocated to hosts, storage and backup size (see PostgreSQL pricing).
- Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
- Each transfer, which includes the use of computing resources and the number of transferred data rows (see Data Transfer pricing).
Prepare the source cluster
-
Prepare the source database for migration as per this guide.
-
Estimate your database workload. If it exceeds 10,000 writes per second, plan several transfers.
- Identify the high-workload tables.
- Distribute the tables between several transfers.
Prepare the target cluster
-
Create a Managed Service for PostgreSQL target cluster:
ManuallyUsing TerraformCreate a Managed Service for PostgreSQL target cluster with the same configuration as the source cluster and with the following settings:
- Cluster version:
17. - Database name:
db1. - Username:
user1.
If you intend to connect to the cluster from the internet, enable public access to the cluster hosts.
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. There is no need to create a provider configuration file manually, you can download it
. -
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
In your current working directory, create a
.tffile with the following contents:resource "yandex_mdb_postgresql_cluster" "old" { } -
Write the PostgreSQL version 13 cluster ID to an environment variable:
export POSTGRESQL_CLUSTER_ID=<cluster_ID>You can get the cluster ID with the list of clusters in the folder.
-
Import the PostgreSQL version 13 cluster settings into the Terraform configuration:
terraform import yandex_mdb_postgresql_cluster.old ${POSTGRESQL_CLUSTER_ID} -
Get the imported configuration:
terraform show -
Copy it from the terminal and paste it into the
.tffile. -
Place the file in the new
imported-clusterdirectory. -
Edit the copied configuration so that you can create a new cluster from it:
- Specify the new cluster name in the
resourcestring and in thenameargument. - Under
config, setversionto17. - Delete
created_at,health,id, andstatus. - In the
hostsections, delete thefqdn,role, andpriorityarguments. - If the
disk_size_autoscalingsection hasdisk_size_limit = 0, delete this section. - If the
maintenance_windowsection containstype = "ANYTIME", delete thehourargument. - Optionally, make further changes if you need to customize the configuration.
- Specify the new cluster name in the
-
Add to the file a resource to create a user named
user1:resource "yandex_mdb_postgresql_user" "user1" { cluster_id = yandex_mdb_postgresql_cluster.<cluster_name>.id name = "user1" password = "<user_password>" }Where
<cluster_name>is the new cluster name specified in theyandex_mdb_postgresql_clusterresource. -
Add to the file a resource to create the database:
resource "yandex_mdb_postgresql_database" "db1" { cluster_id = yandex_mdb_postgresql_cluster.<cluster_name>.id name = "db1" owner = yandex_mdb_postgresql_user.user1.name depends_on = [yandex_mdb_postgresql_user.user1] }Where
<cluster_name>is the new cluster name specified in theyandex_mdb_postgresql_clusterresource. -
Get the authentication credentials in the
imported-clusterdirectory. -
In the same directory, configure and initialize the provider. Download
the provider configuration file rather than creating it manually. -
Place the configuration file in the
imported-clusterdirectory and specify the parameter values. If you have not set the authentication credentials as environment variables, specify them in the configuration file. -
Validate your Terraform configuration:
terraform validateTerraform will display any configuration errors detected in your files.
-
Create the required infrastructure:
-
Run this command to view the planned changes:
terraform planIf you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
-
If everything looks correct, apply the changes:
-
Run this command:
terraform apply -
Confirm updating the resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console
. -
Timeouts
The Terraform provider sets the following timeouts for Managed Service for PostgreSQL cluster operations:
- Creating a cluster, including restoration from a backup: 30 minutes.
- Updating a cluster: 60 minutes.
- Deleting a cluster: 15 minutes.
Operations exceeding the timeout are aborted.
How can I change these timeouts?
Add a
timeoutssection to the cluster description, e.g.:resource "yandex_mdb_postgresql_cluster" "<cluster_name>" { ... timeouts { create = "1h30m" # 1 hour 30 minutes update = "2h" # 2 hours delete = "30m" # 30 minutes } } - Cluster version:
-
If using security groups, make sure they are configured correctly and allow connections to your cluster.
-
Prepare the target database for migration as per this guide.
Prepare and activate the transfers
-
Create a source endpoint for each scheduled transfer and specify the endpoint parameters:
- Database type:
PostgreSQL. - Connection type:
Manual setup. - Installation type:
Managed Service for PostgreSQL cluster. - Managed DB cluster:
<source_cluster_name>from the drop-down list. - Database:
<source_cluster_database_name>. - Username:
<username>. - Password:
<password>. - List of included tables: For each endpoint, give a list of included tables as per your allocation plan for each transfer.
Under Schema transfer, make sure you have
After data migrationset for foreign keys and indexes. In which case your foreign keys and indexes will be transferred at the transfer deactivation stage. - Database type:
-
Create a target endpoint for each planned transfer and specify endpoint parameters:
- Database type:
PostgreSQL. - Connection type:
Manual setup. - Installation type:
Managed Service for PostgreSQL cluster. - Managed DB cluster:
db1from the drop-down list. - Username:
user1. - Password:
<password>.
- Database type:
-
Create transfers of the Snapshot and replication type that will use the created endpoints.
To speed up the copying of large tables (over 100 GB), configure parallel copying for the transfer by specifying the required numbers of workers and streams.
The table will be split into the specified number of parts that will be copied in parallel.
-
In the
imported-clusterfolder, open the Terraform configuration file describing your infrastructure. -
Add to the file a resource to create the source endpoint.
resource "yandex_datatransfer_endpoint" "<endpoint_name>" { name = "<endpoint_name>" settings { postgres_source { connection { mdb_cluster_id = "<source_cluster_ID>" } database = "<DB_name>" user = "<username>" password { raw = "<password>" } include_tables = ["<schema>.<table_1>", ... , "<schema>.<table_N>"] object_transfer_settings { fk_constraint = "AFTER_DATA" index = "AFTER_DATA" } } } }If you have scheduled multiple transfers, add a separate endpoint for each one. For each source endpoint, in the
include_tablesparameter, give a list of included tables as per your allocation plan for each transfer.The
object_transfer_settingssection specifies the schema transfer parameters. If set toAFTER_DATA, foreign keys and indexes will be migrated after the data is migrated (at the transfer deactivation stage). -
Add to the file a resource to create the target endpoint.
resource "yandex_datatransfer_endpoint" "<endpoint_name>" { name = "<endpoint_name>" settings { postgres_target { connection { mdb_cluster_id = yandex_mdb_postgresql_cluster.<cluster_name>.id } database = "db1" user = "user1" password { raw = "<password>" } } } }Where
<cluster_name>is the cluster name specified in theyandex_mdb_postgresql_clusterresource.If you have scheduled multiple transfers, add a separate endpoint for each one.
-
Add to the file a resource to create a transfer that will use your new endpoints.
resource "yandex_datatransfer_transfer" "<transfer_name>" { name = "<transfer_name>" source_id = yandex_datatransfer_endpoint.<source_endpoint_name>.id target_id = yandex_datatransfer_endpoint.<target_endpoint_name>.id type = "SNAPSHOT_AND_INCREMENT" runtime { yc_runtime { upload_shard_params { job_count = <number_of_workers> process_count = <number_of_streams> } } } }Where:
-
source_id: Source endpoint link. -
target_id: Target endpoint link. -
type: Transfer type.SNAPSHOT_AND_INCREMENT: Snapshot and replication. -
runtime.yc_runtime.upload_shard_params: Parallel copy settings. This option speeds up the copying of large tables (over 100 GB).job_count: Number of workers.process_count: Number of streams.
If you have distributed your tables between several endpoint pairs, create a separate transfer for each pair.
Your transfers will be started automatically as soon as they are created.
-
Switch to the new cluster
-
Wait for the transfer status to change to Replicating.
-
Remove the writing load from the source cluster.
-
On the transfer monitoring page, wait for the Maximum data transfer delay metric to reach zero for each transfer. This indicates that the target cluster now contains all changes made in the source cluster after the data copy completed.
-
Switch the workload to the target cluster.
-
Deactivate the transfers and wait for their status to change to Stopped.
During deactivation, foreign keys and indexes are being created. This may take a while. The larger your database, the longer the deactivation time.
Check the data transfer
-
Connect to the
db1database in the Managed Service for PostgreSQL target cluster. -
Run this query to make sure the tables have appeared in the
db1database:SELECT schemaname AS schema, tablename AS table_name FROM pg_tables WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename NOT LIKE 'pg\_%' ORDER BY schemaname, tablename;The query will return a list of all non-system tables.
Delete the resources you created
Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them:
-
In the terminal window, go to the directory containing the infrastructure plan.
Warning
Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
-
Delete resources:
-
Run this command:
terraform destroy -
Confirm deleting the resources and wait for the operation to complete.
All the resources described in the Terraform manifests will be deleted.
-