Reconfiguring a network connection when recreating a Yandex Data Processing cluster
You may need to recreate a cluster to install software updates, transfer the load across clusters, move clusters from one availability zone to another, and perform other operations.
The example below describes how to set up DNS to quickly switch network traffic over to new host FQDNs when recreating a Yandex Data Processing cluster. For the current name of the cluster master host, a network alias (CNAME record) is created in Yandex Cloud DNS. When you recreate the cluster, the CNAME record is changed to the master host's new name.
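To set up DNS for your Yandex Data Processing cluster:
- Prepare the infrastructure.
- Create a DNS zone and a CNAME record.
- Delete the cluster and recreate it.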
If you no longer need the resources you created, delete them.
Getting started
Prepare the infrastructure manually or using Terraform:
-
Create a network named data-proc-network with the Create subnets option disabled.
-
In data-proc-network, create a subnet with the following parameters:
- Name: data-proc-subnet-a
- Zone: ru-central1-a
- CIDR: 192.168.1.0/24
-
Create a NAT gateway and a route table named data-proc-route-table in data-proc-network. Associate the route table with the data-proc-subnet-a subnet.
-
In the data-proc-network network, create a security group named data-proc-security-group with the following rules:
-
One rule for inbound and another one for outbound service traffic:
- Port range: 0-65535
- Protocol: Any
- Source/Destination name: Security group
- Security group: Current
-
Rule for outgoing HTTPS traffic:
- Port range: 443
- Protocol: TCP
- Destination name: CIDR
- CIDR blocks: 0.0.0.0/0
-
Create a service account named data-proc-sa with the following roles:
-
Create a Yandex Object Storage bucket with restricted access.
-
Create a Yandex Data Processing cluster in any suitable configuration with the following settings:
- Service account: data-proc-sa
- Bucket ID format: List
- Bucket name: Select the bucket you created
- Network: data-proc-network
- Security groups: data-proc-security-group
-
If you do not have Terraform yet, set up and configure it.
-
Download the file with the provider settings. Place it in a separate working directory and specify the parameter values.
-
Download the data-proc-dns-connect.tf configuration file to the same working directory.
The file describes:
- Network.
- Subnet.
- DNS zone and CNAME record for the cluster master host.
- NAT gateway and route table.
- Security groups.
- Service account to work with cloud resources.
- Bucket to store job dependencies and results.
- Yandex Data Processing cluster.
-
In the data-proc-dns-connect.tf file, specify the variables:
- folder_id: Folder ID
- path_to_ssh_public_key: Path to the public SSH key
- bucket: Bucket name
-
Run the terraform init command in the working directory containing the configuration files. This command initializes the provider specified in the configuration files and enables you to use the provider resources and data sources.
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
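The validate, plan, and apply steps above can be chained in a small shell helper. This is only a sketch: the function name tf_review_and_apply and the default plan file name tfplan are made up for illustration. It relies on standard Terraform CLI behavior: terraform plan -out saves the plan to a file, and terraform apply with a plan file applies exactly that plan without asking again, so the helper asks for confirmation itself.

```shell
# Hypothetical helper (not part of the tutorial files): validate the
# configuration, save the plan to a file, ask for confirmation, and then
# apply exactly the plan that was reviewed.
tf_review_and_apply() {
    plan_file="${1:-tfplan}"   # plan file name; "tfplan" is an arbitrary default

    terraform validate || return 1
    terraform plan -out="$plan_file" || return 1

    printf 'Apply the plan saved in %s? [y/N] ' "$plan_file"
    read -r answer
    case "$answer" in
        y|Y|yes)
            # Applying a saved plan does not prompt for confirmation again.
            terraform apply "$plan_file"
            ;;
        *)
            echo "Plan not applied."
            return 2
            ;;
    esac
}
```

Run tf_review_and_apply from the working directory that contains the configuration files, after terraform init has completed.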
Create a DNS zone and a CNAME record
Create resources:
-
Create an internal DNS zone with the following settings:
- Zone: data-proc-test-user.org.
- Networks: Select data-proc-network
- Name: dp-private-zone
-
Create a DNS record of the CNAME type with the following settings:
- Name: data-proc-test-user.org.
- Data: FQDN of the Yandex Data Processing cluster master host
-
Get the FQDN of the Yandex Data Processing cluster master host.
-
In the data-proc-dns-connect.tf file, specify the variable:
- dataproc_fqdn: FQDN of the Yandex Data Processing cluster master host
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create the required infrastructure:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
-
Test network access to the cluster by the CNAME record:
dig data-proc-test-user.org.
<...>
;; ANSWER SECTION:
data-proc-test-user.org. 600 IN CNAME rc1a-dataproc-m-6ijqng07vul2mu8j.mdb.yandexcloud.net.
rc1a-dataproc-m-6ijqng07vul2mu8j.mdb.yandexcloud.net. 600 IN A 192.168.1.8
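If you want to read the alias target in a script rather than by eye, the answer section can be filtered with awk. The sketch below assumes the usual dig answer layout (name, TTL, class, type, data); the function name cname_target is made up for illustration.

```shell
# Hypothetical helper: print the target of the CNAME record for the given
# alias, reading dig output from standard input.
# The alias must be written as dig prints it, with the trailing dot.
cname_target() {
    awk -v name="$1" '$1 == name && $4 == "CNAME" { print $5 }'
}

# Usage (requires dig and access to the cluster network):
#   dig data-proc-test-user.org. | cname_target data-proc-test-user.org.
```

The printed name should be the FQDN of the current cluster master host.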
Delete the cluster and recreate it
- Delete the Yandex Data Processing cluster and create a new one with identical characteristics.
- Change the DNS record that you created earlier and specify the FQDN of the master host of the newly created cluster in the Data parameter.
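The record change can also be scripted. The sketch below is an assumption rather than part of this tutorial: it presumes that the yc CLI is installed and authenticated, and that yc dns zone replace-records accepts a record in the form "name TTL type data" (check yc dns zone replace-records --help before relying on it). The function name repoint_alias is made up, and the default TTL of 600 is taken from the dig output shown in this tutorial.

```shell
# Hypothetical helper: point the CNAME alias at the master host of the
# newly created cluster.
# Assumption: `yc dns zone replace-records` takes --name <zone name> and
# --record "<name> <ttl> <type> <data>".
repoint_alias() {
    zone_name="$1"     # DNS zone name, for example dp-private-zone
    alias_fqdn="$2"    # alias with a trailing dot, for example data-proc-test-user.org.
    master_fqdn="$3"   # FQDN of the new cluster master host
    ttl="${4:-600}"    # TTL in seconds

    yc dns zone replace-records --name "$zone_name" \
        --record "$alias_fqdn $ttl CNAME $master_fqdn"
}

# Usage (the last argument is a placeholder for the new master host FQDN):
#   repoint_alias dp-private-zone data-proc-test-user.org. <new_master_host_FQDN>
```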
-
Delete the yandex_dataproc_cluster section from the data-proc-dns-connect.tf file.
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Apply the changes:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
-
-
Add the yandex_dataproc_cluster section to the data-proc-dns-connect.tf file with the same contents as in the source file to create a new Yandex Data Processing cluster.
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Create a cluster:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
-
-
Get the FQDN of the master host of the newly created Yandex Data Processing cluster.
-
In the data-proc-dns-connect.tf file, specify the variable:
- dataproc_fqdn: FQDN of the master host of the newly created cluster
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Apply the changes:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
-
Check if you still have network access to the cluster by the CNAME record:
dig data-proc-test-user.org.
<...>
;; ANSWER SECTION:
data-proc-test-user.org. 600 IN CNAME rc1a-dataproc-m-lsqohjh53rfu659d.mdb.yandexcloud.net.
rc1a-dataproc-m-lsqohjh53rfu659d.mdb.yandexcloud.net. 600 IN A 192.168.1.8
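After the switch, the alias should point to the new master host, and that host should resolve to an address. A minimal sketch of that check, assuming the usual dig answer layout (name, TTL, class, type, data); the function name alias_points_to is made up for illustration.

```shell
# Hypothetical helper: succeed only if the dig output on standard input
# contains a CNAME record from the alias to the expected host AND an
# A record for that host.
# Both names must be written as dig prints them, with the trailing dot.
alias_points_to() {
    awk -v name="$1" -v want="$2" '
        $1 == name && $4 == "CNAME" && $5 == want { cname = 1 }
        $1 == want && $4 == "A"                   { addr = 1 }
        END { exit ((cname && addr) ? 0 : 1) }
    '
}

# Usage (requires dig and access to the cluster network; the second
# argument is a placeholder for the new master host FQDN with a trailing dot):
#   dig data-proc-test-user.org. | alias_points_to data-proc-test-user.org. <new_master_host_FQDN>. \
#       && echo "Alias points to the new master host"
```

A non-zero exit status means that the alias still points to another host or that the new host name does not resolve.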
Delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- Delete the Yandex Data Processing cluster.
- If you reserved public static IP addresses for the clusters, release and delete them.
- Delete the subnet.
- Delete the route table.
- Delete the NAT gateway.
- Delete the network.
- Delete the DNS zone.
To delete the infrastructure created with Terraform:
-
In the terminal window, go to the directory containing the infrastructure plan.
-
Delete the data-proc-dns-connect.tf configuration file.
-
Make sure the Terraform configuration files are correct using this command:
terraform validate
If there are any errors in the configuration files, Terraform will point them out.
-
Apply the changes:
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
All the resources described in the data-proc-dns-connect.tf configuration file will be deleted.