Configuring a network for Yandex Data Processing
In this tutorial, you will learn how to create a Yandex Data Processing cluster and set up subnets and a NAT gateway.
Create resources
You have to create:
- Network.
- Subnet.
- NAT gateway and routing table.
- Security group for the cluster.
- Service account for the cluster.
- Bucket to store job dependencies and results.
- Yandex Data Processing cluster.
- Create a network named data-proc-network with the Create subnets option disabled.
- In data-proc-network, create a subnet with the following parameters:
  - Name: data-proc-subnet-a
  - Zone: ru-central1-a
  - CIDR: 192.168.1.0/24
- Create a NAT gateway and a routing table named data-proc-route-table in data-proc-network. Associate the routing table with the data-proc-subnet-a subnet.
- In the data-proc-network network, create a security group named data-proc-security-group with the following rules:
  - One rule for inbound and another for outbound service traffic:
    - Port range: 0-65535
    - Protocol: Any
    - Source/Destination name: Security group
    - Security group: Current
  - Rule for outgoing HTTPS traffic:
    - Port range: 443
    - Protocol: TCP
    - Destination name: CIDR
    - CIDR blocks: 0.0.0.0/0
  - Rule that allows access to NTP servers for time syncing:
    - Port range: 123
    - Protocol: UDP
    - Destination name: CIDR
    - CIDR blocks: 0.0.0.0/0

  Note: You can add more rules to the security group to connect to cluster hosts.
- Create a service account named data-proc-sa with the following roles:
- Create a Yandex Object Storage bucket with restricted access.
- Create a Yandex Data Processing cluster with any suitable configuration and the following settings:
  - Service account: data-proc-sa
  - Bucket ID format: List
  - Bucket name: Select the created bucket.
  - Network: data-proc-network
  - Security groups: data-proc-security-group
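The network parameters above can be sanity-checked offline before anything is created. The following is a minimal Python sketch, not part of the original tutorial: it uses the standard `ipaddress` module to inspect the subnet CIDR and models the two CIDR-based egress rules (HTTPS and NTP) as plain dictionaries. The dictionary field names and the helper functions are hypothetical and do not mirror any Yandex Cloud API; the rule that allows traffic within the security group itself is left out for brevity.

```python
import ipaddress

# Subnet CIDR taken from the steps above.
SUBNET_CIDR = "192.168.1.0/24"

# Hypothetical in-memory model of the CIDR-based egress rules.
# Field names are illustrative only, not a Yandex Cloud API schema.
EGRESS_RULES = [
    {"protocol": "TCP", "port": 443, "cidr": "0.0.0.0/0"},  # outgoing HTTPS
    {"protocol": "UDP", "port": 123, "cidr": "0.0.0.0/0"},  # NTP time sync
]


def subnet_contains(cidr: str, address: str) -> bool:
    """Return True if `address` falls inside the `cidr` range."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


def egress_allowed(protocol: str, port: int, destination: str) -> bool:
    """Return True if some rule matches protocol, port, and destination."""
    dest = ipaddress.ip_address(destination)
    return any(
        rule["protocol"] == protocol.upper()
        and rule["port"] == port
        and dest in ipaddress.ip_network(rule["cidr"])
        for rule in EGRESS_RULES
    )


if __name__ == "__main__":
    print(ipaddress.ip_network(SUBNET_CIDR).num_addresses)  # 256
    print(subnet_contains(SUBNET_CIDR, "192.168.1.10"))     # True
    print(egress_allowed("tcp", 443, "203.0.113.5"))        # True
    print(egress_allowed("tcp", 80, "203.0.113.5"))         # False
```

Under this model, plain HTTP on port 80 is rejected because only ports 443 (TCP) and 123 (UDP) are opened for outbound traffic to arbitrary addresses.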
Alternatively, to create the infrastructure with Terraform:
- If you don't have Terraform, install it and configure the Yandex Cloud provider.
- Download the file with the provider settings. Place it in a separate working directory and specify the parameter values.
- Download the cluster configuration file to the same working directory.

  The file describes:
  - Network.
  - Subnet.
  - NAT gateway and route table.
  - Security group.
  - Service account to work with cloud resources.
  - Bucket to store job dependencies and results.
  - Yandex Data Processing cluster.

  Note: You can add more rules to the security group to connect to cluster hosts.
- In the configuration file, specify all the relevant parameters.
- Run the terraform init command in the working directory hosting the configuration files. This command initializes the provider specified in the configuration files and enables you to use the provider resources and data sources.
- Make sure the Terraform configuration files are correct using this command:

  terraform validate

  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run the command to view planned changes:

    terraform plan

    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run this command:

      terraform apply

    - Confirm updating the resources.
    - Wait for the operation to complete.

All the required resources will be created in the specified folder. You can check the new resources and their configuration using the management console.
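The Terraform steps above always run in the same order: init, validate, plan, apply. As an illustration only, the sketch below wraps that sequence in a small Python helper. The function name `run_workflow` and the `runner` parameter are hypothetical; only the four Terraform commands come from the tutorial. With the default runner, `terraform apply` still prompts for confirmation in the terminal, so the manual confirmation step is preserved.

```python
import subprocess
from typing import Callable, List

# Commands in the order given in the steps above.
WORKFLOW: List[List[str]] = [
    ["terraform", "init"],
    ["terraform", "validate"],
    ["terraform", "plan"],
    ["terraform", "apply"],
]


def run_workflow(
    workdir: str,
    runner: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Run each command in `workdir`, stopping at the first failure.

    The default runner is subprocess.run; with check=True it raises
    CalledProcessError when a command exits with a non-zero status,
    so later commands are not executed after an error.
    Returns the list of commands that were executed.
    """
    executed: List[List[str]] = []
    for command in WORKFLOW:
        runner(command, cwd=workdir, check=True)
        executed.append(command)
    return executed
```

A call such as `run_workflow("path/to/working-directory")` would run the four commands in the directory that holds the configuration files. The `runner` parameter exists so the sequence can be exercised without Terraform installed, for example by passing a function that only records its arguments.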
Delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- Delete the Yandex Data Processing cluster.
- If you reserved public static IP addresses for the cluster, release and delete them.
- Delete the subnet.
- Delete the route table.
- Delete the NAT gateway.
- Delete the network.
To delete the infrastructure created with Terraform:
- In the terminal window, go to the directory containing the infrastructure plan.
- Delete the data-proc-nat-gateway.tf configuration file.
- Make sure the Terraform configuration files are correct using this command:

  terraform validate

  If there are errors in the configuration files, Terraform will point them out.
- Review and apply the resource changes:
  - Run the command to view planned changes:

    terraform plan

    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run this command:

      terraform apply

    - Confirm updating the resources.
    - Wait for the operation to complete.

All the resources described in the configuration file will be deleted.
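When reviewing the plan before a deletion, it can help to confirm that Terraform intends only to delete resources. One possible approach, sketched here under assumptions, is to save the plan with `terraform plan -out=<file>`, export it with `terraform show -json <file>`, and count the planned actions in the `resource_changes` array of that JSON. The helper names and the resource addresses in the sample are hypothetical and are not taken from the tutorial's configuration file.

```python
import json
from collections import Counter
from typing import Dict


def summarize_plan(plan_json: str) -> Dict[str, int]:
    """Count planned actions (create, update, delete, no-op, ...)."""
    plan = json.loads(plan_json)
    counts: Counter = Counter()
    for change in plan.get("resource_changes", []):
        for action in change["change"]["actions"]:
            counts[action] += 1
    return dict(counts)


def is_delete_only(plan_json: str) -> bool:
    """Return True if the plan deletes resources and changes nothing else."""
    summary = summarize_plan(plan_json)
    summary.pop("no-op", None)  # unchanged resources are irrelevant here
    return bool(summary) and set(summary) == {"delete"}


if __name__ == "__main__":
    # Hypothetical sample; real addresses depend on the configuration file.
    sample = json.dumps({
        "resource_changes": [
            {"address": "yandex_vpc_network.example",
             "change": {"actions": ["delete"]}},
            {"address": "yandex_vpc_subnet.example",
             "change": {"actions": ["delete"]}},
        ]
    })
    print(summarize_plan(sample))   # {'delete': 2}
    print(is_delete_only(sample))   # True
```

A resource that Terraform plans to replace shows up with both a delete and a create action, so `is_delete_only` returns False for such a plan.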