Configuring a network for Yandex Data Processing
In this tutorial, you will learn how to create a Yandex Data Processing cluster and set up subnets and a NAT gateway.
Create resources
You have to create:
- Network.
- Subnet.
- NAT gateway and route table.
- Security group for the cluster.
- Service account for the cluster.
- Bucket to store job dependencies and results.
- Yandex Data Processing cluster.
- Create a network named data-proc-network with the Create subnets option disabled.
- In data-proc-network, create a subnet with the following parameters:
  - Name: data-proc-subnet-a.
  - Zone: ru-central1-a.
  - CIDR: 192.168.1.0/24.
- Create a NAT gateway and a route table named data-proc-route-table in data-proc-network. Associate the table with the data-proc-subnet-a subnet.
- In the data-proc-network network, create a security group named data-proc-security-group with the following rules:
  - One rule for inbound and another one for outbound service traffic:
    - Port range: 0-65535.
    - Protocol: Any.
    - Source/Destination name: Security group.
    - Security group: Current.
  - Rule for outgoing HTTPS traffic:
    - Port range: 443.
    - Protocol: TCP.
    - Destination name: CIDR.
    - CIDR blocks: 0.0.0.0/0.
  - Rule that allows access to NTP servers for time syncing:
    - Port range: 123.
    - Protocol: UDP.
    - Destination name: CIDR.
    - CIDR blocks: 0.0.0.0/0.
  Note
  You can add more rules to the security group to connect to cluster hosts.
- Create a service account named data-proc-sa with the following roles:
- Create a Yandex Object Storage bucket with restricted access.
- Create a Yandex Data Processing cluster of any suitable configuration with the following settings:
  - Service account: data-proc-sa.
  - Bucket ID format: List.
  - Bucket name: Select the bucket you created earlier.
  - Network: data-proc-network.
  - Security groups: data-proc-security-group.
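For comparison, the network-related steps above (network, subnet, NAT gateway, and route table) could be expressed in Terraform roughly as follows. This is a minimal sketch, not the contents of the downloadable configuration file. It assumes the Yandex Cloud provider is already configured. The resource labels, the gateway name data-proc-nat-gateway, and the 0.0.0.0/0 default route are illustrative assumptions that the tutorial does not specify.

```hcl
# Network. Terraform creates no subnets unless they are declared explicitly.
resource "yandex_vpc_network" "data_proc_network" {
  name = "data-proc-network"
}

# NAT gateway. The name is an assumption; the tutorial does not give one.
resource "yandex_vpc_gateway" "data_proc_nat_gateway" {
  name = "data-proc-nat-gateway"
  shared_egress_gateway {}
}

# Route table with a default route through the NAT gateway.
# The 0.0.0.0/0 prefix is an assumption typical for NAT setups.
resource "yandex_vpc_route_table" "data_proc_route_table" {
  name       = "data-proc-route-table"
  network_id = yandex_vpc_network.data_proc_network.id

  static_route {
    destination_prefix = "0.0.0.0/0"
    gateway_id         = yandex_vpc_gateway.data_proc_nat_gateway.id
  }
}

# Subnet in ru-central1-a, associated with the route table.
resource "yandex_vpc_subnet" "data_proc_subnet_a" {
  name           = "data-proc-subnet-a"
  zone           = "ru-central1-a"
  network_id     = yandex_vpc_network.data_proc_network.id
  v4_cidr_blocks = ["192.168.1.0/24"]
  route_table_id = yandex_vpc_route_table.data_proc_route_table.id
}
```

Attaching the route table through route_table_id on the subnet corresponds to the "Associate the table with the subnet" step in the manual procedure.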
- If you don't have Terraform, install it and configure the Yandex Cloud provider.
- Download the file with provider settings. Place it in a separate working directory and specify the parameter values.
- Download the cluster configuration file to the same working directory.
  The file describes:
  - Network.
  - Subnet.
  - NAT gateway and route table.
  - Security group.
  - Service account to work with cloud resources.
  - Bucket to store job dependencies and results.
  - Yandex Data Processing cluster.
  Note
  You can add more rules to the security group to connect to cluster hosts.
- In the configuration file, specify all the relevant parameters.
- Run the terraform init command in the working directory with the configuration files. This command initializes the provider specified in the configuration files and enables you to use the provider resources and data sources.
- Check that the Terraform configuration files are correct using this command:
  terraform validate
  If there are any errors in the configuration files, Terraform will point them out.
- Create the required infrastructure:
  - Run the command to view planned changes:
    terraform plan
    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run this command:
      terraform apply
    - Confirm updating the resources.
    - Wait for the operation to complete.
  All the required resources will be created in the specified folder. You can check the new resources and their settings using the management console.
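To illustrate what the security group part of such a configuration might look like, the rules listed earlier in this tutorial could be written as in the sketch below. It is an assumption-laden example rather than the actual downloadable file: the resource label, the rule descriptions, and the network_id input variable are hypothetical and introduced only to keep the fragment self-contained.

```hcl
# Hypothetical input variable: ID of the data-proc-network network.
variable "network_id" {
  type        = string
  description = "ID of the network the security group belongs to"
}

resource "yandex_vpc_security_group" "data_proc_security_group" {
  name       = "data-proc-security-group"
  network_id = var.network_id

  # Service traffic between hosts of the same security group (inbound).
  ingress {
    description       = "Inbound service traffic within the group"
    protocol          = "ANY"
    from_port         = 0
    to_port           = 65535
    predefined_target = "self_security_group"
  }

  # Service traffic between hosts of the same security group (outbound).
  egress {
    description       = "Outbound service traffic within the group"
    protocol          = "ANY"
    from_port         = 0
    to_port           = 65535
    predefined_target = "self_security_group"
  }

  # Outgoing HTTPS traffic to any address.
  egress {
    description    = "Outgoing HTTPS traffic"
    protocol       = "TCP"
    port           = 443
    v4_cidr_blocks = ["0.0.0.0/0"]
  }

  # Outgoing NTP traffic for time syncing.
  egress {
    description    = "Access to NTP servers"
    protocol       = "UDP"
    port           = 123
    v4_cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Any extra rules for connecting to cluster hosts would be added as further ingress or egress blocks in the same resource.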
Delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- Delete the Yandex Data Processing cluster.
- If you reserved public static IP addresses for the cluster, release and delete them.
- Delete the subnet.
- Delete the route table.
- Delete the NAT gateway.
- Delete the network.
- In the terminal window, go to the directory containing the infrastructure plan.
  Warning
  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
- Delete resources:
  - Run this command:
    terraform destroy
  - Confirm deleting the resources and wait for the operation to complete.
  All the resources described in the Terraform manifests will be deleted.