© 2025 Direct Cursus Technology L.L.C.
Configuring a network for Yandex Data Processing

Written by Yandex Cloud
Updated on April 28, 2025

In this tutorial, you will learn how to create a Yandex Data Processing cluster and set up subnets and a NAT gateway.

Required paid resources

The cost of supporting the infrastructure described here includes:

  • Yandex Data Processing cluster fee: using VM computing resources and Compute Cloud network disks, and Cloud Logging for log management (see Yandex Data Processing pricing).
  • NAT gateway fee (see Virtual Private Cloud pricing).
  • Object Storage bucket fee: storing data and performing operations with it (see Object Storage pricing).

Create resources

You can create the resources manually or with Terraform.

Manually
  1. Create a network named data-proc-network with the Create subnets option disabled.

  2. In data-proc-network, create a subnet with the following parameters:

    • Name: data-proc-subnet-a
    • Zone: ru-central1-a
    • CIDR: 192.168.1.0/24
  3. Create a NAT gateway and a route table named data-proc-route-table in data-proc-network. Associate the table with the data-proc-subnet-a subnet.

  4. In the data-proc-network network, create a security group named data-proc-security-group with the following rules:

    • One rule for inbound and another one for outbound service traffic:

      • Port range: 0-65535
      • Protocol: Any
      • Source/Destination name: Security group
      • Security group: Current
    • Rule for outgoing HTTPS traffic:

      • Port range: 443
      • Protocol: TCP
      • Destination name: CIDR
      • CIDR blocks: 0.0.0.0/0
    • Rule that allows access to NTP servers for time syncing:

      • Port range: 123
      • Protocol: UDP
      • Destination name: CIDR
      • CIDR blocks: 0.0.0.0/0

    Note

    You can add additional rules to a security group to connect to cluster hosts.

  5. Create a service account named data-proc-sa with the following roles:

    • dataproc.agent
    • dataproc.provisioner
    • storage.uploader
    • storage.viewer
  6. Create a Yandex Object Storage bucket with restricted access.

  7. Create a Yandex Data Processing cluster of any suitable configuration with the following settings:

    • Service account: data-proc-sa.
    • Bucket ID format: List.
    • Bucket name: Select the bucket you created earlier.
    • Network: data-proc-network.
    • Security groups: data-proc-security-group.
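
For comparison, steps 1-4 above map onto Terraform resources roughly as follows. This is a minimal sketch, not the downloadable configuration file: the resource labels (for example, data_proc_nat) and the gateway name data-proc-nat are assumptions made for illustration, while the network, subnet, route table, and security group names and values are those given in the steps.

```hcl
# Network without automatically created subnets.
resource "yandex_vpc_network" "data_proc_network" {
  name = "data-proc-network"
}

# NAT gateway; the name "data-proc-nat" is an assumption.
resource "yandex_vpc_gateway" "data_proc_nat" {
  name = "data-proc-nat"
  shared_egress_gateway {}
}

# Route table that sends all outbound traffic through the NAT gateway.
resource "yandex_vpc_route_table" "data_proc_route_table" {
  name       = "data-proc-route-table"
  network_id = yandex_vpc_network.data_proc_network.id

  static_route {
    destination_prefix = "0.0.0.0/0"
    gateway_id         = yandex_vpc_gateway.data_proc_nat.id
  }
}

# Subnet associated with the route table.
resource "yandex_vpc_subnet" "data_proc_subnet_a" {
  name           = "data-proc-subnet-a"
  zone           = "ru-central1-a"
  network_id     = yandex_vpc_network.data_proc_network.id
  v4_cidr_blocks = ["192.168.1.0/24"]
  route_table_id = yandex_vpc_route_table.data_proc_route_table.id
}

resource "yandex_vpc_security_group" "data_proc_security_group" {
  name       = "data-proc-security-group"
  network_id = yandex_vpc_network.data_proc_network.id

  # Inbound service traffic within the security group itself.
  ingress {
    protocol          = "ANY"
    from_port         = 0
    to_port           = 65535
    predefined_target = "self_security_group"
  }

  # Outbound service traffic within the security group itself.
  egress {
    protocol          = "ANY"
    from_port         = 0
    to_port           = 65535
    predefined_target = "self_security_group"
  }

  # Outgoing HTTPS traffic.
  egress {
    protocol       = "TCP"
    port           = 443
    v4_cidr_blocks = ["0.0.0.0/0"]
  }

  # Access to NTP servers for time syncing.
  egress {
    protocol       = "UDP"
    port           = 123
    v4_cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Attaching the route table through route_table_id on the subnet is what routes the cluster hosts' outbound traffic via the NAT gateway.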
Terraform

  1. If you do not have Terraform yet, install it and configure its Yandex Cloud provider.

  2. Download the file with provider settings. Place it in a separate working directory and specify the parameter values.

  3. Download the cluster configuration file to the same working directory.

    This file describes:

    • Network.
    • Subnet.
    • NAT gateway and route table.
    • Security group.
    • Service account to work with cluster resources.
    • Service account for bucket management.
    • Static access key required to grant the service account the required permissions for the bucket.
    • Bucket to store job dependencies and results.
    • Yandex Data Processing cluster.

    Note

    You can add additional rules to a security group to connect to cluster hosts.

  4. In the configuration file, specify all the relevant parameters.

  5. Run the terraform init command in the working directory with the configuration files. This command initializes the provider specified in the configuration files and enables you to use the provider resources and data sources.

  6. Make sure the Terraform configuration files are correct using this command:

    terraform validate
    

    If there are any errors in the configuration files, Terraform will point them out.

  7. Create the required infrastructure:

    1. Run the command to view the planned changes:

      terraform plan
      

      If the resource configurations are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.

    2. If you are happy with the planned changes, apply them:

      1. Run this command:

        terraform apply
        
      2. Confirm updating the resources.

      3. Wait for the operation to complete.

All the resources you need will be created in the specified folder. You can check the new resources and their settings using the management console.
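
As an illustration of what the provider settings and the service account part of the configuration might contain, here is a minimal sketch. It is not the downloadable file: the variable name folder_id and the resource labels are hypothetical, and provider credentials are assumed to be supplied separately. The role list is the one given for data-proc-sa in the manual steps.

```hcl
terraform {
  required_providers {
    yandex = {
      source = "yandex-cloud/yandex"
    }
  }
}

# Hypothetical input variable: ID of the folder to create resources in.
variable "folder_id" {
  type        = string
  description = "ID of the folder where the resources are created"
}

# Credentials are assumed to be configured outside this file.
provider "yandex" {
  folder_id = var.folder_id
  zone      = "ru-central1-a"
}

# Service account used by the cluster.
resource "yandex_iam_service_account" "data_proc_sa" {
  name      = "data-proc-sa"
  folder_id = var.folder_id
}

# One folder-level role binding per role required by the cluster.
resource "yandex_resourcemanager_folder_iam_member" "data_proc_sa_roles" {
  for_each = toset([
    "dataproc.agent",
    "dataproc.provisioner",
    "storage.uploader",
    "storage.viewer",
  ])

  folder_id = var.folder_id
  role      = each.key
  member    = "serviceAccount:${yandex_iam_service_account.data_proc_sa.id}"
}
```

Using for_each over the role set keeps the bindings in one resource block, so adding or removing a role is a one-line change.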

Delete the resources you created

Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:

You can delete the resources manually or with Terraform.

Manually
  1. Delete the Yandex Data Processing cluster.
  2. If you reserved public static IP addresses for the cluster, release and delete them.
  3. Delete the subnet.
  4. Delete the route table.
  5. Delete the NAT gateway.
  6. Delete the network.
Terraform

  1. In the terminal window, go to the directory containing the infrastructure plan.

    Warning

    Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.

  2. Delete resources:

    1. Run this command:

      terraform destroy
      
    2. Confirm deleting the resources and wait for the operation to complete.

    All the resources described in the Terraform manifests will be deleted.
