© 2025 Direct Cursus Technology L.L.C.

In this article:

  • Getting started
  • Create a cluster
  • Connect to the cluster
  • Connect to the component interfaces
  • What's next

Getting started with Yandex Data Processing

Written by
Yandex Cloud
Improved by
Danila N.
Updated on July 14, 2025

To get started:

  1. Create a cluster.
  2. Connect to the cluster.
  3. Connect to the component interfaces.

Getting started

  1. Navigate to the management console and either log in to Yandex Cloud or sign up if you do not have an account yet.

  2. If you do not have a folder yet, create one:

    1. In the management console, select the appropriate cloud from the list on the left.

    2. At the top right, click Create folder.

    3. Give your folder a name. The naming requirements are as follows:

      • It must be from 2 to 63 characters long.
      • It can only contain lowercase Latin letters, numbers, and hyphens.
      • It must start with a letter and cannot end with a hyphen.
    4. Optionally, specify the description for your folder.

    5. Select Create a default network. This creates a network with subnets in each availability zone, along with a default security group that allows all network traffic.

    6. Click Create.

  3. Assign the following roles to your Yandex Cloud account:

    • dataproc.editor: Required for cluster creation.
    • vpc.user: Required to access the cluster network.
    • iam.serviceAccounts.user: Required to attach a service account to the cluster and create resources using its permissions.

    Note

    If you are unable to manage roles, contact your cloud or organization administrator.

  4. Set up a NAT gateway in the subnet where your cluster will be deployed.

  5. If you use security groups, configure them.

  6. You can access a Yandex Data Processing cluster both from within the Yandex Cloud infrastructure and from external networks:

    • To connect from within Yandex Cloud, create a Linux VM in the cluster’s network.

    • To connect to the cluster from the internet, enable public access for subclusters during cluster creation.

    Note

    The next step requires connecting to the cluster from a Linux-based VM.

  7. Connect to your VM over SSH.
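
The folder naming requirements from step 2 can be expressed as a single regular expression, which may be handy for checking a name before you enter it in the console. The following is a minimal sketch in Python; the function name `is_valid_folder_name` is hypothetical and not part of any Yandex Cloud tool, and the check only covers the three rules listed above.

```python
import re

# Pattern derived from the naming rules in step 2:
# 2 to 63 characters; lowercase Latin letters, digits, and hyphens only;
# must start with a letter and must not end with a hyphen.
FOLDER_NAME_RE = re.compile(r"[a-z][a-z0-9-]{0,61}[a-z0-9]")


def is_valid_folder_name(name: str) -> bool:
    """Return True if `name` satisfies the listed folder naming rules."""
    return FOLDER_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_folder_name("my-folder-1")` passes, while a one-character name, a name starting with a digit, or a name ending with a hyphen is rejected.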

Create a cluster

To create a cluster:

  1. In the management console, navigate to the folder where you want to create your cluster, then select Yandex Data Processing.
  2. Click Create cluster.
  3. Specify your cluster settings and click Create cluster. For more information, see Creating clusters.
  4. When the cluster is ready for operation, its status will change to Alive. This may take some time.
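
If you script cluster creation, waiting for the Alive status can be sketched as a simple polling loop. The example below is an illustration only: `wait_until_alive` and its `get_status` argument are hypothetical names, and how you actually obtain the status string (console, CLI, or API) is left to the caller. The default timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_alive(
    get_status: Callable[[], str],
    timeout_s: float = 1800.0,
    poll_interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "Alive" or the timeout expires.

    `get_status` is a caller-supplied function (hypothetical here) that
    returns the current cluster status as a string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "Alive":
            return True
        if time.monotonic() >= deadline:
            return False  # Gave up: the cluster did not become Alive in time.
        sleep(poll_interval_s)
```

Passing the status source in as a function keeps the loop independent of any particular client library and makes it easy to test with a stub.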

Connect to the cluster

To connect to your cluster:

  1. If you are using security groups for a cloud network, configure them to enable all required traffic between the cluster and the connecting host.

  2. Copy the SSH key you specified during Yandex Data Processing cluster creation to the VM.

  3. Connect to the cluster over SSH and check that Hadoop commands run properly. Depending on your image version, specify the username:

    • For version 2.0, use ubuntu as the username.
    • For version 1.4, use root as the username.
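
The version-dependent username can be captured in a small helper that assembles the `ssh` command line. This is a sketch under stated assumptions: `build_ssh_command` is a hypothetical name, the host and key path are placeholders you supply, and only the two image versions listed above are mapped.

```python
from typing import List

# Usernames taken from the list above; other image versions are not covered.
USERNAME_BY_IMAGE_VERSION = {"2.0": "ubuntu", "1.4": "root"}


def build_ssh_command(host: str, image_version: str, key_path: str) -> List[str]:
    """Build an ssh argument list for a cluster host.

    `host` and `key_path` are placeholders provided by the caller.
    """
    try:
        user = USERNAME_BY_IMAGE_VERSION[image_version]
    except KeyError:
        raise ValueError(
            f"no known username for image version {image_version!r}"
        ) from None
    # -i selects the private key file used for authentication.
    return ["ssh", "-i", key_path, f"{user}@{host}"]
```

The returned list can be passed to `subprocess.run`, or joined with `shlex.join` to print a command you can paste into a terminal.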

For a detailed description of the Yandex Data Processing cluster connection process, refer to the Connecting to a cluster section.

Connect to the component interfaces

To connect to the Yandex Data Processing component interfaces using the web UI:

  1. Enable the UI Proxy setting in the cluster.
  2. Get a list of interface URLs.

To connect to the Yandex Data Processing component interfaces via SSH with port forwarding:

  1. Create a jumpbox VM with a public IP address in the cluster’s network, using a security group that allows incoming and outgoing traffic on all component ports.

  2. Connect to the new VM over SSH with port forwarding to the required Yandex Data Processing host ports. Depending on your image version, specify the username:

    • For version 2.0, use ubuntu as the username.
    • For version 1.4, use root as the username.
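
SSH port forwarding through the jumpbox VM can be sketched as follows. The helper uses the standard OpenSSH `-L` (local forward) and `-N` (no remote command) options. The function name `build_port_forward_command` is hypothetical, and every host, port, username, and key path is a placeholder supplied by the caller; the actual component ports depend on your cluster and are not listed here.

```python
from typing import List


def build_port_forward_command(
    jumpbox_host: str,
    jumpbox_user: str,
    target_host: str,
    target_port: int,
    local_port: int,
    key_path: str,
) -> List[str]:
    """Build an ssh argument list that forwards a local port to a cluster host.

    All arguments are caller-supplied placeholders.
    """
    for port in (local_port, target_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return [
        "ssh",
        "-i", key_path,
        "-N",  # Do not run a remote command; only forward ports.
        "-L", f"{local_port}:{target_host}:{target_port}",
        f"{jumpbox_user}@{jumpbox_host}",
    ]
```

While the resulting command is running, the component interface should be reachable at `localhost` on the chosen local port, provided the security group permits the traffic.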

The detailed process for connecting to the Yandex Data Processing cluster’s component interfaces is described in Connecting to component interfaces.

What's next

  • Read about service concepts.
  • Learn more about creating clusters and working with jobs.
