© 2025 Direct Cursus Technology L.L.C.
Yandex Data Processing

In this article:

  • Getting started
  • Create a cluster
  • Connect to the cluster
  • Connect to component interfaces
  • What's next

Getting started with Yandex Data Processing

Written by Yandex Cloud. Improved by Danila N.
Updated on April 22, 2025

To get started with the service:

  1. Create a cluster.
  2. Connect to the cluster.
  3. Connect to component interfaces.

Getting started

  1. Go to the management console and log in to Yandex Cloud, or sign up if you do not have an account yet.

  2. If you do not have a folder yet, create one:

    1. In the management console, select the appropriate cloud from the list on the left.

    2. At the top right, click Create folder.

    3. Give your folder a name. The naming requirements are as follows:

      • It must be from 2 to 63 characters long.
      • It may contain lowercase Latin letters, numbers, and hyphens.
      • It must start with a letter and cannot end with a hyphen.
    4. Optionally, specify the description for your folder.

    5. Select Create a default network. This creates a network with a subnet in each availability zone, along with a default security group that allows all network traffic.

    6. Click Create.

  3. Assign the following roles to your Yandex Cloud account:

    • dataproc.editor: To create a cluster.
    • vpc.user: To use the cluster network.
    • iam.serviceAccounts.user: To link a service account to the cluster and create resources under that service account.

    Note

    If you are unable to manage roles, contact your cloud or organization administrator.

  4. Set up a NAT gateway in the subnet that will host the cluster.

  5. If you use security groups, configure them.

  6. You can connect to a Yandex Data Processing cluster from both inside and outside Yandex Cloud:

    • To connect from inside Yandex Cloud, create a Linux VM in the same network as the cluster.

    • To connect from the internet, request public access to subclusters when creating the cluster.

    Note

    The next step assumes that you connect to the cluster from a Linux-based VM.

  7. Connect to the VM over SSH.
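Two of the prerequisites above, the folder naming requirements and the list of roles, are easy to check before you start. The following is a minimal sketch and not part of the service tooling: the function names `is_valid_folder_name` and `missing_roles` are hypothetical, and the set of assigned roles is assumed to be something you have collected yourself.

```python
import re

# Roles listed in this guide as required for the account.
REQUIRED_ROLES = ("dataproc.editor", "vpc.user", "iam.serviceAccounts.user")

# 2 to 63 characters; lowercase Latin letters, digits, and hyphens;
# starts with a letter and does not end with a hyphen.
_FOLDER_NAME = re.compile(r"[a-z][a-z0-9-]{0,61}[a-z0-9]")


def is_valid_folder_name(name: str) -> bool:
    """Return True if name meets the folder naming requirements above."""
    return _FOLDER_NAME.fullmatch(name) is not None


def missing_roles(assigned: set[str]) -> list[str]:
    """Return the required roles that are absent from assigned (hypothetical helper)."""
    return [role for role in REQUIRED_ROLES if role not in assigned]
```

For example, `is_valid_folder_name("data-proc-1")` returns `True`, while names such as `"1folder"` or `"folder-"` are rejected.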

Create a cluster

To create a cluster:

  1. In the management console, open the folder where you want to create the cluster and select Yandex Data Processing.
  2. Click Create cluster.
  3. Set the cluster parameters and click Create cluster. For more information, see Creating clusters.
  4. Wait until the cluster is ready for use: its status will change to Alive. This may take some time.
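If you script cluster creation, the wait in the last step can be expressed as a polling loop. The sketch below is generic: `get_status` is a hypothetical callable that you supply (for example, a wrapper around whatever tool you use to read the cluster status), and the timeout and interval defaults are arbitrary assumptions, not values from the service.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "Alive",
    timeout_s: float = 1800.0,
    poll_interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns target or the timeout expires.

    get_status is a caller-supplied function (hypothetical here); this
    sketch does not call any Yandex Cloud API itself.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"status is still {status!r} after {timeout_s} seconds"
            )
        sleep(poll_interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.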

Connect to the cluster

To connect to a cluster:

  1. If you are using security groups for a cloud network, configure them to enable all relevant traffic between the cluster and the connecting host.

  2. Copy the SSH key that you specified when creating the Yandex Data Processing cluster to the VM.

  3. Connect to the cluster via SSH and make sure that Hadoop commands run. The username depends on the image version:

    • For version 2.0: ubuntu
    • For version 1.4: root

For more information about connecting to a Yandex Data Processing cluster, see Connecting to a cluster.
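The username rule above can be captured in a small helper that assembles the `ssh` invocation. This is a sketch under stated assumptions: the function name `build_ssh_command`, the host name, and the key path are hypothetical placeholders, and only the two image versions named in this guide are handled.

```python
# Usernames per image version, as listed in this guide.
USERNAME_BY_IMAGE_VERSION = {"2.0": "ubuntu", "1.4": "root"}


def build_ssh_command(image_version: str, host: str, key_path: str) -> list[str]:
    """Build an ssh argument list for a cluster host (hypothetical helper).

    host and key_path are placeholders supplied by the caller.
    """
    try:
        user = USERNAME_BY_IMAGE_VERSION[image_version]
    except KeyError:
        raise ValueError(f"no username known for image version {image_version!r}")
    # -i selects the private key file to authenticate with.
    return ["ssh", "-i", key_path, f"{user}@{host}"]
```

The returned list can be passed to `subprocess.run` on the VM, or joined with spaces to run by hand.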

Connect to component interfaces

To connect to the Yandex Data Processing component interfaces using the web interface:

  1. Enable the UI Proxy setting in the cluster.
  2. Get a list of interface URLs.

To connect to the Yandex Data Processing component interfaces via SSH with port forwarding:

  1. Create an intermediate VM with a public IP address in the same network as the cluster and with a security group that allows incoming and outgoing traffic through the component ports.

  2. Connect to the created VM via SSH with port forwarding to the relevant ports of the Yandex Data Processing host. The username depends on the image version:

    • For version 2.0: ubuntu
    • For version 1.4: root

For more information about connecting to component interfaces of a Yandex Data Processing cluster, see Connecting to component interfaces.
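One possible shape for the port-forwarding connection is an OpenSSH command that uses the intermediate VM as a jump host (`-J`) and forwards a local port (`-L`) to the cluster host. The sketch below only assembles that command. It assumes the image-version username applies to the Yandex Data Processing host; the function name `build_tunnel_command` and all host names, users, and ports are hypothetical placeholders.

```python
from typing import Optional

# Usernames per image version, as listed in this guide.
USERNAME_BY_IMAGE_VERSION = {"2.0": "ubuntu", "1.4": "root"}


def build_tunnel_command(
    image_version: str,
    cluster_host: str,
    vm_user: str,
    vm_public_ip: str,
    remote_port: int,
    local_port: Optional[int] = None,
) -> list[str]:
    """Build an ssh argument list that forwards a component port (sketch).

    Assumption: the intermediate VM is used as a jump host and the
    image-version username is for the cluster host.
    """
    if image_version not in USERNAME_BY_IMAGE_VERSION:
        raise ValueError(f"no username known for image version {image_version!r}")
    cluster_user = USERNAME_BY_IMAGE_VERSION[image_version]
    if local_port is None:
        local_port = remote_port
    return [
        "ssh",
        "-J", f"{vm_user}@{vm_public_ip}",                   # jump through the VM
        "-L", f"{local_port}:{cluster_host}:{remote_port}",  # local port forward
        f"{cluster_user}@{cluster_host}",
    ]
```

With the tunnel open, the component interface would be reachable at `localhost:<local_port>` on the machine that ran the command. Check the component's actual port in the service documentation before relying on any specific value.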

What's next

  • Read about service concepts.
  • Learn more about creating clusters and working with jobs.
