Setting up a project to work with a cloud in Yandex Cloud

Written by Yandex Cloud
Updated on October 24, 2025

Yandex DataSphere provides everything you need for data analysis and ML model training. To use the full range of Yandex Cloud features, however, you need to set up your DataSphere project to work with a cloud and enable integration with other platform services.

This guide describes how to set up a workspace in DataSphere so that you can use Yandex Cloud services effectively.

  1. Create a project.
  2. Create a cloud and a folder.
  3. Configure your network.
  4. Create a service account.
  5. Service integration examples.

For detailed information on how to create and set up resources, see the Step-by-step guides section in the documentation for the respective services.

Getting started

Before getting started, register in Yandex Cloud, set up a community, and link your billing account to it.

  1. On the DataSphere home page, click Try for free and select an account to log in with: Yandex ID or your work account through an identity federation (SSO).
  2. Select the Yandex Identity Hub organization you are going to use in Yandex Cloud.
  3. Create a community.
  4. Link your billing account to the DataSphere community you are going to work in. Make sure the billing account's status is ACTIVE or TRIAL_ACTIVE. If you do not have a billing account yet, create one in the DataSphere interface.

Create a project

DataSphere communities group users into teams, letting them share resources and manage budgets. A project within a community is a user's individual workspace that runs on Yandex Cloud VMs. Depending on the operation mode, a project may include one or more VMs, with each VM assigned to a separate notebook within the project.

Note

DataSphere is not designed for pair programming. In Dedicated mode, multiple users can collaborate within a single project if each user is working in a separate notebook.

Create a DataSphere project as described in this guide.

Next, you can specify parameters for integration with other Yandex Cloud services on the project edit page.

Create a cloud and a folder

Most Yandex Cloud services run inside cloud folders. To access cloud resources, use the Yandex Cloud management console (Yandex Cloud Console).

Log in to the management console and create your first cloud and folder to host services you want to use from DataSphere.

Learn more about user interaction with resources in Yandex Cloud.

Tip

You can use multiple folders to set up granular access and distinguish between runtime environments and tasks.
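
As a minimal sketch of the tip above, the Yandex Cloud CLI can create one folder per environment. The folder names here are hypothetical, and the yc_cmd helper is an illustrative wrapper, not part of the CLI. It prints each command by default, so nothing is created until you set DRY_RUN=0.

```shell
# Print yc commands by default; set DRY_RUN=0 to execute them.
yc_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "yc $*"
  else
    yc "$@"
  fi
}

# Create one folder per name passed in (names are hypothetical).
create_folders() {
  for name in "$@"; do
    yc_cmd resource-manager folder create --name "$name"
  done
}

create_folders datasphere-dev datasphere-prod
```

Folders are created in the default cloud of the active CLI profile.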

Configure your network

To enable Yandex Cloud service resources to exchange information, create a cloud network and subnet. By default, a network is isolated within Yandex Cloud and has no access to the internet. To enable your cloud resources to access the internet without using public IP addresses, create and set up a NAT gateway.

Note

By default, DataSphere projects use a service subnet with access to the internet. If, in the project settings, you specify your own subnet that has no NAT gateway configured, you will not be able to update installed packages or perform other network operations.
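
The network setup described above can be sketched with the Yandex Cloud CLI roughly as follows. The resource names, availability zone, and CIDR range are hypothetical placeholders, and yc_cmd is an illustrative wrapper that prints commands unless DRY_RUN=0 is set. The gateway ID has to be copied from the output of the gateway creation step.

```shell
# Print yc commands by default; set DRY_RUN=0 to execute them.
yc_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "yc $*"
  else
    yc "$@"
  fi
}

# Hypothetical resource names.
NETWORK=ds-network
SUBNET=ds-subnet
GATEWAY=ds-nat
ROUTE_TABLE=ds-routes

# Create the network, a subnet in it, and a NAT gateway.
create_network() {
  yc_cmd vpc network create --name "$NETWORK"
  yc_cmd vpc subnet create --name "$SUBNET" --network-name "$NETWORK" \
    --zone ru-central1-a --range 10.1.0.0/24
  yc_cmd vpc gateway create --name "$GATEWAY"
}

# Route all outbound traffic from the subnet through the NAT gateway.
# $1 is the gateway ID reported by `yc vpc gateway create`.
route_via_nat() {
  yc_cmd vpc route-table create --name "$ROUTE_TABLE" --network-name "$NETWORK" \
    --route "destination=0.0.0.0/0,gateway-id=$1"
  yc_cmd vpc subnet update "$SUBNET" --route-table-name "$ROUTE_TABLE"
}

create_network
route_via_nat "<gateway_ID>"
```

Once the route table is attached, resources in the subnet can reach the internet without public IP addresses.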

Create a service account

Yandex Cloud has a special type of account to automate operations: a service account. Via a service account, software can manage service resources. A service account can perform operations on resources only if it has appropriate roles. Learn more about the current service roles in the Access management section of the documentation.

In DataSphere, you can enable a service account to perform operations in one of two ways:

  1. If a service account needs to perform operations on resources of other services on behalf of DataSphere, add it to project settings.
  2. If a service account needs to perform operations on a project or community in DataSphere (run cells, create resources, etc.), add it to the list of project members or community members with the respective role.
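
Before a service account can be added to the project settings, it has to exist and hold the right roles. A minimal sketch with the Yandex Cloud CLI, assuming a hypothetical account name and using vpc.user only as an example role, might look like this. The yc_cmd wrapper is illustrative and prints commands unless DRY_RUN=0 is set.

```shell
# Print yc commands by default; set DRY_RUN=0 to execute them.
yc_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "yc $*"
  else
    yc "$@"
  fi
}

# Grant a role on a folder to a service account.
# $1: folder name or ID, $2: service account ID, $3: role
grant_role() {
  yc_cmd resource-manager folder add-access-binding "$1" \
    --role "$3" --subject "serviceAccount:$2"
}

# The account name is hypothetical; the ID comes from the create output.
yc_cmd iam service-account create --name datasphere-sa
grant_role "<folder_ID>" "<service_account_ID>" vpc.user
```

Adding the account to the project settings or to the list of project or community members is then done in the DataSphere interface.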

Service integration examples

Check our examples of setting up a project for a variety of tasks in DataSphere and setting up integration with Yandex Cloud services.

Computing on Apache Spark™ clusters

DataSphere allows you to run computations on Apache Spark™ clusters created in Yandex Data Processing.

To work with Yandex Data Processing clusters:

  1. In the project settings, specify these parameters:

    • Default folder for integrating with other Yandex Cloud services. It will house a Yandex Data Processing cluster based on the current cloud quotas. A fee for using the cluster will be debited from your cloud billing account.
    • Service account with the vpc.user role. DataSphere will use this account to work with the Yandex Data Processing cluster network.
    • Subnet for DataSphere to communicate with the Yandex Data Processing cluster. Since the Yandex Data Processing cluster needs to access the internet, make sure to configure a NAT gateway in this subnet. After you specify a subnet, the time for computing resource allocation may increase.
  2. Create a service agent:

    1. To allow a service agent to operate in DataSphere, ask your cloud admin or owner to run the following command in the Yandex Cloud CLI:

      yc iam service-control enable datasphere --cloud-id <cloud_ID>
      

      Where --cloud-id is the ID of the cloud you are going to use in the DataSphere community.

    2. Create a service account with the following roles:

      • dataproc.agent to use Yandex Data Processing clusters.
      • dataproc.admin to create clusters from Yandex Data Processing templates.
      • vpc.user to use the Yandex Data Processing cluster network.
      • iam.serviceAccounts.user to create resources in the folder on behalf of the service account.
    3. Under Spark clusters in the community settings, click Add service account and select the service account you created.
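
The four roles listed in step 2 can be granted in a loop. This is a sketch that assumes the roles are assigned at the folder level and that the service account already exists. The placeholders and the yc_cmd wrapper are illustrative, and commands are only printed unless DRY_RUN=0 is set.

```shell
# Print yc commands by default; set DRY_RUN=0 to execute them.
yc_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "yc $*"
  else
    yc "$@"
  fi
}

# Roles listed above for working with Yandex Data Processing clusters.
SPARK_ROLES="dataproc.agent dataproc.admin vpc.user iam.serviceAccounts.user"

# $1: folder name or ID, $2: service account ID
grant_spark_roles() {
  for role in $SPARK_ROLES; do
    yc_cmd resource-manager folder add-access-binding "$1" \
      --role "$role" --subject "serviceAccount:$2"
  done
}

grant_spark_roles "<folder_ID>" "<service_account_ID>"
```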

Warning

The persistent Yandex Data Processing cluster must have the livy:livy.spark.deploy-mode property set to client.

Learn more about working with Yandex Data Processing clusters in DataSphere:

  • Ways to use Apache Spark™ clusters in DataSphere.
  • Integration with Yandex Data Processing.

Deploying a pretrained model as a service

If you want to deploy a model as a separate service in DataSphere, use nodes based on a Docker image. In the project settings, specify the following parameters:

  • Default folder to store node logs.
  • Service account with the following roles:
    • container-registry.images.puller to allow DataSphere to pull your Docker image to create a node.
    • vpc.user to use the DataSphere network.
    • (Optional) datasphere.user to send requests to the node.
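
A sketch of granting these roles with the Yandex Cloud CLI is shown below. Assigning them at the folder level is an assumption, as are the placeholders and the yc_cmd wrapper, which prints commands unless DRY_RUN=0 is set. The optional datasphere.user role is added only when the third argument is yes.

```shell
# Print yc commands by default; set DRY_RUN=0 to execute them.
yc_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "yc $*"
  else
    yc "$@"
  fi
}

# $1: folder name or ID, $2: service account ID,
# $3: "yes" to also grant the optional datasphere.user role
grant_node_roles() {
  roles="container-registry.images.puller vpc.user"
  if [ "${3:-no}" = "yes" ]; then
    roles="$roles datasphere.user"
  fi
  for role in $roles; do
    yc_cmd resource-manager folder add-access-binding "$1" \
      --role "$role" --subject "serviceAccount:$2"
  done
}

grant_node_roles "<folder_ID>" "<service_account_ID>" yes
```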

Learn more about deploying services in DataSphere:

  • DataSphere Inference.
  • Deploying a service based on a Docker image.

© 2025 Direct Cursus Technology L.L.C.