© 2025 Direct Cursus Technology L.L.C.
Yandex Data Processing templates

Written by
Yandex Cloud
Updated at October 24, 2025
  • Information about Yandex Data Processing templates as a resource
  • Specifics of a temporary cluster based on a Yandex Data Processing template
    • Configurations of temporary clusters
    • Statuses of temporary Yandex Data Processing clusters

A Yandex Data Processing template is a special resource for rapid deployment of Yandex Data Processing clusters in DataSphere projects. Templates define cluster configuration and can be used by DataSphere to deploy the cluster multiple times.

To work with Yandex Data Processing clusters:

  1. In the project settings, specify these parameters:

    • Default folder for integrating with other Yandex Cloud services. It will house a Yandex Data Processing cluster based on the current cloud quotas. A fee for using the cluster will be debited from your cloud billing account.
    • Service account with the vpc.user role. DataSphere will use this account to work with the Yandex Data Processing cluster network.
    • Subnet for DataSphere to communicate with the Yandex Data Processing cluster. Since the Yandex Data Processing cluster needs to access the internet, make sure to configure a NAT gateway in this subnet. After you specify a subnet, the time for computing resource allocation may increase.
  2. Create a service agent:

    1. To allow a service agent to operate in DataSphere, ask your cloud admin or owner to run the following command in the Yandex Cloud CLI:

      yc iam service-control enable datasphere --cloud-id <cloud_ID>
      

      Where --cloud-id is the ID of the cloud you are going to use in the DataSphere community.

    2. Create a service account with the following roles:

      • dataproc.agent to use Yandex Data Processing clusters.
      • dataproc.admin to create clusters from Yandex Data Processing templates.
      • vpc.user to use the Yandex Data Processing cluster network.
      • iam.serviceAccounts.user to create resources in the folder on behalf of the service account.
    3. Under Spark clusters in the community settings, click Add service account and select the service account you created.
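
The role list in step 2 can be sketched as a dry run that only prints the commands one might use to grant the roles. This is a minimal sketch under assumptions not stated in the article: it assumes the roles are granted on the folder with `yc resource-manager folder add-access-binding`, and `<folder_ID>` and `<service_account_ID>` are placeholders to fill in. The helper name `print_role_bindings` is hypothetical. Check the exact command and flags against the Yandex Cloud CLI reference before running anything it prints.

```shell
# Roles listed in the article for the service agent's service account.
ROLES="dataproc.agent dataproc.admin vpc.user iam.serviceAccounts.user"

# Dry run: print one role-binding command per role instead of executing it.
# $1: folder ID (placeholder), $2: service account ID (placeholder).
# Assumption: roles are granted at the folder level.
print_role_bindings() {
  for role in $ROLES; do
    echo "yc resource-manager folder add-access-binding $1 --role ${role} --subject serviceAccount:$2"
  done
}

print_role_bindings "<folder_ID>" "<service_account_ID>"
```

Printing the commands first makes it easy to review them with your cloud admin before any access bindings change.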

Warning

A persistent Yandex Data Processing cluster must have the livy:livy.spark.deploy-mode : client setting.

Information about Yandex Data Processing templates as a resource

The following information is stored about each template:

  • Resource name.
  • Resource creator.
  • Cluster configuration.
  • Template creation date in UTC format, such as July 18, 2022, 14:23.

You can view all Yandex Data Processing templates created in your project on the Yandex Data Processing resource page. The page also lists all Yandex Data Processing clusters available in the project: both temporary clusters based on Yandex Data Processing templates and connected clusters deployed in Yandex Data Processing. To view detailed information about a template or cluster, click it.

Specifics of a temporary cluster based on a Yandex Data Processing template

To create a cluster from a Yandex Data Processing template, activate the template in your project. When running a project in the IDE, DataSphere creates a temporary cluster in the Yandex Cloud folder and subnet specified in the project settings.

DataSphere tracks the cluster's lifetime and automatically deletes it if no computations have been performed on it within two hours. The cluster will also be deleted if you force stop the computations running in the project.
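
The deletion rule above can be expressed as a small check. This is only an illustration of the rule as stated, not DataSphere's actual implementation. The function name `cluster_would_be_deleted` and its arguments are hypothetical, and treating exactly two hours of idle time as the cutoff is an assumption.

```shell
# Idle limit from the article: two hours, expressed in seconds.
IDLE_LIMIT_SECONDS=$(( 2 * 60 * 60 ))

# Succeeds (exit status 0) when the cluster would be deleted under the stated rule.
# $1: seconds since the last computation on the cluster.
# $2: "yes" if the project's computations were force-stopped, otherwise "no".
# Assumption: reaching the two-hour mark exactly counts as expired.
cluster_would_be_deleted() {
  [ "$2" = "yes" ] || [ "$1" -ge "$IDLE_LIMIT_SECONDS" ]
}

# Example: 90 minutes idle, no forced stop.
if cluster_would_be_deleted 5400 no; then
  echo "deleted"
else
  echo "kept"
fi
```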

Configurations of temporary clusters

Automated Yandex Data Processing clusters are deployed on Yandex Compute Cloud VMs powered by Intel Cascade Lake (standard-v2).

You can calculate the total disk storage capacity (in GB) required for different cluster configurations using this formula:

<number_of_Yandex_Data_Processing_hosts> × 256 + 128

Cluster type  Number of hosts  Disk size    Host parameters
XS            1                384 GB HDD   4 vCPUs, 16 GB RAM
S             4                1152 GB SSD  4 vCPUs, 16 GB RAM
M             8                2176 GB SSD  16 vCPUs, 64 GB RAM
L             16               4224 GB SSD  16 vCPUs, 64 GB RAM
XL            32               8320 GB SSD  16 vCPUs, 64 GB RAM
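
As a quick check, the formula can be evaluated for each host count in the table. The function name `cluster_disk_gb` is hypothetical. The sketch only reproduces the arithmetic and does not query any quota.

```shell
# Total disk size in GB for a cluster with the given number of hosts,
# following the formula: hosts × 256 + 128.
cluster_disk_gb() {
  echo $(( $1 * 256 + 128 ))
}

# Host counts for the XS, S, M, L, and XL cluster types.
for hosts in 1 4 8 16 32; do
  echo "${hosts} hosts: $(cluster_disk_gb "$hosts") GB"
done
```

The results match the Disk size column: 384, 1152, 2176, 4224, and 8320 GB.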

Tip

Before running a project with an activated Yandex Data Processing template, make sure the quotas for creating HDDs or SSDs allow you to create a disk of a sufficient size.

Temporary clusters created from Yandex Data Processing templates are charged additionally, according to the Yandex Data Processing pricing policy.

Statuses of temporary Yandex Data Processing clusters

DataSphere creates a temporary Yandex Data Processing cluster once you open your project in the IDE.

The created cluster appears in the list of available clusters on the Yandex Data Processing resource page. A temporary cluster can have one of the following statuses:

  • STARTING: The cluster is being created.
  • UP: The cluster has been created and is ready to run calculations.
  • DOWN: There have been issues while creating the cluster.
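
If you track these statuses in your own scripts, a lookup along these lines may be handy. It is a sketch only: the function name `describe_cluster_status` is hypothetical, and the article does not describe how to obtain the status programmatically, so the status string is passed in as an argument.

```shell
# Map a temporary cluster status to a short description.
# Returns a non-zero exit status for values the article does not list.
describe_cluster_status() {
  case "$1" in
    STARTING) echo "cluster is being created" ;;
    UP)       echo "cluster is ready to run calculations" ;;
    DOWN)     echo "cluster creation ran into issues" ;;
    *)        echo "unknown status: $1"; return 1 ;;
  esac
}

describe_cluster_status UP
```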

See also

  • How to create, activate, copy, and delete a template
