Working with Yandex Data Processing templates
Yandex Data Processing templates let you preset a cluster's configuration for your project and make it easier to deploy temporary clusters. You can find a list of templates on the project page under Project resources → Yandex Data Processing, on the Shared tab.
To use Yandex Data Processing clusters, specify the following parameters in your project settings:
- Default folder for integrating with other Yandex Cloud services. A Yandex Data Processing cluster will be deployed in this folder based on the current cloud quotas. A fee for using the cluster will be debited from your cloud billing account.
- Service account DataSphere will use for creating and managing clusters. The service account needs the following roles:
  - dataproc.agent to use Yandex Data Processing clusters.
  - dataproc.admin to create clusters from Yandex Data Processing templates.
  - vpc.user to use the Yandex Data Processing cluster network.
  - iam.serviceAccounts.user to create resources in the folder on behalf of the service account.
- Subnet for DataSphere to communicate with the Yandex Data Processing cluster. Since the Yandex Data Processing cluster needs to access the internet, make sure to configure a NAT gateway in the subnet.
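The role requirements above can be checked before you save the project settings. The following is a minimal sketch, not an official tool: the missing_roles helper and the sample input are hypothetical, and the list of assigned roles is assumed to be copied by hand from the service account's page.

```python
# Roles the DataSphere service account needs, as listed above.
REQUIRED_ROLES = {
    "dataproc.agent",
    "dataproc.admin",
    "vpc.user",
    "iam.serviceAccounts.user",
}


def missing_roles(assigned_roles):
    """Return the required roles that are absent from assigned_roles, sorted."""
    return sorted(REQUIRED_ROLES - set(assigned_roles))


if __name__ == "__main__":
    # Hypothetical input: roles currently assigned to the service account.
    assigned = ["dataproc.agent", "vpc.user"]
    for role in missing_roles(assigned):
        print(f"Missing role: {role}")
```

An empty result means the service account has every role from the list.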
Note
If you specify a subnet in the project settings, allocating computing resources may take longer.
Warning
A persistent Yandex Data Processing cluster must have the livy:livy.spark.deploy-mode : client setting.
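As an illustration of the requirement above, the check below looks up the property in a plain dictionary. This is only a sketch: the properties dictionary is a hypothetical stand-in for the cluster properties, which you would copy from the cluster's settings yourself, and livy_deploy_mode_ok is not part of any Yandex Cloud SDK.

```python
# Property key and value taken from the warning above.
LIVY_KEY = "livy:livy.spark.deploy-mode"
REQUIRED_VALUE = "client"


def livy_deploy_mode_ok(properties):
    """Return True if the Livy deploy mode property is set to 'client'."""
    return properties.get(LIVY_KEY) == REQUIRED_VALUE


if __name__ == "__main__":
    # Hypothetical cluster properties entered by hand.
    properties = {"livy:livy.spark.deploy-mode": "client"}
    print(livy_deploy_mode_ok(properties))
```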
Creating a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Click Create template.
- In the Template name field, enter a name for the template. The naming requirements are as follows:
  - The name must be from 2 to 63 characters long.
  - It may contain lowercase Latin letters, numbers, and hyphens.
  - The first character must be a letter and the last character cannot be a hyphen.
- Click Create. You will see a page with detailed info on the template you created.
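The naming requirements from the Template name step can be expressed as a single regular expression. The sketch below is one way to pre-validate a name before entering it; the is_valid_template_name function is hypothetical, and the pattern is derived only from the three rules listed above, so the service itself may apply additional checks.

```python
import re

# First character: a lowercase letter.
# Middle (0 to 61 characters): lowercase letters, digits, or hyphens.
# Last character: a lowercase letter or digit (not a hyphen).
# Total length is therefore 2 to 63 characters.
TEMPLATE_NAME_RE = re.compile(r"[a-z][a-z0-9-]{0,61}[a-z0-9]")


def is_valid_template_name(name):
    """Return True if name satisfies the naming rules listed above."""
    return TEMPLATE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["spark-small", "a", "1st-template", "draft-", "Spark"]:
        print(candidate, is_valid_template_name(candidate))
```

Using fullmatch rather than search ensures the whole string is checked, so a valid fragment inside an invalid name does not pass.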
Activating a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Click the options icon next to the template you need and select Activate.
The system will create a cluster based on the activated Yandex Data Processing template when you run your project in the IDE.
Sharing a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Select the template from the list.
- Go to the Access tab.
- Enable the visibility option next to the name of the community you want to share the template in.
To make a template available for use in a different project, the project admin needs to add it on the Shared tab.
Editing a template
You can only change the name of an existing template. To update the configuration, recreate the template.
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Select the template from the list, click the options icon, and select Edit.
- Edit the name and click Save.
Deleting a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- In the list, select the template you want to delete.
- Click the options icon and select Delete.
- Click Confirm.
Warning
The actual deletion of resources can take up to 72 hours.