Working with Yandex Data Processing templates
Yandex Data Processing templates let you preset a cluster's configuration for your project and make it easier to deploy temporary clusters. You can find the list of templates on the project page under Project resources → Yandex Data Processing, on the Shared tab.
To use Yandex Data Processing clusters, set the following project parameters:
- Default folder to enable integration with other Yandex Cloud services. A Yandex Data Processing cluster will be deployed in this folder based on the current cloud quotas. A fee for using the cluster will be debited from your cloud billing account.
- Service account to be used by DataSphere for creating and managing clusters. The service account needs the following roles:
  - dataproc.agent to use Yandex Data Processing clusters.
  - dataproc.admin to create clusters from Yandex Data Processing templates.
  - vpc.user to use the Yandex Data Processing cluster network.
  - iam.serviceAccounts.user to create resources in the folder on behalf of the service account.
- Subnet for DataSphere to communicate with the Yandex Data Processing cluster. Since the Yandex Data Processing cluster needs to access the internet, make sure to configure a NAT gateway in the subnet.
Note
If you specify a subnet in the project settings, allocating computing resources may take longer.
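The role requirements above can be checked before you try to create a cluster. The following is a minimal sketch, not an official tool: it assumes you have already copied the service account's assigned roles into a Python list by hand, and the helper name `missing_roles` is hypothetical. It makes no calls to Yandex Cloud.

```python
# Roles this page lists as required for the DataSphere service account.
REQUIRED_ROLES = {
    "dataproc.agent",
    "dataproc.admin",
    "vpc.user",
    "iam.serviceAccounts.user",
}


def missing_roles(assigned_roles):
    """Return the required roles that are absent from assigned_roles, sorted."""
    return sorted(REQUIRED_ROLES - set(assigned_roles))


if __name__ == "__main__":
    # Hypothetical input: roles copied manually from the folder's access settings.
    assigned = ["dataproc.agent", "vpc.user"]
    print(missing_roles(assigned))
```

An empty result means every role named on this page is present in the list you supplied.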
Warning
The persistent Yandex Data Processing cluster must have the livy:livy.spark.deploy-mode : client setting.
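As an illustration of the requirement above, the sketch below checks a cluster's properties for the Livy deploy mode. It assumes the properties are available to you as a plain Python dictionary of key-value strings. That representation and the helper name `has_client_deploy_mode` are assumptions made for this example, not part of any Yandex Cloud API.

```python
# Property key and value taken from the warning above.
LIVY_DEPLOY_MODE_KEY = "livy:livy.spark.deploy-mode"
REQUIRED_DEPLOY_MODE = "client"


def has_client_deploy_mode(properties):
    """Return True if the properties dict sets the Livy deploy mode to client."""
    return properties.get(LIVY_DEPLOY_MODE_KEY) == REQUIRED_DEPLOY_MODE


if __name__ == "__main__":
    # Hypothetical properties of a persistent cluster, entered by hand.
    cluster_properties = {"livy:livy.spark.deploy-mode": "client"}
    print(has_client_deploy_mode(cluster_properties))
```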
Creating a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Click Create template.
- In the Template name field, enter a name for the template. The naming requirements are as follows:
  - The name must be from 3 to 63 characters long.
  - It may contain lowercase Latin letters, numbers, and hyphens.
  - The first character must be a letter and the last character cannot be a hyphen.
- Click Create. This will display the created template's info page.
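The naming requirements from the steps above can be expressed as a single regular expression. This is a minimal sketch for checking a name locally before you type it into the Template name field. The function name `is_valid_template_name` is hypothetical, and the pattern is derived only from the three rules stated on this page.

```python
import re

# One leading lowercase letter, 1 to 61 middle characters (lowercase letters,
# digits, hyphens), and a final character that is not a hyphen: 3 to 63 in total.
TEMPLATE_NAME_RE = re.compile(r"[a-z][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_template_name(name):
    """Return True if name satisfies the naming rules listed on this page."""
    return TEMPLATE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate names.
    for candidate in ["spark-small", "ab", "1cluster", "cluster-"]:
        print(candidate, is_valid_template_name(candidate))
```

Using `fullmatch` rather than `match` ensures the whole string is checked, so a valid prefix followed by a disallowed character is rejected.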
Activating a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Click the options icon next to the appropriate template and select Activate.
A cluster based on the activated Yandex Data Processing template is created when you run your project in the IDE.
Sharing a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Select the appropriate template from the list.
- Go to the Access tab.
- Enable the visibility option next to the name of the community where you want to share the template.
To make a template available for use in another project, the project administrator should add it to the Shared tab.
Editing a template
You can only change the name of an existing template. To change the configuration, recreate the template.
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- Select the relevant template in the list, click the options icon, and select Edit.
- Edit the name and click Save.
Deleting a Yandex Data Processing template
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Yandex Data Processing.
- In the list, select the template to delete.
- Click the options icon and select Delete.
- Click Confirm.
Warning
The actual deletion of resources can take up to 72 hours.