Apache Spark™ cluster maintenance
You can manage Apache Spark™ cluster maintenance, including the following:
- Getting a list of maintenance jobs
- Getting cluster maintenance logs
- Postponing scheduled maintenance
- Starting scheduled maintenance immediately
- Configuring a maintenance window
Getting a list of maintenance jobs
- Open the folder dashboard.
- Go to Managed Service for Apache Spark™.
- Click the name of your cluster and select the Technical maintenance tab.
To view maintenance jobs with a specific status, click Status above the maintenance list and select the status you want from the drop-down menu. To find a specific maintenance job, enter its ID or task name in the field above the list of maintenance sessions.
Getting cluster maintenance logs
- Open the folder dashboard.
- Go to Managed Service for Apache Spark™.
- Click the name of your cluster and select the Technical maintenance tab.
- Click the ID of the maintenance job you need.
- Click Task logs.
Postponing scheduled maintenance
Maintenance with the Planned status is scheduled for the date and time specified in the Start date column. You can reschedule such maintenance to a new date and time if needed.
To reschedule maintenance for a new date and time:
- Open the folder dashboard.
- Go to Managed Service for Apache Spark™.
- Click the name of your cluster and select the Technical maintenance tab.
- Click the icon next to the maintenance job with the Planned status.
- In the drop-down menu, select Postpone.
- In the window that opens:
  - To postpone the maintenance until the next available window, click Next window and then Reschedule.
  - To reschedule the maintenance to a specific date and time (UTC), click Choose date (UTC), select the new date and time, and click Reschedule.
Starting scheduled maintenance immediately
If needed, you can perform maintenance with the Planned status immediately, without waiting for the time specified in the Start date column.
To run a scheduled cluster maintenance job immediately:
- Open the folder dashboard.
- Go to Managed Service for Apache Spark™.
- Click the name of your cluster and select the Technical maintenance tab.
- Click the icon next to the maintenance job.
- In the drop-down menu, select Carry out now.
Configuring a maintenance window
By default, maintenance can be scheduled for any time. You can choose a specific day of the week and hour to schedule maintenance. For example, you can choose the time when the cluster is least busy.
Warning
A scheduled maintenance job will be canceled automatically if it falls outside the specified interval.
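To make the weekly-window constraint concrete, here is a small sketch of the check the warning describes. The mapping of hour N to the interval from N-1:00 to N:00 UTC is an assumption for illustration, not something the service documents:

```python
from datetime import datetime, timezone

DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]

def in_maintenance_window(start, day, hour):
    """Check whether a UTC start time falls inside a weekly one-hour window.

    Assumes hour N (1-24) covers [N-1:00, N:00) UTC on the given day --
    an illustrative convention, not confirmed by the service.
    """
    if day not in DAYS or not 1 <= hour <= 24:
        raise ValueError("invalid maintenance window")
    return DAYS[start.weekday()] == day and start.hour == hour - 1

# 2024-01-01 is a Monday; 02:30 UTC falls inside a MON/hour=3 window
ts = datetime(2024, 1, 1, 2, 30, tzinfo=timezone.utc)
print(in_maintenance_window(ts, "MON", 3))  # True
```

A job scheduled outside the window by this check is the kind of job the warning says gets canceled automatically.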
- Open the folder dashboard.
- Go to Managed Service for Apache Spark™.
- Click the name of your cluster and select the Technical maintenance tab.
- Click Configure the maintenance window.
- In the window that opens:
- To allow maintenance at any time, select arbitrary, which is also the default option.
- To allow weekly maintenance at a specific time, select by schedule and specify the weekday and hour in UTC.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.
- See the description of the CLI command for updating the maintenance window:

  ```bash
  yc managed-spark cluster update --help
  ```

- Run this command:

  ```bash
  yc managed-spark cluster update <cluster_name_or_ID> \
    --maintenance-window type=<maintenance_type>,`
                         `day=<day_of_week>,`
                         `hour=<hour>
  ```

  Where type is the maintenance type:
  - anytime: At any time (default).
  - weekly: On a schedule. For this value, also specify the following:
    - day: Day of week: MON, TUE, WED, THU, FRI, SAT, or SUN.
    - hour: Hour of day (UTC), from 1 to 24.
You can get the cluster name or ID with the list of clusters in the folder.
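The --maintenance-window value is a comma-separated key=value string. If you generate it from a script, a small validating helper (illustrative only, not part of the yc CLI) keeps malformed values out of the call:

```python
DAYS = {"MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"}

def maintenance_window_arg(mw_type, day=None, hour=None):
    """Assemble the value passed to `--maintenance-window` (illustrative helper)."""
    if mw_type == "anytime":
        return "type=anytime"
    if mw_type == "weekly":
        if day not in DAYS:
            raise ValueError("day must be one of MON, TUE, WED, THU, FRI, SAT, SUN")
        if hour is None or not 1 <= hour <= 24:
            raise ValueError("hour must be from 1 to 24")
        return "type=weekly,day={},hour={}".format(day, hour)
    raise ValueError("type must be 'anytime' or 'weekly'")

print(maintenance_window_arg("weekly", "MON", 3))  # type=weekly,day=MON,hour=3
```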
- Open the current Terraform configuration file describing your infrastructure.

  To learn how to create this file, see Creating a cluster.

  For a complete list of configurable Apache Spark™ cluster fields, see this Terraform provider guide.

- To set up the maintenance window that will also apply to stopped clusters, add the maintenance_window section to the cluster description:

  ```hcl
  resource "yandex_spark_cluster" "<cluster_name>" {
    ...
    maintenance_window {
      type = <maintenance_type>
      day  = <day_of_week>
      hour = <hour>
    }
    ...
  }
  ```

  Where:
  - type: Maintenance type. The possible values include:
    - ANYTIME: At any time.
    - WEEKLY: On a schedule.
  - day: Day of week for the WEEKLY type: MON, TUE, WED, THU, FRI, SAT, or SUN.
  - hour: UTC hour for the WEEKLY type, from 1 to 24.
- Make sure the settings are correct.

  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.

  - Run this command:

    ```bash
    terraform validate
    ```

    Terraform will show any errors found in your configuration files.

- Confirm the resource changes.

  - Run this command to view the planned changes:

    ```bash
    terraform plan
    ```

    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.

  - If everything looks correct, apply the changes:

    - Run this command:

      ```bash
      terraform apply
      ```

    - Confirm updating the resources.

    - Wait for the operation to complete.
- Get an IAM token for API authentication and place it in an environment variable:

  ```bash
  export IAM_TOKEN="<IAM_token>"
  ```
- Clone the cloudapi repository:

  ```bash
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  ```

  Below, we assume that the repository contents reside in the ~/cloudapi/ directory.
- Call the ClusterService.Update method, e.g., via the following gRPCurl request:

  Warning

  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the update_mask parameter as an array of paths[] strings.

  Format for listing settings:

  ```json
  "update_mask": {
      "paths": [
          "<setting_1>",
          "<setting_2>",
          ...
          "<setting_N>"
      ]
  }
  ```

  ```bash
  grpcurl \
      -format json \
      -import-path ~/cloudapi/ \
      -import-path ~/cloudapi/third_party/googleapis/ \
      -proto ~/cloudapi/yandex/cloud/spark/v1/cluster_service.proto \
      -rpc-header "Authorization: Bearer $IAM_TOKEN" \
      -d '{
            "cluster_id": "<cluster_ID>",
            "update_mask": {
              "paths": ["maintenance_window"]
            },
            "maintenance_window": {
              "weekly_maintenance_window": {
                "day": "<day_of_week>",
                "hour": "<hour>"
              }
            }
          }' \
      spark.api.cloud.yandex.net:443 \
      yandex.cloud.spark.v1.ClusterService.Update
  ```

  Where:

  - update_mask: List of settings you want to update as an array of strings (paths[]). Here, we provide only one setting.

  - maintenance_window: Maintenance window settings, including for stopped clusters. In maintenance_window, provide one of the following values:

    - anytime: Any time.

    - weekly_maintenance_window: Once a week on the specified day and time:
      - day: Day of week in DDD format: MON, TUE, WED, THU, FRI, SAT, or SUN.
      - hour: Hour of day (UTC) in HH format, from 1 to 24.

  You can get the cluster ID with the list of clusters in the folder.
- Check the server response to make sure your request was successful.
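If you script this call, the JSON body passed to grpcurl with -d can be built and checked before the request is sent. This sketch mirrors the request above; the hour is serialized as a string because the sample quotes it, and the placeholder values are illustrative:

```python
import json

def update_request(cluster_id, day, hour):
    """Build the ClusterService.Update body; the explicit update_mask limits
    the change to maintenance_window only (see the warning above)."""
    return {
        "cluster_id": cluster_id,
        "update_mask": {"paths": ["maintenance_window"]},
        "maintenance_window": {
            "weekly_maintenance_window": {"day": day, "hour": str(hour)}
        },
    }

# Placeholder cluster ID, as in the request above
body = update_request("<cluster_ID>", "MON", 3)
print(json.dumps(body, indent=2))
```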