Monitoring the state of Spark applications
To assess the performance of Spark applications in a Yandex Managed Service for Apache Spark™ cluster, you can check the following:
- Application details
- Detailed information about the stages
- Resources allocated to the application
- Persisted RDDs
- List of SQL queries and their plans
Checking application details
1. Open the folder dashboard.
2. Go to Managed Service for Apache Spark™.
3. Click the cluster name.
4. Under Advanced settings, select Spark History Server and click the link.

   This will open the list of completed applications. To switch to the list of running applications, click Show incomplete applications at the bottom of the table.

5. Find the application in question and click the link in the App ID column.
This will open the Jobs tab in the Spark History Server window with detailed information about the selected application:
- Event Timeline: Job run history as a diagram. It displays markers for executor allocation and release and job statuses.
- Active Jobs: List of running or pending jobs.
- Completed Jobs: List of completed jobs.
For each job, the table specifies:
- Submitted time
- Duration
- Stages: Succeeded/Total
- Tasks (for all stages): Succeeded/Total
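The same application details are also available programmatically through the Spark History Server REST API (`/api/v1/applications`). A minimal sketch of summarizing that endpoint's response, run here against a hardcoded sample payload shaped like the API's output (the application ID, name, and duration below are made up for illustration):

```python
import json

# Sample payload shaped like GET <history-server>/api/v1/applications;
# the values are invented for illustration only.
sample = json.dumps([
    {
        "id": "app-20240101120000-0001",
        "name": "etl-job",
        "attempts": [{"completed": True, "duration": 125000}],
    }
])

# Summarize each application attempt: status plus run time in seconds.
apps = json.loads(sample)
summary = []
for app in apps:
    for attempt in app["attempts"]:
        status = "completed" if attempt["completed"] else "running"
        summary.append(f"{app['id']}: {status}, {attempt['duration'] // 1000} s")

for line in summary:
    print(line)
```

In a real script, you would fetch the JSON from the History Server link shown under Advanced settings instead of using a sample payload.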
Checking stage details
1. Open the folder dashboard.
2. Go to Managed Service for Apache Spark™.
3. Click the cluster name.
4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.
5. In the top menu, navigate to Stages.
There you will find the Completed Stages table listing all the job stages. For each stage, it provides the following details:
- Submitted time
- Duration
- Tasks: Succeeded/Total
- Shuffle Read/Write size
- Input/Output data size
To get detailed information about a stage, click the text in the Description column. The Details for Stage page displays the following:
- DAG Visualization: Execution graph visualization.
- Event Timeline: Stage run history as a diagram with various indicators.
- Summary Metrics: Metrics aggregated across the stage's completed tasks:
  - Duration: Task run duration.
  - GC Time: Garbage collection time.
  - Input Size/Records: Input data size and number of records.
- Aggregated Metrics by Executor: Metrics by executor.
- Tasks: Table with operation details.
Checking resources allocated to the application
1. Open the folder dashboard.
2. Go to Managed Service for Apache Spark™.
3. Click the cluster name.
4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.
5. In the top menu, navigate to Executors.
The UI displays two tables:
- Summary: High-level information, such as the number and status of executors and resources in use.
- Executors: Information about each executor.
The tables specify the following:
- Amount of resources available per executor.
- Number of running and completed tasks.
- Task duration (Task Time), including the time spent on garbage collection (GC Time).
Tip
If garbage collection takes too long:
- Make sure you have allocated enough memory to the executor.
- Configure the garbage collector manually. To learn how to do this, see the Apache Spark documentation.
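One way to apply both suggestions is at submit time. A sketch of a `spark-submit` invocation (the memory size, GC flags, and application file name are placeholders to adjust for your workload; G1GC is just one possible collector):

```shell
# Placeholder values: tune the memory size and GC flags for your workload.
spark-submit \
  --conf spark.executor.memory=4g \
  --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:MaxGCPauseMillis=200" \
  my_app.py
```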
Checking persisted RDDs
1. Open the folder dashboard.
2. Go to Managed Service for Apache Spark™.
3. Click the cluster name.
4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.
5. In the top menu, navigate to Storage.
The UI displays the list of persisted resilient distributed datasets (RDDs). For each RDD, it shows memory consumption, disk usage, and caching progress. To view details, click the RDD name.
Checking the list of SQL queries and their plans
1. Open the folder dashboard.
2. Go to Managed Service for Apache Spark™.
3. Click the cluster name.
4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.
5. In the top menu, navigate to SQL/DataFrame.
The table lists executed SQL queries, including their start time and duration.
To see the query plan, click the query text in the Description column. The query plan is displayed as a flowchart. To view it as text, click Details at the bottom of the figure.
The query plan contains stats for each operator along with the number of completed tasks and their duration. If the query is still running, the current stats will be shown.