© 2026 Direct Cursus Technology L.L.C.
Yandex Managed Service for Apache Spark™


Monitoring the state of Spark applications

Written by
Yandex Cloud
Updated at March 19, 2026
  • Checking application details
  • Checking stage details
  • Checking resources allocated to the application
  • Checking persisted RDDs
  • Checking the list of SQL queries and their plans

To assess the performance of Spark applications in a Yandex Managed Service for Apache Spark™ cluster, you can check the following:

  • Application details
  • Detailed information about the stages
  • Resources allocated to the application
  • Persisted RDDs
  • List of SQL queries and their plans

Checking application details

  1. Open the folder dashboard.

  2. Go to Managed Service for Apache Spark™.

  3. Click the cluster name.

  4. Under Advanced settings, select Spark History Server and click the link.

    This will open the list of completed applications. To switch to the list of running applications, click Show incomplete applications at the bottom of the table.

  5. Find the application in question and click the link in the App ID column.

    This will open the Jobs tab in the Spark History Server window with detailed information about the selected application:

    • Event Timeline: Job run history as a diagram. It shows markers for executor allocation and release, as well as job statuses.
    • Active Jobs: List of running or pending jobs.
    • Completed Jobs: List of completed jobs.

    For each job, the table specifies:

    • Submitted time
    • Duration
    • Stages: Succeeded/Total
    • Tasks (for all stages): Succeeded/Total
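
    Besides the web UI, the Spark History Server also exposes this information through its REST monitoring API under /api/v1. The sketch below filters the application list for still-running applications; the endpoint path and field names follow the Apache Spark monitoring API, but the sample payload values are invented for illustration.

    ```python
    import json

    # Sample payload in the shape returned by
    # GET <history-server>/api/v1/applications.
    # Field names follow the Apache Spark monitoring API; values are invented.
    APPLICATIONS = json.loads("""
    [
      {"id": "app-20260319120000-0001", "name": "daily-etl",
       "attempts": [{"completed": true,  "duration": 754000}]},
      {"id": "app-20260319130000-0002", "name": "ad-hoc-report",
       "attempts": [{"completed": false, "duration": 120000}]}
    ]
    """)

    def incomplete_apps(apps):
        """IDs of applications whose most recent attempt is still running."""
        return [app["id"] for app in apps
                if not app["attempts"][-1]["completed"]]

    print(incomplete_apps(APPLICATIONS))
    ```

    This mirrors the Show incomplete applications toggle in the UI: an application is "incomplete" while its latest attempt has not finished.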

Checking stage details

  1. Open the folder dashboard.

  2. Go to Managed Service for Apache Spark™.

  3. Click the cluster name.

  4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.

  5. In the top menu, navigate to Stages.

    There you will find the Completed Stages table listing all the job stages. For each stage, it provides the following details:

    • Submitted time
    • Duration
    • Tasks: Succeeded/Total
    • Shuffle Read/Write size
    • Input/Output data size

    To get detailed information about a stage, click the text in the Description column. The Details for Stage page displays the following:

    • DAG Visualization: Execution graph visualization.
    • Event Timeline: Stage run history as a diagram with various indicators.
    • Summary operation metrics:
      • Duration: Run duration.
      • GC Time: Garbage collection time.
      • Input Size/Records: Size of the input data and number of records read.
    • Aggregated Metrics by Executor: Metrics by executor.
    • Tasks: Table with per-task details.
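
    The same stage metrics are available from the REST endpoint for stages, which is handy for finding the slowest stages of a long job without clicking through the UI. A minimal sketch; the field names follow the Apache Spark monitoring API, and the sample values are invented.

    ```python
    # Sample records in the shape returned by
    # GET <history-server>/api/v1/applications/<app-id>/stages.
    # Field names follow the Apache Spark monitoring API; values are invented.
    STAGES = [
        {"stageId": 0, "name": "csv read", "executorRunTime": 12000,
         "shuffleWriteBytes": 0},
        {"stageId": 1, "name": "groupBy",  "executorRunTime": 95000,
         "shuffleWriteBytes": 48_000_000},
        {"stageId": 2, "name": "save",     "executorRunTime": 30000,
         "shuffleWriteBytes": 0},
    ]

    def slowest_stages(stages, top=1):
        """Stage IDs sorted by total executor run time, slowest first."""
        ranked = sorted(stages, key=lambda s: s["executorRunTime"],
                        reverse=True)
        return [s["stageId"] for s in ranked[:top]]

    print(slowest_stages(STAGES, top=2))
    ```

    Stages with both a long run time and a large Shuffle Write volume are usually the first candidates for repartitioning or join tuning.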

Checking resources allocated to the application

  1. Open the folder dashboard.

  2. Go to Managed Service for Apache Spark™.

  3. Click the cluster name.

  4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.

  5. In the top menu, navigate to Executors.

    The UI displays two tables:

    • Summary: High-level information, such as the number and status of executors and resources in use.
    • Executors: Information about each executor.

    The tables specify the following:

    • Amount of resources available per executor.
    • Number of running and completed tasks.
    • Task duration (Task Time), including the time spent on garbage collection (GC Time).

    Tip

    If garbage collection takes too long:

    • Make sure you have enough memory allocated to the executor.
    • Configure the garbage collector manually. To learn how to do this, see the Apache Spark documentation.
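
    For example, executor memory and garbage collector options can be passed through the spark.executor.memory and spark.executor.extraJavaOptions settings. The values below are illustrative only; pick sizes and GC flags that match your workload.

    ```shell
    # Illustrative values: more executor memory, G1GC with GC logging enabled.
    spark-submit \
      --conf spark.executor.memory=8g \
      --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -verbose:gc" \
      app.py
    ```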

Checking persisted RDDs

  1. Open the folder dashboard.

  2. Go to Managed Service for Apache Spark™.

  3. Click the cluster name.

  4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.

  5. In the top menu, navigate to Storage.

    The UI displays the list of persisted resilient distributed datasets (RDDs). For each RDD, it shows memory consumption, disk usage, and caching progress.

    To view details, click the RDD name.
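
    The caching progress shown on the Storage tab can be derived from the RDD storage endpoint: it is the share of an RDD's partitions that are currently cached. A sketch, with field names following the Apache Spark monitoring API and invented sample values:

    ```python
    # Sample records in the shape returned by
    # GET <history-server>/api/v1/applications/<app-id>/storage/rdd.
    # Field names follow the Apache Spark monitoring API; values are invented.
    RDDS = [
        {"id": 4, "name": "events", "numPartitions": 200,
         "numCachedPartitions": 200, "memoryUsed": 1_500_000_000,
         "diskUsed": 0},
        {"id": 7, "name": "users", "numPartitions": 100,
         "numCachedPartitions": 40, "memoryUsed": 300_000_000,
         "diskUsed": 90_000_000},
    ]

    def caching_progress(rdd):
        """Fraction of a persisted RDD's partitions currently cached."""
        return rdd["numCachedPartitions"] / rdd["numPartitions"]

    report = {r["name"]: caching_progress(r) for r in RDDS}
    print(report)
    ```

    A progress value well below 1.0, combined with non-zero disk usage, suggests the dataset does not fit in the memory available for caching.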

Checking the list of SQL queries and their plans

  1. Open the folder dashboard.

  2. Go to Managed Service for Apache Spark™.

  3. Click the cluster name.

  4. Under Advanced settings, select Spark History Server and click the link. This will open the Spark History Server window.

  5. In the top menu, navigate to SQL/DataFrame.

    The table lists executed SQL queries, including their start time and duration.

    To see the query plan, click the query text in the Description column. The query plan is displayed as a flowchart. To view it as text, click Details at the bottom of the figure.

    The query plan contains stats for each operator along with the number of completed tasks and their duration. If the query is still running, the current stats will be shown.
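
    The list on the SQL/DataFrame tab has a REST counterpart as well, which makes it easy to pull out queries that are still executing. A sketch; the field names follow the Apache Spark monitoring API, and the sample queries and values are invented.

    ```python
    # Sample records in the shape returned by
    # GET <history-server>/api/v1/applications/<app-id>/sql.
    # Field names follow the Apache Spark monitoring API; values are invented.
    QUERIES = [
        {"id": 0, "description": "select count(*) from events",
         "status": "COMPLETED", "duration": 4200},
        {"id": 1, "description": "create table daily_report as select 1",
         "status": "RUNNING", "duration": 61000},
    ]

    def running_queries(queries):
        """Descriptions of SQL queries that are still executing."""
        return [q["description"] for q in queries
                if q["status"] == "RUNNING"]

    print(running_queries(QUERIES))
    ```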
