Managing Spark jobs

Written by Yandex Cloud
Updated on October 20, 2025

In this article:

  • Creating a job
  • Cancel a job
  • Get a list of jobs
  • Get general info about a job
  • Get job execution logs

Creating a job

Warning

Once created, the job will run automatically.

To create a job:

Management console
gRPC API
  1. Navigate to the folder dashboard and select Managed Service for Apache Spark™.

  2. Click the name of your cluster and open the Jobs tab.

  3. Click Create job.

  4. Enter the job name.

  5. In the Job type field, select Spark.

  6. In the Main jar field, specify the path to the application's main JAR file in one of the following formats (see the examples after this list):

    File location          Path format
    Instance file system   file:///<file_path>
    Object Storage bucket  s3a://<bucket_name>/<file_path>
    Internet               http://<path_to_file> or https://<path_to_file>

    Archives in standard Linux formats, such as zip, gz, xz, and bz2, are supported.

    The cluster service account needs read access to all the files in the bucket. For step-by-step guides on setting up access to Object Storage, see Editing a bucket ACL.

  7. In the Main class field, specify the name of the main application class.

  8. Specify job arguments.

    If an argument, variable, or property consists of several space-separated parts, specify each part as a separate argument, preserving the order in which the arguments, variables, and properties are declared.

    For example, the -n 1000 argument must be split into two arguments, -n and 1000, in that order.

  9. Optionally, specify the paths to JAR files, if any.

  10. Optionally, configure advanced settings:

    • Specify paths to the required files and archives.
    • In the Properties field, specify component properties as key-value pairs.
    • Specify the coordinates of included and excluded Maven packages as well as URLs of additional repositories for package search.
  11. Click Submit job.
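
For example, the following are valid paths for the Main jar field in each of the three formats; the file, bucket, and host names below are hypothetical:

    file:///opt/jobs/word-count.jar
    s3a://my-bucket/jobs/word-count.jar
    https://example.com/jobs/word-count.jar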

  1. Get an IAM token for API authentication and save it as an environment variable:

    export IAM_TOKEN="<IAM_token>"
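    # Alternatively, if the Yandex Cloud CLI is installed and initialized,
    # you can issue a token with it (this assumes an existing CLI profile):
    export IAM_TOKEN=$(yc iam create-token)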
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

  3. Use the JobService.Create call and send the following request, e.g., via gRPCurl:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/spark/v1/job_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
               "cluster_id": "<cluster_ID>",
               "name": "<job_name>",
               "spark_job": {
                 "args": [
                   <list_of_arguments>
                 ],
                 "jar_file_uris": [
                   <list_of_paths_to_JAR_files>
                 ],
                 "file_uris": [
                   <list_of_paths_to_files>
                 ],
                 "archive_uris": [
                   <list_of_paths_to_archives>
                 ],
                 "properties": {
                   <list_of_properties>
                 },
                 "main_jar_file_uri": "<path_to_main_JAR_file>",
                 "main_class": "<main_class_name>",
                 "packages": [
                   <list_of_package_Maven_coordinates>
                 ],
                 "repositories": [
                   <URLs_of_repositories_for_package_search>
                 ],
                 "exclude_packages": [
                   <list_of_Maven_coordinates_of_excluded_packages>
                 ]
               }
           }' \
        spark.api.cloud.yandex.net:443 \
        yandex.cloud.spark.v1.JobService.Create
    

    Where:

    • name: Spark job name.

    • spark_job: Spark job parameters:

      • args: Job arguments.

      • jar_file_uris: Paths to JAR files.

      • file_uris: Paths to files.

      • archive_uris: Paths to archives.

      • properties: Component properties as key:value pairs.

      • main_jar_file_uri: Path to the application's main JAR file in the following format:

        File location          Path format
        Instance file system   file:///<file_path>
        Object Storage bucket  s3a://<bucket_name>/<file_path>
        Internet               http://<path_to_file> or https://<path_to_file>

        Archives in standard Linux formats, such as zip, gz, xz, and bz2, are supported.

        The cluster service account needs read access to all the files in the bucket. For step-by-step guides on setting up access to Object Storage, see Editing a bucket ACL.

      • main_class: Main class name.

      • packages: Maven coordinates of the JAR files in groupId:artifactId:version format.

      • repositories: URLs of additional repositories for package search.

      • exclude_packages: Maven coordinates of the packages to exclude, in groupId:artifactId format.

    You can get the cluster ID with the list of clusters in the folder.

  4. View the server response to make sure your request was successful.
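
For reference, here is what a populated request body might look like; the cluster ID, bucket, class name, property value, and package coordinates below are all hypothetical:

    {
        "cluster_id": "c9q8ml85r1oh********",
        "name": "word-count",
        "spark_job": {
            "args": ["-n", "1000"],
            "main_jar_file_uri": "s3a://my-bucket/jobs/word-count.jar",
            "main_class": "com.example.WordCount",
            "properties": {
                "spark.executor.memory": "4g"
            },
            "packages": ["org.postgresql:postgresql:42.7.3"]
        }
    }

Fields you do not need can simply be omitted from the request.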

Cancel a job

Note

You cannot cancel jobs with the ERROR, DONE, or CANCELLED status. To find out a job's status, retrieve a list of jobs in the cluster.

Management console
CLI
gRPC API
  1. Navigate to the folder dashboard and select Managed Service for Apache Spark™.
  2. Click the name of your cluster and open the Jobs tab.
  3. Click the job name.
  4. Click Cancel in the top-right corner of the page.
  5. In the window that opens, select Cancel job.

If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

To cancel a job, do the following:

  1. View the description of the CLI command for canceling a job:

    yc managed-spark job cancel --help
    
  2. Cancel a job by running this command:

    yc managed-spark job cancel <job_name_or_ID> \
      --cluster-id <cluster_ID>
    

    You can get the cluster ID with the list of clusters in the folder.

    You can get the job name and ID with the list of cluster jobs.
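
    For example (the IDs below are masked placeholders, following the convention used elsewhere in this guide):

    yc managed-spark job cancel c9q9veov4uql******** \
      --cluster-id c9q8ml85r1oh********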

  1. Get an IAM token for API authentication and save it as an environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

  3. Use the JobService.Cancel call and send the following request, e.g., via gRPCurl:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/spark/v1/job_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
               "cluster_id": "<cluster_ID>",
               "job_id": "<job_ID>"
           }' \
        spark.api.cloud.yandex.net:443 \
        yandex.cloud.spark.v1.JobService.Cancel
    

    You can get the cluster ID with the list of clusters in the folder, and the job ID with the list of cluster jobs.

  4. View the server response to make sure your request was successful.

Get a list of jobs

Management console
CLI
gRPC API
  1. Navigate to the folder dashboard and select Managed Service for Apache Spark™.
  2. Click the name of your cluster and open the Jobs tab.

If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

To get a list of cluster jobs:

  1. See the description of the CLI command for getting a list of jobs:

    yc managed-spark job list --help
    
  2. Get the list of jobs by running this command:

    yc managed-spark job list \
      --cluster-id <cluster_ID>
    

    You can get the cluster ID with the list of clusters in the folder.
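
    If you use the list in scripts, you can extract just the job IDs and statuses. The sketch below assumes jq is installed and that each entry in the command's JSON output exposes id and status fields; check the actual output before relying on it:

    yc managed-spark job list \
      --cluster-id <cluster_ID> \
      --format json | jq -r '.[] | "\(.id)\t\(.status)"'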

  1. Get an IAM token for API authentication and save it as an environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

  3. Use the JobService.List call and send the following request, e.g., via gRPCurl:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/spark/v1/job_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
               "cluster_id": "<cluster_ID>"
           }' \
        spark.api.cloud.yandex.net:443 \
        yandex.cloud.spark.v1.JobService.List
    

    You can get the cluster ID with the list of clusters in the folder.

  4. View the server response to make sure your request was successful.

Get general info about a job

Management console
CLI
gRPC API
  1. Navigate to the folder dashboard and select Managed Service for Apache Spark™.
  2. Click the name of your cluster and open the Jobs tab.
  3. Click the job name.

If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

To get information about a job:

  1. View the description of the CLI command for getting information about a job:

    yc managed-spark job get --help
    
  2. Get information about the job by running this command:

    yc managed-spark job get <job_ID> \
      --cluster-id <cluster_ID>
    

    You can get the cluster ID with the list of clusters in the folder.

    You can get the job ID with the list of cluster jobs.
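
    If a script needs to wait for a job to finish, it can poll this command. The sketch below assumes jq is installed, that the JSON output has a top-level status field, and that DONE, ERROR, and CANCELLED are the terminal statuses (see the cancellation note above):

    while true; do
      # Query the current job status (field name assumed; verify against real output)
      status=$(yc managed-spark job get <job_ID> \
        --cluster-id <cluster_ID> \
        --format json | jq -r '.status')
      echo "Current status: $status"
      case "$status" in
        DONE|ERROR|CANCELLED) break ;;
      esac
      sleep 10
    done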

  1. Get an IAM token for API authentication and save it as an environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

  3. Use the JobService.Get call and send the following request, e.g., via gRPCurl:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/spark/v1/job_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
               "cluster_id": "<cluster_ID>",
               "job_id": "<job_ID>"
           }' \
        spark.api.cloud.yandex.net:443 \
        yandex.cloud.spark.v1.JobService.Get
    

    You can get the cluster ID with the list of clusters in the folder, and the job ID with the list of cluster jobs.

  4. View the server response to make sure your request was successful.

Get job execution logs

Warning

To get job execution logs, enable logging when creating the cluster.

Management console
CLI
gRPC API
  1. Navigate to the folder dashboard and select Managed Service for Apache Spark™.
  2. Click the name of your cluster and open the Jobs tab.
  3. Click the job name.
  4. In the Output logs field, click the link.

If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

To get job execution logs:

  1. See the description of the CLI command for getting job logs:

    yc managed-spark job log --help
    
  2. Get job logs by running this command:

    yc managed-spark job log <job_ID> \
      --cluster-id <cluster_ID>
    

    You can get the cluster ID with the list of clusters in the folder.

    You can get the job ID with the list of cluster jobs.

    To get logs for multiple jobs, list their IDs separated by spaces, e.g.:

    yc managed-spark job log c9q9veov4uql******** c9qu8uftedte******** \
      --cluster-id c9q8ml85r1oh********
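
    To keep the logs for later analysis, you can redirect the command output to a file, e.g.:

    yc managed-spark job log c9q9veov4uql******** \
      --cluster-id c9q8ml85r1oh******** > job.log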
    
  1. Get an IAM token for API authentication and save it as an environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

  3. Use the JobService.ListLog call and send the following request, e.g., via gRPCurl:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/spark/v1/job_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
               "cluster_id": "<cluster_ID>",
               "job_id": "<job_ID>"
           }' \
        spark.api.cloud.yandex.net:443 \
        yandex.cloud.spark.v1.JobService.ListLog
    

    You can get the cluster ID with the list of clusters in the folder, and the job ID with the list of cluster jobs.

  4. View the server response to make sure your request was successful.
