© 2026 Direct Cursus Technology L.L.C.

In this article:

  • Creating a job
  • Cancel a job
  • Getting a list of jobs
  • Getting general info about a job
  • Getting job execution logs

Managing Hive jobs

Written by
Yandex Cloud
Updated at June 29, 2026

In a Yandex Data Processing cluster, you can manage jobs and receive execution logs for them. For examples of jobs, see Working with jobs.

Creating a job

Management console
  1. Open the folder dashboard.

  2. Navigate to Yandex Data Processing.

  3. Click the name of your cluster and select the Jobs tab.

  4. Click Submit job.

  5. Optionally, enter a name for the job.

  6. In the Job type field, select Hive.

  7. Optionally, in the Properties field, specify component properties as key-value pairs.

    If an argument, variable, or property consists of several space-separated parts, specify each part separately, preserving the order in which the arguments, variables, and properties are declared.

    For example, the -mapper mapper.py argument must be entered as two arguments, -mapper and mapper.py, in that order.

  8. Optionally, enable Continue on failure.

  9. Specify Script variables as key-value pairs.

  10. Optionally, specify the paths to JAR files, if any.

    File location                    Path format
    Instance file system             file:///<path_to_file>
    Distributed cluster file system  hdfs:///<path_to_file>
    Object Storage bucket            s3a://<bucket_name>/<path_to_file>
    Internet                         http://<path_to_file> or https://<path_to_file>

    Archives in standard Linux formats, such as zip, gz, xz, and bz2, are supported.

    The cluster service account needs read access to all the files in the bucket. For instructions on setting up access to Object Storage, see Editing a bucket ACL.

  11. Select a driver type and provide the corresponding input for the job:

    • List of queries to run.
    • Path to the file with the queries to run.
  12. Click Submit job.
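
The splitting rule from step 7 can be illustrated with a small shell sketch. Here, split_parts is a hypothetical helper, not part of any Yandex Cloud tooling; it relies on the shell's default whitespace splitting to print the separate parts in their original order, one per line, the way they would be entered one per field.

```shell
# Hypothetical helper: print each whitespace-separated part of $1
# on its own line, keeping the original order.
split_parts() {
    # Turn off filename expansion so characters such as * stay literal.
    set -f
    # Unquoted $1 is intentional: the shell splits it on whitespace.
    for part in $1; do
        printf '%s\n' "$part"
    done
    set +f
}

split_parts "-mapper mapper.py"
```

Each printed line corresponds to one separately specified part.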

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

To create a job:

  1. See the description of the CLI command for creating Hive jobs:

    yc dataproc job create-hive --help
    
  2. Create a job (the example does not illustrate all available parameters):

    yc dataproc job create-hive \
       --cluster-name=<cluster_name> \
       --name=<job_name> \
       --query-file-uri=<query_file_URI> \
       --script-variables=<list_of_values>
    

    Here, --script-variables is a comma-separated list of variable values.

    Provide the paths to the files required for the job in the following format:

    File location                    Path format
    Instance file system             file:///<path_to_file>
    Distributed cluster file system  hdfs:///<path_to_file>
    Object Storage bucket            s3a://<bucket_name>/<path_to_file>
    Internet                         http://<path_to_file> or https://<path_to_file>

    Archives in standard Linux formats, such as zip, gz, xz, and bz2, are supported.

    The cluster service account needs read access to all the files in the bucket. For instructions on setting up access to Object Storage, see Editing a bucket ACL.
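
The path formats in the table can be assembled mechanically. The sketch below is one possible way to do that in a shell script; file_uri and its location keywords (local, hdfs, s3) are hypothetical names chosen for this example, and the sample paths and bucket name are made up.

```shell
# Hypothetical helper: build a file URI in one of the formats from the table.
# Usage: file_uri <local|hdfs|s3> <path> [bucket_name]
file_uri() {
    kind=$1
    path=${2#/}    # drop one leading slash; the scheme prefix already supplies it
    case $kind in
        local) printf 'file:///%s\n' "$path" ;;
        hdfs)  printf 'hdfs:///%s\n' "$path" ;;
        s3)    printf 's3a://%s/%s\n' "$3" "$path" ;;
        *)     printf 'unknown location: %s\n' "$kind" >&2; return 1 ;;
    esac
}

file_uri local /home/user/queries.sql
file_uri hdfs /user/hive/queries.sql
file_uri s3 scripts/queries.sql my-bucket
```

The result of such a helper could then be passed to a flag like --query-file-uri.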

You can get the cluster ID and name with the list of clusters in the folder.
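
As a hedged illustration of the create-hive command shown earlier, the sketch below fills the placeholders with made-up values (my-cluster, daily-report, and the bucket path are assumptions). The run wrapper and submit_hive_job are hypothetical helpers: with DRY_RUN=1 the command is printed instead of executed, which makes it possible to review it first. Only flags that appear in the example above are used.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper around the documented command.
# $1 cluster name, $2 job name, $3 URI of the file with queries
submit_hive_job() {
    run yc dataproc job create-hive \
        --cluster-name="$1" \
        --name="$2" \
        --query-file-uri="$3"
}

# Dry run with illustrative values; unset DRY_RUN to actually submit the job.
DRY_RUN=1
submit_hive_job my-cluster daily-report s3a://my-bucket/queries/report.sql
```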

API

Call the create API method and provide the following in the request:

  • Cluster ID in the clusterId parameter. You can get it with the list of clusters in the folder.
  • Job name in the name parameter.
  • Job properties in the hiveJob parameter.
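
The three parameters listed above could be laid out as JSON roughly as follows. This is a sketch under assumptions: hive_job_request is a hypothetical helper, the queryFileUri field inside hiveJob is assumed by analogy with the --query-file-uri CLI flag, and whether clusterId is sent in the request body or in the URL depends on the API flavor, so check the API reference before relying on this layout.

```shell
# Hypothetical helper: print a JSON document with the parameters named above.
# Values are inserted as-is, so they must not contain double quotes
# or backslashes.
# $1 cluster ID, $2 job name, $3 URI of the file with queries
hive_job_request() {
    cat <<EOF
{
  "clusterId": "$1",
  "name": "$2",
  "hiveJob": {
    "queryFileUri": "$3"
  }
}
EOF
}

hive_job_request "<cluster_ID>" daily-report s3a://my-bucket/queries/report.sql
```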

Cancel a job

Note

You cannot cancel jobs with the ERROR, DONE, or CANCELLED status. To find out the job status, get the list of jobs in the cluster.
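
A script that cancels jobs can check the status first. The sketch below encodes only the rule from the note; can_cancel is a hypothetical helper, and RUNNING is used purely as an illustrative status that is not in the list of non-cancellable ones.

```shell
# Hypothetical helper: succeed unless the status is one of the three
# that the note lists as non-cancellable.
can_cancel() {
    case $1 in
        ERROR|DONE|CANCELLED) return 1 ;;
        *) return 0 ;;
    esac
}

# Illustrative statuses only.
for status in RUNNING DONE CANCELLED; do
    if can_cancel "$status"; then
        echo "$status: can be cancelled"
    else
        echo "$status: cannot be cancelled"
    fi
done
```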

Management console
  1. Open the folder dashboard.
  2. Navigate to Yandex Data Processing.
  3. Click the name of your cluster and select the Jobs tab.
  4. Click the job name.
  5. Click Cancel in the top-right corner of the page.
  6. In the window that opens, select Cancel.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

To cancel a job, run this command:

yc dataproc job cancel <job_name_or_ID> \
  --cluster-name=<cluster_name>

You can get the job ID and name with the list of jobs in the cluster, and the cluster name with the list of clusters in the folder.

API

Call the cancel API method and provide the following in the request:

  • Cluster ID in the clusterId parameter.
  • Job ID in the jobId parameter.

You can get the cluster ID with the list of clusters in the folder, and the job ID with the list of jobs in the cluster.

Getting a list of jobs

Management console
  1. Open the folder dashboard.
  2. Navigate to Yandex Data Processing.
  3. Click the name of your cluster and select the Jobs tab.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

To get a list of jobs, run the following command:

yc dataproc job list --cluster-name=<cluster_name>

You can get the cluster ID and name with the list of clusters in the folder.

API

Call the list API method, providing the cluster ID in the clusterId request parameter.

You can get the cluster ID with the list of clusters in the folder.

Getting general info about a job

Management console
  1. Open the folder dashboard.
  2. Navigate to Yandex Data Processing.
  3. Click the name of your cluster and select the Jobs tab.
  4. Click the job name.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

To get general info about a job, run this command:

yc dataproc job get \
   --cluster-name=<cluster_name> \
   --name=<job_name>

You can get the cluster ID and name with the list of clusters in the folder.

API

Call the get API method and provide the following in the request:

  • Cluster ID in the clusterId parameter. You can get it with the list of clusters in the folder.
  • Job ID in the jobId parameter. You can get it with the list of cluster jobs.

Getting job execution logs

Note

You can view the job logs and search data in them using Yandex Cloud Logging. For more information, see Working with logs.

Management console
  1. Open the folder dashboard.
  2. Navigate to Yandex Data Processing.
  3. Click the name of your cluster and select the Jobs tab.
  4. Click the job name.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

To get the job execution logs, run the following command:

yc dataproc job log \
   --cluster-name=<cluster_name> \
   --name=<job_name>

You can get the cluster ID and name with the list of clusters in the folder.

API

Call the listLog API method and provide the following in the request:

  • Cluster ID in the clusterId parameter. You can get it with the list of clusters in the folder.
  • Job ID in the jobId parameter. You can get it with the list of cluster jobs.
