
DataSphere CLI

Written by
Yandex Cloud
Updated at January 23, 2025
  • DataSphere CLI commands
    • Running jobs
    • Viewing job information
    • Canceling a job
    • Setting job data lifetime
    • Generating job environment parameters
    • Getting a list of community projects
    • Getting information about a project
    • Viewing DataSphere CLI version
    • Viewing the DataSphere CLI changelog
  • Job logs

You run DataSphere Jobs jobs using the DataSphere CLI utility.

To install the DataSphere CLI, run the following command in a Python virtual environment:

pip install datasphere
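
For example, to install the utility into a fresh virtual environment (the .venv directory name is arbitrary):

python3 -m venv .venv
source .venv/bin/activate
pip install datasphere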

Once the installation is complete, you can view the help by running the command with the -h flag:

datasphere -h

The result will be as follows:

usage: datasphere [-h] [-t TOKEN] [-l {ERROR,WARNING,INFO,DEBUG}] [--log-config LOG_CONFIG] [--log-dir LOG_DIR] [--profile PROFILE] {version,changelog,project,generate-requirements} ...

positional arguments:
  {version,changelog,project,generate-requirements}
    version             Show version
    changelog           Show changelog
    generate-requirements
                        Generate requirements for specified root module(s)

options:
  -h, --help            show this help message and exit
  -t TOKEN, --token TOKEN
                        YC OAuth token, see https://yandex.cloud/en/docs/iam/concepts/authorization/oauth-token
  -l {ERROR,WARNING,INFO,DEBUG}, --log-level {ERROR,WARNING,INFO,DEBUG}
                        Logging level
  --log-config LOG_CONFIG
                        Custom logging config
  --log-dir LOG_DIR     Logs directory (temporary directory by default)
  --profile PROFILE     `yc` utility profile

DataSphere CLI commands

Use the following commands to manage jobs and the utility:

  • Running and restoring a job session.
  • Viewing job information.
  • Canceling a job.
  • Setting job data lifetime.
  • Generating job environment parameters.
  • Getting a list of community projects.
  • Getting information about a project.
  • Viewing DataSphere CLI version.
  • Viewing the DataSphere CLI changelog.

Running jobs

To run a job, run the following command:

datasphere project job execute -p <project_ID> -c <configuration_file>

Where:

  • <project_ID>: ID of the DataSphere project in which you are going to run the job.
  • <configuration_file>: Path to the job configuration file.
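
For reference, a minimal configuration file could look like the sketch below. This is an illustrative example only: main.py, data.csv, and result.txt are placeholder names, and the full schema is described in the DataSphere Jobs documentation. The env: python: auto setting asks the utility to detect the Python environment automatically.

name: my-job
desc: Example job
cmd: python3 main.py --data ${DATA} --result ${RESULT}
env:
  python: auto
inputs:
  - data.csv: DATA
outputs:
  - result.txt: RESULT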

To run jobs under a service account, authenticate in the Yandex Cloud CLI as that service account and add it to the DataSphere project's member list with the datasphere.community-projects.developer role. If you run jobs from a Yandex Compute Cloud VM, link the service account to the VM.
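
As a sketch, setting up a yc CLI profile for a service account and pointing the DataSphere CLI at it could look like this (the datasphere-sa profile name and key.json file name are arbitrary placeholders):

yc iam key create --service-account-name <service_account_name> --output key.json
yc config profile create datasphere-sa
yc config set service-account-key key.json
datasphere --profile datasphere-sa project job execute -p <project_ID> -c <configuration_file>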

Running a job locks the shell session until the job completes. The job's output is written to the standard output (stdout) and standard error (stderr) streams, while the job execution system logs are written to a separate file in the user's working directory.

If the shell session is interrupted during job execution, the job will continue to run in DataSphere, but the execution logs will not be saved. To resume logging, restore the session by running the following command:

datasphere project job attach --id <job_ID>

You can find out the job ID in the DataSphere UI under the DataSphere Jobs tab on the project page.

Tracking and logging will resume after the job session is restored.

To rerun a completed job, use the fork command, passing the ID of the job to rerun:

datasphere project job fork <job_ID>

Viewing job information

You can view all past and current project jobs by running the following command:

datasphere project job list -p <project_ID>

The response will return a table with the following fields:

  • Job ID.
  • Name.
  • Description.
  • Status.
  • Job start and end date (if already completed).
  • Name of the user who ran the job.

To view information about a specific job, run the following command:

datasphere project job get --id <job_ID>

Canceling a job

You can stop and cancel a job in two ways:

  1. If you have a shell session running with a job in progress, press Ctrl + C.

  2. If you want to stop a job that is not associated with an active shell session, run the following command:

    datasphere project job cancel --id <job_ID>
    

The running job will be stopped.

Setting job data lifetime

You can set the job data lifetime by running the command below:

datasphere project job set-data-ttl --id <job_ID> --days <number_of_days>

Where --days is the number of days after which the job data will be deleted (14 days by default).
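
For example, to set the job data lifetime to two days:

datasphere project job set-data-ttl --id <job_ID> --days 2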

Generating job environment parameters

To generate the environment parameters for your job, run the following command:

datasphere generate-requirements <root_module>

Where <root_module> is the job root module.

The command generates a requirements.txt file listing the environment parameters for the specified module. You can use this list in the job configuration file to explicitly specify dependencies.
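
For instance, if the job entry point is main.py (a placeholder name), you can generate the dependency list and review it before referencing it from the configuration file:

datasphere generate-requirements main.py
cat requirements.txt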

Getting a list of community projects

To view all the community projects, run this command:

datasphere project list -c <community_ID>

The response will return a table with the following fields:

  • Project ID
  • Project name
  • Community ID

Getting information about a project

To view information about a specific project, run the following command:

datasphere project get --id <project_ID>

The response will return a table with the following fields:

  • Project ID
  • Project name
  • Community ID

Viewing DataSphere CLI version

To view the current DataSphere CLI version, run this command:

datasphere version

Note

Each time you use DataSphere CLI, a version check is performed. If a new version is available, the utility will notify you. To avoid compatibility issues, upgrade DataSphere CLI as new versions become available.

Viewing the DataSphere CLI changelog

To view the changes in the current DataSphere CLI version, run this command:

datasphere changelog

Job logs

When you run a job through the DataSphere CLI, the utility first prints the path in the user's working directory where the logs will be saved. For example:

2024-05-16 12:42:35,447 - [INFO] - logs file path: C:\Temp\datasphere\job_2024-05-16T12-42-35.427056

After running the job, you can find the following files in the user's working directory:

  • stdout.txt: Standard output stream of the user program.
  • stderr.txt: Standard error message stream.
  • system.log: System log of the VM configuration and environment package installation.
  • log.txt: General DataSphere CLI log which records the progress of the job.
  • docker_stats.tsv: Log of the resources consumed by the Docker image, such as utilized CPU power, read and write speeds, used RAM, and boot speed. You can also get this information by running the docker stats command.
  • gpu_stats.tsv: Log of GPU utilization, which includes the number of cores, utilized power, and video memory.
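
The stats files are tab-separated, so you can skim them with standard shell tools, for example (in bash):

column -t -s $'\t' docker_stats.tsv | head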

To change the directory for storing logs, pass the global --log-dir option before the subcommand. For example:

datasphere --log-dir <new_directory> project job execute -p <project_ID> -c <configuration_file>

You can download your job results by running this command:

datasphere project job download-files --id <job_ID>

See also

  • Running jobs in DataSphere Jobs
  • Using results of completed jobs
