Solving problems with dependencies in Yandex DataSphere
By default, DataSphere already contains popular machine learning packages and libraries. Library versions depend on the system image specified in the project settings. For the full list of installed packages, see List of pre-installed software.
Tip
If your project uses multiple libraries and you encounter conflicts with pre-installed libraries, see Building your own Docker image. In DataSphere, usage of the virtual environment and console is limited.
What are problems with dependencies?
Conflicts of library versions occur when two packages require different versions of the same library. Non-matching library versions make package installation more difficult and may cause errors when running the code.
If the packages you need are missing in the standard DataSphere image, install them manually.
Some packages depend on the system libraries that you cannot install in DataSphere due to the restriction on using sudo and apt. In such cases, you will have to find workarounds.
If you encounter errors when installing or using packages, this may be due to conflicting versions of the dependent libraries. These errors include such messages as ModuleNotFoundError (missing module) or VersionConflict (version incompatibility).
When an error occurs during package installation, pip usually outputs a detailed message specifying the reason. For example, in case of a version conflict, pip will specify the package and version that have caused the problem. Analyze these messages to understand the root cause of the problem.
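Alongside the messages from pip, it can help to check which versions are actually installed in the current environment. The following is a minimal sketch using the standard importlib.metadata module; the helper name installed_version and the package names in the loop are illustrative only and do not come from DataSphere itself.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# The package names below are only examples; replace them with your own.
for name in ("seaborn", "numpy", "pandas"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Comparing this output with the versions named in the pip message usually shows which package needs to be pinned, updated, or removed.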
Installing, deleting, or updating packages
To avoid conflicts, you can install a specific version of a package. For example, to install seaborn 0.11.1, use the following command:
%pip install seaborn==0.11.1
Updating packages to the latest versions may sometimes help you solve problems with dependencies. To update a package, use this command:
%pip install --upgrade <package_name>
If you do not need the conflicting package, you can delete it:
%pip uninstall <package_name>
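To avoid conflicts, specify the minimum required versions of the packages. Enclose the requirement in quotes so that the > character is not interpreted as output redirection:
%pip install "<package_name>>=<minimum_version>"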
Note
After installing, updating, or deleting a package, restart the JupyterLab kernel. To do this, click Kernel → Restart Kernel in the top panel of the project window.
Using dependency files
With a requirements.txt dependency file, you can list all the packages a project requires together with their versions. This simplifies dependency installation on other systems and helps you avoid issues when migrating the environment between systems.
To install packages and libraries listed in the requirements.txt file located in the project root, run this command:
%pip install -r requirements.txt
To save the list of installed libraries to the requirements.txt file, run this command:
%pip freeze > requirements.txt
Note
If you want to deploy the environment from the dependency file created in DataSphere on another platform, delete the system packages installed via @.
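The cleanup described in the note can be scripted. The sketch below assumes that the entries in question are the lines pip freeze writes in the form <package_name> @ <location>; the function name drop_direct_references and the sample entries are hypothetical. Review the result before reusing it on another platform.

```python
def drop_direct_references(lines):
    """Keep only requirement lines that are not tied to a location via '@'."""
    kept = []
    for line in lines:
        stripped = line.strip()
        # Assumption: entries to remove look like "<name> @ <location>".
        if " @ " in stripped and not stripped.startswith("#"):
            continue
        kept.append(line)
    return kept


sample = [
    "numpy==1.24.4\n",
    "example-system-package @ file:///tmp/build/example\n",  # hypothetical entry
    "seaborn==0.11.1\n",
]
print("".join(drop_direct_references(sample)))
```

To apply this to a real file, read requirements.txt with readlines(), pass the list to the function, and write the returned lines to a new file.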
Using external repositories
If the package you need is not available in PyPI, you can install it directly from a repository, e.g., GitHub:
%pip install git+https://github.com/username/repository
You can specify a particular branch or commit to install:
%pip install git+https://github.com/username/repository@branch_name
Building your own Docker image
Your own Docker image will allow you to set up an environment with the dependencies and tools you need, accelerate setting up new projects, and ensure stability of the environment. When creating your own Docker image, you can:
- Use your own image prepared in advance.
- Use clean Python images without pre-installed dependencies.
- Install tools via apt.
- Use library and driver versions that are different from the versions pre-installed in DataSphere, e.g., install another CUDA version.
- Quickly install large libraries or download files.
To learn how to build your own image, see Working with Docker images.
Known issues
ModuleNotFoundError
ModuleNotFoundError occurs when a package is not installed. To make sure the package is installed, run %pip install <package_name>. If you have just installed the package, restart the JupyterLab kernel.
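Before reinstalling, you can check whether the running kernel is able to locate the module at all. This is a minimal sketch based on the standard importlib.util.find_spec function; the helper name is_importable is illustrative, and it is meant for top-level module names only. Keep in mind that the import name may differ from the package name: for example, the scikit-learn package is imported as sklearn.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found by the current kernel."""
    return importlib.util.find_spec(module_name) is not None


# "json" is part of the standard library; "seaborn" is only an example.
for module in ("json", "seaborn"):
    print(module, is_importable(module))
```

If the check returns False right after a successful installation, the kernel most likely needs a restart.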
VersionConflict
The VersionConflict error occurs when incompatible package versions are installed in the system. Check packages and install compatible versions:
%pip install <package_name>==<required_version>
After reinstalling the package, restart the JupyterLab kernel.
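To decide which version is compatible, it may help to look at the requirements a package declares for its dependencies. The sketch below reads them with the standard importlib.metadata module; the helper name declared_requirements is hypothetical, and seaborn is used only as an example.

```python
from importlib import metadata


def declared_requirements(distribution_name):
    """Return the requirement strings a package declares, or [] if none or not installed."""
    try:
        return metadata.requires(distribution_name) or []
    except metadata.PackageNotFoundError:
        return []


# Prints nothing if the package is not installed or declares no requirements.
for requirement in declared_requirements("seaborn"):
    print(requirement)
```

Each printed line names a dependency and the version range the package accepts, which you can compare against the versions installed in your project.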
Could not find a version that satisfies the requirement
The Could not find a version that satisfies the requirement error may occur if you specified a non-existing package version or the package is not available in the repository. Make sure the package name and version are correct and try installing another version.
What else you can do
If you have encountered an error you cannot solve, contact support.