Working with Docker images
You can configure the environment to run your code using Docker images.
Yandex DataSphere lets you create repositories of Docker images in a project and select an image for the project. The selected image will be used when running code in all project notebooks.
Creating a Docker image
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- In the top-right corner, click Create resource. In the pop-up window, select Docker.
- Fill out the fields below:
  - Build path: Path inside the project where the created Docker image will be stored. A single dot (.) denotes the root directory.
  - Docker name: Image name, e.g., tensorflow.
  - Tag: Image tag, e.g., 1.0.0.
  - Docker template: Template of the script used to install Python (python_3_7 or python_3_8).
  - Docker file: A set of instructions for creating a Docker image. Edit the contents of the field. For example, the following code will create a Docker image with python_3_8 based on the original TensorFlow image:

        FROM tensorflow/tensorflow:2.7.0-gpu
        RUN set -e \
          && useradd -ms /bin/bash --uid 1000 jupyter \
          && pip install --no-cache-dir --upgrade pip \
          && pip install --no-cache-dir nptyping==1.4.4 pandas==1.4.1 opencv-python-headless==4.5.5.62 scikit-learn==1.0.2 \
          && ln -s /usr/bin/python3 /usr/local/bin/python3

- (Optional) Under Docker Hub authentication data, enter your Docker Hub account username and password.
- Click Build.

This creates a Docker image with TensorFlow packages for GPU-based computations.
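If you maintain several images that differ only in the base image or the pinned packages, it can help to generate the Docker file text from one place and paste it into the Docker file field. The following is a minimal sketch under that assumption. The render_dockerfile helper and the PINS dictionary are hypothetical names introduced here for illustration; they are not part of DataSphere. The generated instructions mirror the TensorFlow example above.

```python
def render_dockerfile(base_image, pins, user="jupyter", uid=1000):
    """Return Dockerfile text that mirrors the TensorFlow example above.

    Hypothetical helper: DataSphere does not provide it. `pins` maps a
    package name to the exact version to install with pip.
    """
    packages = " ".join(f"{name}=={version}" for name, version in pins.items())
    steps = [
        "set -e",
        f"useradd -ms /bin/bash --uid {uid} {user}",
        "pip install --no-cache-dir --upgrade pip",
        f"pip install --no-cache-dir {packages}",
        "ln -s /usr/bin/python3 /usr/local/bin/python3",
    ]
    # Chain the shell steps into a single RUN instruction, one step per line.
    run = " \\\n  && ".join(steps)
    return f"FROM {base_image}\nRUN {run}\n"


# Versions copied from the example Docker file.
PINS = {
    "nptyping": "1.4.4",
    "pandas": "1.4.1",
    "opencv-python-headless": "4.5.5.62",
    "scikit-learn": "1.0.2",
}

dockerfile_text = render_dockerfile("tensorflow/tensorflow:2.7.0-gpu", PINS)
print(dockerfile_text)
```

Keeping the versions in a dictionary makes it easier to reuse the same pins when checking the environment later from a notebook.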
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, select Docker.
Tip
The Docker Hub
- Create a subnet.
- Create an egress NAT gateway.
- Create a service account with the vpc.user role.
- In the project settings, add the subnet and service account.
You can also use basic images from other libraries.
Applying a Docker image to a project
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, select Docker.
- Click the icon next to the desired image and select Activate.
- Open the project in JupyterLab and wait for it to load.
- Open the notebook tab and check that the custom image environment is available in your project. For example, for the TensorFlow image, create and run a cell with the following code:

      #!g1.1
      import tensorflow as tf
      tf.config.list_physical_devices('GPU')

  Result:

      ... [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
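Beyond the GPU check, you may also want to confirm that the packages pinned in the Docker file are the ones actually installed in the active environment. The following is a minimal sketch, assuming the image was built from the example Docker file above and uses python_3_8 (importlib.metadata is part of the standard library starting with Python 3.8). The find_mismatches helper and the PINS dictionary are hypothetical names introduced here, not part of DataSphere.

```python
from importlib import metadata


def find_mismatches(pins, get_version=metadata.version):
    """Compare pinned versions with the versions installed in this environment.

    Returns {package: (expected, found)} for every package whose installed
    version differs from the pin; `found` is None if it is not installed.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Versions copied from the example Docker file (an assumption about your image).
PINS = {
    "nptyping": "1.4.4",
    "pandas": "1.4.1",
    "opencv-python-headless": "4.5.5.62",
    "scikit-learn": "1.0.2",
}

print(find_mismatches(PINS) or "all pinned packages match")
```

An empty result suggests the notebook is running in the custom image; a list of mismatches usually means the default environment is still active or the image was built from a different Docker file.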
To return to the default environment:
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, select Docker.
- Click the icon next to the default image and select Activate.
Warning
The Python 3.7 system image does not work with g2.x (GPU A100) configurations.
Sharing a Docker image
Only a community admin can share a Docker image in the community.
To learn more about roles that apply in DataSphere, see Access management in DataSphere.
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Docker.
- Select the image from the list.
- Go to the Access tab.
- Enable the visibility option next to the name of the community to share the Docker image in.
To make a Docker image available for use in another project, the project administrator should add it to the Shared tab.
Deleting a Docker image
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Docker.
- In the list of Docker images, select the one you want to delete.
- Click the icon and select Delete.
- Click Confirm.
You will see a message saying that the resource has been deleted.