Working with Docker images
You can configure the environment to run your code using Docker images.
Yandex DataSphere lets you create repositories of Docker images in a project and select an image for the project. The selected image will be used when running code in all project notebooks.
Creating a Docker image
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- In the top-right corner, click Create resource. In the pop-up window, select Docker.
- Fill in the fields as follows:
  - Build path: Path inside the project where the created Docker image will be stored; . indicates the root directory.
  - Docker name: Image name, e.g., tensorflow.
  - Tag: Image tag, e.g., 1.0.0.
  - Docker template: Template of the script to install Python.
  - Docker file: A set of instructions for creating a Docker image. Edit the contents of the field. For example, the following code will create a Docker image with python_3_8 based on the original TensorFlow image:

    FROM tensorflow/tensorflow:2.7.0-gpu
    RUN set -e \
      && useradd -ms /bin/bash --uid 1000 jupyter \
      && pip install --no-cache-dir --upgrade pip \
      && pip install --no-cache-dir nptyping==1.4.4 pandas==1.4.1 opencv-python-headless==4.5.5.62 scikit-learn==1.0.2 \
      && ln -s /usr/bin/python3 /usr/local/bin/python3
- (Optional) Under Authentication data, enter your Docker Hub account username and password.
- Click Build.

This will create a Docker image with TensorFlow packages that can use the GPU for computations.
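If you maintain several images that differ only in the base image or the pinned packages, it may be convenient to generate the Docker file contents instead of editing them by hand. The following is a minimal sketch, not part of DataSphere: the helper name render_dockerfile is hypothetical, and the setup commands are simply copied from the TensorFlow example in the Docker file field above.

```python
from typing import Mapping


def render_dockerfile(base_image: str, packages: Mapping[str, str]) -> str:
    """Return Dockerfile text that installs pinned pip packages on top of base_image.

    Hypothetical helper: the commands mirror the example Docker file in this section.
    """
    # Pin every package to an exact version, e.g. "pandas==1.4.1".
    pins = " ".join(f"{name}=={version}" for name, version in packages.items())
    steps = [
        "set -e",
        "useradd -ms /bin/bash --uid 1000 jupyter",
        "pip install --no-cache-dir --upgrade pip",
        f"pip install --no-cache-dir {pins}",
        "ln -s /usr/bin/python3 /usr/local/bin/python3",
    ]
    # Chain the steps into a single RUN instruction, one step per line.
    run = " \\\n  && ".join(steps)
    return f"FROM {base_image}\nRUN {run}\n"


if __name__ == "__main__":
    print(
        render_dockerfile(
            "tensorflow/tensorflow:2.7.0-gpu",
            {
                "nptyping": "1.4.4",
                "pandas": "1.4.1",
                "opencv-python-headless": "4.5.5.62",
                "scikit-learn": "1.0.2",
            },
        )
    )
```

You can paste the printed text into the Docker file field. Keeping all steps in one RUN instruction matches the original example.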
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, select Docker.
Tip
The Docker Hub
- Create a subnet.
- Create an egress NAT gateway.
- Create a service account with the vpc.user role.
- In the project settings, add the subnet and service account.
You can also use base images from other libraries.
Applying a Docker image to a project
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, select Docker.
- Next to the image you need, click the options icon and select Activate.
- Open the project in JupyterLab and wait for it to load.
- Open the notebook tab and check that the custom image environment is available in your project. For example, for the TensorFlow image, create and run a cell with the following code:

    #!g1.1
    import tensorflow as tf
    tf.config.list_physical_devices('GPU')

  Result:

    ... [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
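Besides the GPU check, you may want to confirm that the packages pinned in the Docker file are the ones actually installed in the active environment. The sketch below uses only the Python standard library (importlib.metadata, available from Python 3.8). The function name check_versions and the EXPECTED mapping are illustrative assumptions: the mapping repeats the pins from the example Docker file and should be adjusted to match your own image.

```python
from importlib import metadata
from typing import Dict, Mapping, Optional

# Versions pinned in the example Docker file; adjust to match your own image.
EXPECTED = {
    "nptyping": "1.4.4",
    "pandas": "1.4.1",
    "opencv-python-headless": "4.5.5.62",
    "scikit-learn": "1.0.2",
}


def check_versions(expected: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Return the packages whose installed version differs from the pin.

    Each mismatched package name maps to its installed version,
    or to None if the package is not installed at all.
    """
    mismatches: Dict[str, Optional[str]] = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    problems = check_versions(EXPECTED)
    if not problems:
        print("All pinned packages match.")
    for name, installed in problems.items():
        print(f"{name}: expected {EXPECTED[name]}, found {installed or 'not installed'}")
```

An empty result suggests the custom image is active. A list of mismatches usually means the notebook is still running in the default environment or the image was built from a different Docker file.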
To return to the default environment:
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, select Docker.
- Next to the default image, click the options icon and select Activate.
Sharing a Docker image
Only a community admin can share a Docker image in the community.
To learn more about roles that apply in DataSphere, see Access management in DataSphere.
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Docker.
- Select the image from the list.
- Go to the Access tab.
- Enable the visibility option next to the name of the community to share the Docker image in.
To make a Docker image available for use in another project, the project administrator should add it to the Shared tab.
Deleting a Docker image
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click Docker.
- In the list of Docker images, select the one you want to delete.
- Click the options icon and select Delete.
- Click Confirm.

You will see a message saying that the resource has been deleted.
Warning
The actual deletion of resources can take up to 72 hours.