Working with Docker images
You can configure the environment for running your code using Docker images.
Yandex DataSphere enables you to create repositories of Docker images and select an image to use in a project. The selected image will be used when running code in all project notebooks.
Creating a Docker image
- Select the project in your community or on the DataSphere home page in the Recent projects tab.
- In the top-right corner, click Create resource. In the pop-up window, select Docker.
- Fill in the fields as follows:
  - Project storage mounting point: Path to the project directory where the created Docker image will reside; a period (.) indicates the root directory.
  - Docker name: Image name, e.g., tensorflow.
  - Tag: Image tag, e.g., 1.0.0.
  - Docker template: Template of the script to install Python.
  - Docker file: Instructions for creating a Docker image. Edit the contents of the field. For example, the following code will create a Docker image with python_3_8 based on the original TensorFlow image:

        FROM tensorflow/tensorflow:2.7.0-gpu
        RUN set -e \
          && useradd -ms /bin/bash --uid 1000 jupyter \
          && pip install --no-cache-dir --upgrade pip \
          && pip install --no-cache-dir nptyping==1.4.4 pandas==1.4.1 opencv-python-headless==4.5.5.62 scikit-learn==1.0.2 \
          && ln -s /usr/bin/python3 /usr/local/bin/python3
- Optionally, under Authentication data, enter your Docker Hub account username and password.
- Click Build.
  This will create a Docker image with TensorFlow packages to enable GPU-based computations.
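When adapting the example Dockerfile to a different base image or package set, it can help to generate the text rather than edit it by hand. The following is a minimal sketch, not a DataSphere API: render_dockerfile, its parameters, and the pins dictionary are hypothetical names introduced for illustration. It only reproduces the structure of the example above (one FROM line and a single RUN layer chained with &&).

```python
from typing import Dict


def render_dockerfile(base_image: str, pins: Dict[str, str], uid: int = 1000) -> str:
    """Return Dockerfile text that mirrors the structure of the example above.

    base_image, pins, and uid are illustrative inputs, not DataSphere parameters.
    """
    # Pin every package to an exact version so rebuilds stay reproducible.
    requirements = " ".join(f"{name}=={version}" for name, version in pins.items())
    steps = [
        "set -e",
        f"useradd -ms /bin/bash --uid {uid} jupyter",
        "pip install --no-cache-dir --upgrade pip",
        f"pip install --no-cache-dir {requirements}",
        "ln -s /usr/bin/python3 /usr/local/bin/python3",
    ]
    # Chain the commands with && so they all run in a single RUN layer.
    run = " \\\n  && ".join(steps)
    return f"FROM {base_image}\nRUN {run}\n"


dockerfile = render_dockerfile(
    "tensorflow/tensorflow:2.7.0-gpu",
    {
        "nptyping": "1.4.4",
        "pandas": "1.4.1",
        "opencv-python-headless": "4.5.5.62",
        "scikit-learn": "1.0.2",
    },
)
print(dockerfile)
```

The printed text can be pasted into the Docker file field. Whether a given base image and package combination actually builds still has to be checked by running the build.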
To open the list of Docker images in the project:
- Select the project in your community or on the DataSphere home page in the Recent projects tab.
- Under Project resources, select Docker.
Tip
The Docker Hub
- Create a subnet.
- Create an egress NAT gateway.
- Create a service account with the vpc.user role.
- In the project settings, add the subnet and the service account you created.
You can also use base images from other libraries.
Applying a Docker image to a project
- Select the project in your community or on the DataSphere home page in the Recent projects tab.
- Under Project resources, select Docker.
- Next to the image in question, open the options menu and select Activate.
- Open the project in JupyterLab and wait for it to load.
- Open the notebook tab and make sure the custom image environment is available in your project. For example, for the TensorFlow image, create and run a cell with the following code:

      #!g1.1
      import tensorflow as tf
      tf.config.list_physical_devices('GPU')

  Result:

      ...
      [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
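Beyond the GPU check, you may also want to confirm that the packages pinned in the example Dockerfile are the ones the notebook actually sees. The cell below is a minimal sketch under that assumption: PINS copies the versions from the example Dockerfile, and find_mismatches is a hypothetical helper name, not part of DataSphere. It relies only on the standard importlib.metadata module (Python 3.8 and later).

```python
from importlib import metadata

# Versions pinned in the example Dockerfile from this page.
PINS = {
    "nptyping": "1.4.4",
    "pandas": "1.4.1",
    "opencv-python-headless": "4.5.5.62",
    "scikit-learn": "1.0.2",
}


def find_mismatches(pins, get_version=metadata.version):
    """Map each package whose installed version differs from its pin
    to an (expected, found) pair; found is None if it is not installed."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # the package is missing from this environment
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


print(find_mismatches(PINS) or "All pinned packages match")
```

An empty result suggests the custom image is active. Any listed mismatch usually means the notebook is still running in the default environment or the image was built from a different Dockerfile.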
To return to the default environment, follow these steps:
- Select the project in your community or on the DataSphere home page in the Recent projects tab.
- Under Project resources, select Docker.
- Next to the default image, open the options menu and select Activate.
Sharing a Docker image
Only a community admin can share a Docker image in the community.
To learn more about roles in DataSphere, see Access management in DataSphere.
- Select the project in your community or on the DataSphere home page in the Recent projects tab.
- Under Project resources, click Docker.
- Select the image from the list.
- Go to the Access tab.
- Enable the visibility option next to the name of the community you want to share the Docker image in.
To make a Docker image available for use in a different project, the project admin needs to add that image on the Shared tab.
Deleting a Docker image
- Select the project in your community or on the DataSphere home page in the Recent projects tab.
- Under Project resources, click Docker.
- In the list of Docker images, select the one you want to delete.
- Open the options menu and select Delete.
- Click Confirm.
You will see a message saying that the resource has been deleted.
Warning
The resource may take up to 72 hours to be actually deleted.