Connecting to an S3 storage
On your project page in the DataSphere interface, you can manage your connection to an S3 object storage by using the S3 Connector resource.
To connect to the object storage from the notebook code, follow this guide: Connecting to S3 using the boto3 library.
Note
Avoid using S3 storage in FUSE
Getting started
Get an access key from your S3 storage provider. To do this in Yandex Object Storage, follow these steps:
- Create a service account.
- Assign a role to the created account allowing either reads only or both reads and writes.
- Create an access key for the service account.
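Before creating the connector, you can check that the key works by connecting to the storage from notebook code with boto3, as the guide linked above describes. Below is a minimal sketch; the key values and bucket name are placeholders you need to substitute:

```python
import boto3
from botocore.exceptions import ClientError

# Placeholders: substitute the static access key of your service account
# and the name of your bucket.
ACCESS_KEY_ID = "<static access key ID>"
SECRET_ACCESS_KEY = "<secret part of the static access key>"
BUCKET = "<bucket name>"

# Object Storage is S3-compatible, so boto3 only needs the endpoint override.
s3 = boto3.client(
    "s3",
    endpoint_url="https://storage.yandexcloud.net",
    aws_access_key_id=ACCESS_KEY_ID,
    aws_secret_access_key=SECRET_ACCESS_KEY,
)

try:
    # List a few objects to confirm the key grants at least read access.
    response = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=5)
    for obj in response.get("Contents", []):
        print(obj["Key"])
except ClientError as err:
    print("Access check failed:", err)
```

If the service account was assigned a read-only role, write operations such as `put_object` will be denied with an access error.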
Creating an S3 connector
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- (Optional) In the top-right corner, click Create resource. In the pop-up window, select Secret and create a secret with the secret part of a static access key for the service account. You can also create a secret when creating an S3 connector.
- In the top-right corner, click Create resource. In the pop-up window, select S3 Connector.
- Fill in the fields as follows:
  - Name: Name of the connector being created. The naming requirements are as follows:
    - The name must be from 3 to 63 characters long.
    - It may contain uppercase and lowercase Latin and Russian letters, numbers, hyphens, underscores, and spaces.
    - The first character must be a letter. The last character cannot be a hyphen, underscore, or space.
  - (Optional) Description of the new connector.
  - Endpoint: Storage host. For Object Storage, this is https://storage.yandexcloud.net/.
  - Bucket: Name of the storage bucket.
Warning
Do not use buckets with periods in their names for connection. You can learn more about buckets here.
  - Mount name: Name of the volume for mounting the bucket into the project file system. The naming requirements are as follows:
    - The name must be from 3 to 63 characters long.
    - It may contain lowercase Latin letters, numbers, and hyphens.
    - The first character must be a letter, and the last character cannot be a hyphen.
  - Static access key ID: ID of the static access key used to connect to the storage.
  - Static access key: In the list, select a secret that contains the secret part of the static access key, or create a new secret.
  - Mode: Object storage access mode, Read only or Read and write.
- Click Create.
Note
You need to set up a NAT gateway for any subnet linked to the project.
Attaching an S3 storage to a project
Go to the S3 connector page and click Activate. Once activated, the bucket will be available in the JupyterLab interface in the file manager in the /s3/ folder, and you can view it as a file system.
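For example, once the connector is activated, you can check the mount from notebook code; /s3/<mount name> below is a placeholder for the Mount name you specified in the connector settings:

```python
import os

# Placeholder: replace <mount name> with the Mount name from the connector settings.
mount_path = "/s3/<mount name>"

# The mounted bucket behaves like a regular directory in the project file system.
print(os.listdir(mount_path))
```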
Using an S3 storage in a project
You can access files in the connected bucket from the project code. Choose the file you need in the connected S3 storage on the S3 Mounts tab and copy its path to reference the file in your code.
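As a sketch, a path inside the mounted volume can be used like any local file path; the mount name and file name below are hypothetical:

```python
# Hypothetical path: replace it with the path of a file copied from your mounted bucket.
path = "/s3/<mount name>/data/example.csv"

# The mounted bucket is part of the project file system, so ordinary file I/O works.
with open(path) as f:
    print(f.readline())
```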
Detaching an S3 storage
- On the project page under Project resources, click S3 Connector.
- Select the desired connector and go to the resource page.
- Click Deactivate in the top right corner of the page.
You can attach the S3 storage to your project again when needed.
Sharing an S3 connector
Note
You can only share resources within a single organization between communities created in the same availability zone.
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click S3 Connector.
- Select your S3 connector in the list.
- Go to the Access tab.
- Enable the visibility option next to the name of the community you want to share the S3 connector with.
To make an S3 connector available for use in another project, the project administrator should add it to the Shared tab.
Deleting an S3 connector
You can only delete a deactivated connector that is unavailable to the community.
- Select the relevant project in your community or on the DataSphere homepage in the Recent projects tab.
- Under Project resources, click S3 Connector.
- In the list of S3 connectors, select the one you want to delete, click the options icon, and select Delete.
- Click Confirm.
You will see a message saying that the connector has been deleted.
Warning
The actual deletion of resources can take up to 72 hours.