Resource relationships in DataSphere
DataSphere runs within Yandex Cloud organizations. All entities created in DataSphere are organization resources. You cannot exchange resources between organizations.
Communities are a way to organize teamwork. Communities determine the scope of projects and resources in DataSphere.
You can create a community in one of the availability zones. All projects and resources created in the community will also be hosted in this availability zone. You can use a different availability zone only to host nodes. Once a community is created, you cannot move it to a different zone.
Projects are the main workspaces in DataSphere. Projects store code, variables, installed software, and other information.
DataSphere resources are objects that are created or used in projects: datasets, Docker images, nodes, and more.
DataSphere resources
You can use the following types of resources in DataSphere projects:
- Datasets: Ways of storing information that provide quick access to large amounts of data within a project.
- Secrets: Key-value pairs that store private data (tokens, keys, etc.) in encrypted form. Secrets are created in a project and assigned to it. You can use secrets as environment variables in a cell.
- Docker images: OS environments with certain software, libraries, environment variables, and configuration files.
- Connectors to S3 storages: Saved configurations for connecting Object Storage buckets. You can mount buckets into a project's file system to make it easier to access data from your code. To learn how to create an S3 connector, see Connecting to an S3 storage.
- Nodes: Services deployed for running trained models. Third-party services can access nodes using the API.
- Aliases: Add-ons used to publish services. Aliases allow you to distribute the load across nodes and update the deployed services on the fly.
- Yandex Data Processing templates: Ready-to-use Yandex Data Processing cluster configurations for automatic cluster deployment from a DataSphere project.
- Models: Saved interpreter state and computation or training results. These are grouped into models trained within projects and fine-tuned foundation models.
- Spark connectors: Saved configurations for connecting existing Yandex Data Processing clusters and creating temporary clusters.
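As noted in the list above, secrets can be used as environment variables in a cell. The following is a minimal sketch of reading one from Python code. The secret name `MY_API_TOKEN` and the helper `read_secret` are hypothetical and chosen for illustration only; the dummy value is set only so that the sketch also runs outside DataSphere.

```python
import os


def read_secret(name: str) -> str:
    """Return the value of a secret exposed as an environment variable."""
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f"Secret {name!r} is not set in this environment")
    return value


# Hypothetical secret name. Outside DataSphere, fall back to a dummy value
# so the sketch can be run anywhere; a real secret is never overwritten.
os.environ.setdefault("MY_API_TOKEN", "dummy-token-for-local-runs")

token = read_secret("MY_API_TOKEN")

# Report only the length so that the secret itself never reaches the output.
print(f"Token loaded, {len(token)} characters")
```

Failing with an explicit error when the variable is missing makes a forgotten or misnamed secret easier to spot than silently passing an empty value to downstream code.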
Shared use of projects and resources
DataSphere lets you share projects and resources by publishing resources in communities.
By publishing a resource, you grant resource access to all users in a community. This allows them to use the resource in their projects within the community. You can publish resources in a project's community and other communities within the organization.
Exchanging resources between communities enables different teams of the same organization to share Docker images, datasets, and other objects.
DataSphere communities, projects, and resources are only visible inside an organization. You cannot exchange resources between organizations. You also cannot share a resource in a community that was created in a different availability zone.
You can share the resources of a DataSphere project in which you have at least the Editor role with any community in the organization where you have the Developer role or higher. To grant access, open the Access tab on the resource view page. For more information, see Access management in DataSphere.
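The sharing constraints described in this section can be summarized as a small check. This is an illustrative model only, not a DataSphere API: the `Community` class, the `can_share` function, the community and zone names, and the role ordering in `ROLE_RANK` (viewer, developer, editor, admin) are all assumptions made for the sketch.

```python
from dataclasses import dataclass

# Assumed role ordering from lowest to highest; illustrative only.
ROLE_RANK = {"viewer": 0, "developer": 1, "editor": 2, "admin": 3}


@dataclass(frozen=True)
class Community:
    name: str
    organization: str
    zone: str


def can_share(source: Community, target: Community,
              project_role: str, target_role: str) -> bool:
    """Check whether a resource from a project in `source` can be shared to `target`."""
    # Resources cannot cross organization boundaries.
    same_org = source.organization == target.organization
    # Resources cannot be shared with a community in a different availability zone.
    same_zone = source.zone == target.zone
    # At least Editor in the project, and Developer or higher in the target community.
    project_ok = ROLE_RANK[project_role] >= ROLE_RANK["editor"]
    community_ok = ROLE_RANK[target_role] >= ROLE_RANK["developer"]
    return same_org and same_zone and project_ok and community_ok


# Hypothetical communities and zones.
research = Community("research", organization="org-1", zone="zone-a")
analytics = Community("analytics", organization="org-1", zone="zone-a")
remote = Community("remote", organization="org-1", zone="zone-b")

print(can_share(research, analytics, "editor", "developer"))  # True
print(can_share(research, remote, "admin", "admin"))          # False: different zone
print(can_share(research, analytics, "developer", "admin"))   # False: project role too low
```

All four conditions must hold at once, so a high role in one place does not compensate for a different zone or organization.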
Relationships between DataSphere resources and Yandex Cloud services
DataSphere communities are organization resources. One organization may have multiple communities.
To pay for DataSphere, use a Yandex Cloud billing account.
Other Yandex Cloud services are accessed through folders that store the resources of a specific Yandex Cloud service. To work with Yandex Cloud services, use service accounts.