DataSphere Inference
DataSphere provides tools for deploying services that third-party resources can access. You can deploy a model trained in DataSphere and use the same tools to develop a fully featured service based on a Docker image.
To publish a service, DataSphere provides special resources: nodes and aliases.
Nodes
A node is an isolated group of specially created VMs (instances) across which the computing load is distributed. Node instances are created with a preset environment and fixed interpreter state. Depending on your needs, you can select different instance configurations.
Warning
When deploying and using models, you pay for the uptime of each node instance: from its start to deletion.
If you no longer need the service you deployed, delete the node.
You can access the nodes via the API. API requests can change the state of the node interpreter. To return to the initial state, you will have to recreate the entire node.
Note
The maximum size of a request to and a response from the node API is 16 MB.
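A client can check the payload size before calling the node. The sketch below assumes that 16 MB means 16 * 1024 * 1024 bytes and that the request body is JSON-encoded; the function name and the constant are illustrative and are not part of any DataSphere SDK.

```python
import json

# Assumption: "16 MB" is read as 16 * 1024 * 1024 bytes. The text above does
# not say whether the limit is binary or decimal.
MAX_NODE_PAYLOAD_BYTES = 16 * 1024 * 1024


def fits_node_limit(payload: dict) -> bool:
    """Return True if the JSON-encoded payload is within the assumed size limit."""
    body = json.dumps(payload).encode("utf-8")
    return len(body) <= MAX_NODE_PAYLOAD_BYTES
```

The same check applies to responses: a service that may return large results could paginate or compress them rather than rely on the node API to carry an oversized body.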
To create a node, specify your organization's cloud folder where the node will deploy its instances and store its logs. In the folder, set up a subnet with internet access via a NAT gateway and create a service account with the vpc.user role. Specify this service account in the DataSphere project settings.
Node from a model
With nodes from models, you can deploy your models saved in DataSphere as a service accessible via the API. You can create a node from the following models:
If you want to deploy a model that is not supported by the Triton server, you can convert
To create a node from a model trained outside of DataSphere, load the model from the file to a variable in the notebook and then create a DataSphere model from this variable.
DataSphere uses Triton Inference Server
Note
When deploying PyTorch models, DataSphere cannot automatically figure out the input and output parameters.
Node from a Docker image
Nodes deployed from a Docker image hosted in a container registry will run as a fully featured service. The Docker image does not have to contain a model trained in DataSphere. You can create any image and place it in any registry you find appropriate. To learn how to push a Docker image to a Yandex Container Registry registry, see Pushing a Docker image to a registry.
Note
To use Yandex Container Registry, the project service account needs the container-registry.images.puller role.
When creating a node from a Docker image, you set the node's API, port you want your service to use, connection time, format of metrics you will collect, and other parameters. Once the node is created, DataSphere will monitor its state, maintain the operation of the instances, and scale the node within the specified instance range as needed. For instances, you can use the ru-central1-a and ru-central1-b availability zones.
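Before creating a node, it can help to validate the parameters locally. The following is a minimal sketch: the DockerNodeSpec class and its field names are hypothetical and do not mirror the DataSphere API, and the requirement of at least one instance is an assumption. Only the two availability zones come from the text above.

```python
from dataclasses import dataclass
from typing import List

# Availability zones named in the text above.
ALLOWED_ZONES = {"ru-central1-a", "ru-central1-b"}


@dataclass
class DockerNodeSpec:
    """Hypothetical container for node parameters; field names are illustrative."""
    image: str
    port: int
    zone: str
    min_instances: int
    max_instances: int

    def problems(self) -> List[str]:
        """Return human-readable validation problems (empty list if none)."""
        found = []
        if self.zone not in ALLOWED_ZONES:
            found.append(f"unsupported zone: {self.zone}")
        if not 1 <= self.port <= 65535:
            found.append(f"port out of range: {self.port}")
        # Assumption: a node needs at least one instance.
        if self.min_instances < 1 or self.max_instances < self.min_instances:
            found.append("instance range must satisfy 1 <= min <= max")
        return found
```

Collecting all problems in a list, rather than raising on the first one, lets a caller report every misconfigured field at once.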
To work with a node based on a large model or Docker image, you can connect an additional disk from 10 to 4,096 GB. If a node has multiple instances, a disk will be created for each one.
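Because each instance gets its own disk, the total storage grows with the instance count. A minimal sketch of that arithmetic, with illustrative function and constant names:

```python
# Disk size limits from the text above, in GB.
MIN_DISK_GB = 10
MAX_DISK_GB = 4096


def total_disk_gb(disk_size_gb: int, instance_count: int) -> int:
    """Return the total disk space for a node where every instance gets its own disk."""
    if not MIN_DISK_GB <= disk_size_gb <= MAX_DISK_GB:
        raise ValueError(
            f"disk size must be between {MIN_DISK_GB} and {MAX_DISK_GB} GB"
        )
    if instance_count < 1:
        raise ValueError("instance count must be at least 1")
    return disk_size_gb * instance_count
```

For example, a node with three instances and a 100 GB additional disk would use 300 GB of disk space in total.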
Node statuses
A DataSphere node can have one of the following statuses:
Healthy: Number of instances with the Healthy status in the node is equal to the minimum number of required instances.
Unhealthy: Number of instances with the Healthy status in the node is below the allowed minimum.
Created: Node has just been created.
Suspended: Node is paused.
Deleting: Node is being deleted.
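The relationship between the Healthy and Unhealthy node statuses and the instance count can be sketched as follows. This is an illustration, not DataSphere's implementation: the function name is hypothetical, and treating a count above the minimum as Healthy is an assumption, since the description only covers the "equal" and "below" cases. The Created, Suspended, and Deleting statuses are not derived from instance counts and are left out.

```python
from typing import Iterable


def node_health(instance_statuses: Iterable[str], min_instances: int) -> str:
    """Derive Healthy or Unhealthy from the instance statuses of a node."""
    healthy = sum(1 for status in instance_statuses if status == "Healthy")
    # Assumption: more healthy instances than the minimum still counts as Healthy.
    return "Healthy" if healthy >= min_instances else "Unhealthy"
```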
Instance statuses
Node instances can have one of the following statuses:
Healthy: Instance is healthy and available for balancing.
Unhealthy: There are issues with the instance and it has been excluded from balancing.
Created: VM has been created for the instance.
Started: Connection has been established with the instance's VM.
Preparing: Instance is getting ready to process requests.
Deleting: Instance is being deleted.
Undefined: Initial state of the instance; VM is not created yet.
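Client-side monitoring code might model these statuses as an enumeration and select only the instances that are available for balancing. In this sketch the InstanceStatus enum and the balancing_targets function are hypothetical names; only the status values come from the list above.

```python
from enum import Enum
from typing import Dict, List


class InstanceStatus(Enum):
    """Instance statuses listed above; the enum itself is illustrative."""
    UNDEFINED = "Undefined"
    CREATED = "Created"
    STARTED = "Started"
    PREPARING = "Preparing"
    HEALTHY = "Healthy"
    UNHEALTHY = "Unhealthy"
    DELETING = "Deleting"


def balancing_targets(instances: Dict[str, InstanceStatus]) -> List[str]:
    """Return the sorted names of instances that are available for balancing."""
    return sorted(
        name
        for name, status in instances.items()
        if status is InstanceStatus.HEALTHY
    )
```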
Alias
An alias is a special resource used for publishing and updating a service. It allows you to replace nodes and update the running service without affecting the user experience.
Create an alias and use it as your service endpoint. You can update related nodes, balance the load across them, and remove deprecated Docker image versions without affecting the user experience.
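One way to picture load balancing behind an alias is as a weighted choice among nodes. The sketch below is purely illustrative: the text does not describe how DataSphere distributes requests, so the per-node weights, the node names, and the pick_node function are all assumptions.

```python
import random
from typing import Dict


def pick_node(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick a node name at random, proportionally to its (hypothetical) weight."""
    candidates = [name for name, weight in weights.items() if weight > 0]
    if not candidates:
        raise ValueError("no node with a positive weight")
    chosen = rng.choices(
        candidates, weights=[weights[name] for name in candidates], k=1
    )
    return chosen[0]
```

Under this model, replacing a node without interrupting users would amount to gradually shifting weight from the old node to the new one, then removing the old node once its weight reaches zero.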