
Setting up Time-Slicing GPUs

Written by Yandex Cloud. Improved by Griner. Updated on May 5, 2025.
  • Required paid resources
  • Getting started
  • Configure Time-Slicing GPUs
  • Test Time-Slicing GPUs
  • Delete the resources you created

The Time-Slicing GPUs plugin in Kubernetes lets multiple workloads share a single GPU: the GPU is oversubscribed, and the workloads take turns running on it in time slices.

To install the Time-Slicing GPUs plugin in Managed Service for Kubernetes:

  1. Configure Time-Slicing GPUs.
  2. Test Time-Slicing GPUs.

If you no longer need the resources you created, delete them.

Required paid resources

The support cost includes:

  • Managed Service for Kubernetes cluster fee: for using the master and for outgoing traffic (see Managed Service for Kubernetes pricing).
  • Cluster nodes (VM) fee: for computing resources, the operating system, and storage (see Compute Cloud pricing).
  • Fee for a public IP address assigned to cluster nodes (see Virtual Private Cloud pricing).

Getting started

  1. If you do not have the Yandex Cloud CLI yet, install and initialize it.

  2. By default, the CLI uses the folder specified when you created the profile. To change the default folder, run the yc config set folder-id <folder_ID> command. You can also specify a different folder for an individual command using the --folder-name or --folder-id parameter.

  3. Create security groups for the Managed Service for Kubernetes cluster and its node groups.

    Warning

    The configuration of security groups determines the performance and availability of the cluster and the services and applications running in it.

  4. Create a Managed Service for Kubernetes cluster. When creating it, specify the security groups you prepared earlier.

  5. Create a Managed Service for Kubernetes node group with NVIDIA® Tesla® T4 GPUs, specifying the security groups you prepared earlier.

  6. Install kubectl and configure it to work with the new cluster (see the sketch after this list).
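
As a sketch of the last step, assuming the Yandex Cloud CLI is already initialized and your cluster is named k8s-gpu-cluster (a placeholder name), you can add the cluster credentials to kubectl and check connectivity:

  # Add the cluster credentials to your kubeconfig (replace the placeholder cluster name).
  yc managed-kubernetes cluster get-credentials k8s-gpu-cluster --external

  # Verify that kubectl can reach the cluster and that the GPU nodes have registered.
  kubectl get nodes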

Configure Time-Slicing GPUs

  1. Create a time-slicing configuration:

    1. Prepare the time-slicing-config.yaml file with the following content:

      ---
      kind: Namespace
      apiVersion: v1
      metadata:
        name: gpu-operator
      
      ---
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: time-slicing-config
        namespace: gpu-operator
      data:
        a100-80gb: |-
          version: v1
          sharing:
            timeSlicing:
              resources:
              - name: nvidia.com/gpu
                replicas: 5
        tesla-t4: |-
          version: v1
          sharing:
            timeSlicing:
              resources:
              - name: nvidia.com/gpu
                replicas: 5
      
    2. Run this command:

      kubectl create -f time-slicing-config.yaml
      

      Result:

      namespace/gpu-operator created
      configmap/time-slicing-config created
      
  2. Install the GPU operator:

    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && \
    helm repo update && \
    helm install \
      --namespace gpu-operator \
      --create-namespace \
      --set devicePlugin.config.name=time-slicing-config \
      gpu-operator nvidia/gpu-operator
    
  3. Apply the time-slicing configuration to the Managed Service for Kubernetes cluster or node group:

    For the entire Managed Service for Kubernetes cluster:

    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      --namespace gpu-operator \
      --type merge \
      --patch='{"spec": {"devicePlugin": {"config": {"name": "time-slicing-config", "default": "tesla-t4"}}}}'

    For a specific Managed Service for Kubernetes node group:

    yc managed-kubernetes node-group add-labels <node_group_ID_or_name> \
      --labels nvidia.com/device-plugin.config=tesla-t4

    You can get the ID and name of the Managed Service for Kubernetes node group from the list of node groups in your cluster (see the sketch after this list).
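
Before testing, it may help to confirm that the GPU operator components are up and to look up the node group ID or name used in the yc command above. A minimal sketch, assuming the gpu-operator namespace from this tutorial:

  # The device plugin, container toolkit, and validator pods should reach the Running or Completed state.
  kubectl get pods --namespace gpu-operator

  # List the node groups in your folder to find the ID or name of the GPU node group.
  yc managed-kubernetes node-group list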

Test Time-Slicing GPUs

  1. Create a test app:

    1. Save the following app creation specification to a YAML file named nvidia-plugin-test.yml.

      Deployment is the Kubernetes API object that manages a replicated application.

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: nvidia-plugin-test
        labels:
          app: nvidia-plugin-test
      spec:
        replicas: 5
        selector:
          matchLabels:
            app: nvidia-plugin-test
        template:
          metadata:
            labels:
              app: nvidia-plugin-test
          spec:
            tolerations:
              - key: nvidia.com/gpu
                operator: Exists
                effect: NoSchedule
            containers:
              - name: dcgmproftester11
                image: nvidia/samples:dcgmproftester-2.0.10-cuda11.0-ubuntu18.04
                command: ["/bin/sh", "-c"]
                args:
                  - while true; do /usr/bin/dcgmproftester11 --no-dcgm-validation -t 1004 -d 300; sleep 30; done
                resources:
                  limits:
                    nvidia.com/gpu: 1
                securityContext:
                  capabilities:
                    add: ["SYS_ADMIN"]
      
    2. Run this command:

      kubectl apply -f nvidia-plugin-test.yml
      

      Result:

      deployment.apps/nvidia-plugin-test created
      
  2. Make sure all five of the app's Managed Service for Kubernetes pods are in the Running state (you can also check how many GPU slices the node advertises; see the sketch after this list):

    kubectl get pods | grep nvidia-plugin-test
    
  3. Run the nvidia-smi command in the active nvidia-container-toolkit Managed Service for Kubernetes pod:

    kubectl exec <nvidia-container-toolkit_pod_name> \
      --namespace gpu-operator -- nvidia-smi
    

    Result:

    Defaulted container "nvidia-container-toolkit-ctr" out of: nvidia-container-toolkit-ctr, driver-validation (init)
    Thu Jan 26 09:42:51 2023
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: N/A      |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:8B:00.0 Off |                    0 |
    | N/A   72C    P0    70W /  70W |   1579MiB / 15360MiB |    100%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |    0   N/A  N/A     43108      C   /usr/bin/dcgmproftester11         315MiB |
    |    0   N/A  N/A     43211      C   /usr/bin/dcgmproftester11         315MiB |
    |    0   N/A  N/A     44583      C   /usr/bin/dcgmproftester11         315MiB |
    |    0   N/A  N/A     44589      C   /usr/bin/dcgmproftester11         315MiB |
    |    0   N/A  N/A     44595      C   /usr/bin/dcgmproftester11         315MiB |
    +-----------------------------------------------------------------------------+
    
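
Once the time-slicing configuration is active, each GPU node should advertise as many nvidia.com/gpu resources as the replicas value in the ConfigMap (5 in this tutorial). A quick check, where <node_name> is one of the GPU nodes from kubectl get nodes:

  # Capacity and Allocatable should both report nvidia.com/gpu: 5 after time slicing is applied.
  kubectl describe node <node_name> | grep "nvidia.com/gpu"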

Delete the resources you created

Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them:

  1. Delete the Managed Service for Kubernetes cluster (see the sketch after this list).
  2. If you created any service accounts, delete them.
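
A minimal cleanup sketch, assuming the test deployment from the previous section and a cluster named k8s-gpu-cluster (a placeholder name):

  # Remove the test deployment so it stops loading the GPU.
  kubectl delete -f nvidia-plugin-test.yml

  # Delete the Managed Service for Kubernetes cluster; replace the placeholder with your cluster name or ID.
  yc managed-kubernetes cluster delete k8s-gpu-cluster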
