© 2025 Direct Cursus Technology L.L.C.

In this article:

  • Required paid resources
  • Getting started
  • Install the GPU Operator
  • Check that drivers are installed correctly
  • Delete the resources you created

Using node groups with GPUs and no pre-installed drivers

Written by
Yandex Cloud
Improved by
Dmitry A.
Updated on May 5, 2025

You can use Managed Service for Kubernetes node groups for workloads on GPUs without pre-installed drivers. Use the GPU Operator to select a suitable driver version.

To prepare your cluster and Managed Service for Kubernetes node group without pre-installed drivers for running workloads:

  1. Install the GPU Operator.
  2. Check that drivers are installed correctly.

If you no longer need the resources you created, delete them.

Required paid resources

The support cost includes:

  • Managed Service for Kubernetes cluster fee: use of the master and outgoing traffic (see Managed Service for Kubernetes pricing).
  • Cluster node (VM) fee: use of computing resources, the operating system, and storage (see Compute Cloud pricing).
  • Fee for public IP addresses assigned to cluster nodes (see Virtual Private Cloud pricing).

Getting started

  1. If you do not have the Yandex Cloud CLI yet, install and initialize it.

    The folder specified when creating the CLI profile is used by default. To change the default folder, use the yc config set folder-id <folder_ID> command. You can specify a different folder using the --folder-name or --folder-id parameter.

  2. Create security groups for the Managed Service for Kubernetes cluster and its node groups.

    Warning

    The configuration of security groups determines the performance and availability of the cluster and the services and applications running in it.

  3. Create a Managed Service for Kubernetes cluster with any suitable configuration. When creating it, specify the security groups prepared earlier.

  4. Create a node group on a platform with a GPU, and enable Do not install GPU drivers. Specify the security groups prepared earlier.

  5. Install kubectl and configure it to work with the new cluster.
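
The CLI-based parts of the steps above (setting the default folder and connecting kubectl to the cluster) can be sketched as follows. <folder_ID> and <cluster_name> are placeholders for your own folder ID and cluster name:

```shell
# Set the default folder for subsequent yc commands.
yc config set folder-id <folder_ID>

# Add the new cluster's credentials to your kubeconfig;
# --external connects via the master's public IP address.
yc managed-kubernetes cluster get-credentials <cluster_name> --external

# Verify that kubectl can reach the cluster.
kubectl cluster-info
```

These commands require an initialized Yandex Cloud CLI profile with access to the folder containing the cluster.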

Install the GPU Operator

  1. Install Helm v3.8.0 or higher.

  2. Install the GPU Operator:

    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && \
    helm repo update && \
    helm install \
      --namespace gpu-operator \
      --create-namespace \
      --set driver.version=<driver_version> \
      gpu-operator nvidia/gpu-operator
    

    Here, driver.version is the NVIDIA® driver version. You can omit this parameter, in which case the default version will be installed.

    Note

    For the Managed Service for Kubernetes AMD EPYC™ with NVIDIA® Ampere® A100 (gpu-standard-v3) node group platform, use driver version 515.48.07.
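
For example, for a node group on the gpu-standard-v3 platform, the install command with the driver version from the note above pinned would look like this:

```shell
# Install the GPU Operator with the driver version pinned to 515.48.07,
# as required for gpu-standard-v3 node groups.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && \
helm repo update && \
helm install \
  --namespace gpu-operator \
  --create-namespace \
  --set driver.version=515.48.07 \
  gpu-operator nvidia/gpu-operator
```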

Check that drivers are installed correctly

Get the nvidia-driver-daemonset pod logs:

DRIVERS_POD_NAME="$(kubectl get pods --namespace gpu-operator | grep nvidia-driver-daemonset | awk '{print $1}')" && \
kubectl --namespace gpu-operator logs "${DRIVERS_POD_NAME}"

The logs should contain a message confirming that the driver was installed successfully, for example:

Defaulted container "nvidia-driver-ctr" out of: nvidia-driver-ctr, k8s-driver-manager (init)
DRIVER_ARCH is x86_64
Creating directory NVIDIA-Linux-x86_64-535.54.03
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 535.54.03

...

Loading NVIDIA driver kernel modules...
+ modprobe nvidia
+ modprobe nvidia-uvm
+ modprobe nvidia-modeset

...

Done, now waiting for signal
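
As an additional check (not part of the log output above), you can run nvidia-smi in the driver pod and confirm that the nodes now advertise allocatable GPUs:

```shell
# Run nvidia-smi inside the driver pod; reuses DRIVERS_POD_NAME from above.
kubectl --namespace gpu-operator exec "${DRIVERS_POD_NAME}" -- nvidia-smi

# List each node with the number of nvidia.com/gpu resources it exposes.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
```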

Now, you can run GPU-based workloads by following the Running workloads with GPUs guide.
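
As a quick smoke test before moving on to that guide, you can start a minimal pod that requests one GPU and runs nvidia-smi. The pod name and container image here are illustrative, not taken from the guide:

```shell
# Create a one-shot test pod that requests a single GPU.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

# Wait for the pod to finish, then print the nvidia-smi output.
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/gpu-test --timeout=300s
kubectl logs gpu-test
kubectl delete pod gpu-test
```

If the logs show the familiar nvidia-smi table, the node group is ready for GPU workloads.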

Delete the resources you created

Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them:

  1. Delete the Kubernetes cluster.
  2. If you created any service accounts, delete them.
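
With the CLI, the cleanup can be sketched as follows; <cluster_name> and <service_account_name> are placeholders, and deleting the cluster also deletes its node groups:

```shell
# Delete the Managed Service for Kubernetes cluster and its node groups.
yc managed-kubernetes cluster delete <cluster_name>

# Delete a service account you created for the cluster, if any.
yc iam service-account delete <service_account_name>
```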
