NVIDIA driver update guide
Warning
This guide covers the gpu-standard-v3 (AMD EPYC™ with NVIDIA® Ampere® A100) and gpu-standard-v3i platforms.
For gpu-standard-v3i, you can only use an image with the NVIDIA 535 driver and Secure Boot support. The GPU driver cannot be updated on this platform; you can only update the CUDA libraries.
Supported drivers and recommendations
In Yandex Cloud, the gpu-standard-v3 (AMD EPYC™ with NVIDIA® Ampere® A100) and gpu-standard-v3i VMs are preconfigured with the NVIDIA 535 driver.
We recommend using this specific driver version; driver updates to other versions are not supported and may lead to unstable GPU performance.
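To confirm which driver version your VM is currently running, and to keep routine apt upgrades from replacing it, you can use a check like the sketch below. The `apt-mark hold` package names are examples and depend on how the driver was installed in your image, so verify them with `dpkg -l` first.

```bash
# Show the driver version the GPU is currently running
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# List the installed NVIDIA driver packages
dpkg -l | grep -E 'nvidia-driver|cuda-drivers'

# Optionally hold them so apt upgrade does not pull in another driver branch
# (package names below are examples; adjust them to the dpkg -l output)
sudo apt-mark hold nvidia-driver-535 cuda-drivers-535
```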
Shared NVSwitch virtualization model
We use the Shared NVSwitch virtualization model described in the NVIDIA Fabric Manager User Guide.
NVSwitch devices are passed through to a separate service VM and managed by the NVIDIA 535 driver. When a guest VM starts, its GPUs come preconfigured for NVLink; to preserve this configuration, software GPU resets from user VMs are not allowed in Yandex Cloud.
If you update the driver on the user VM to another version, e.g., 570, the driver may fail to recognize the current GPU state. This is an NVIDIA driver limitation, which is why we do not recommend changing the driver version on the user VM.
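To inspect the NVLink preconfiguration from inside a user VM without resetting anything, you can use read-only `nvidia-smi` queries. A minimal illustrative check:

```bash
# Show NVLink status for every GPU
nvidia-smi nvlink --status

# Show the GPU interconnect topology (NVLink connections appear as NV# entries)
nvidia-smi topo -m
```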
Why use driver version 535?
NVIDIA publishes multiple driver branches (see the NVIDIA Data Center Drivers Overview):
- LTSB (Long-Term Support Branch): Long-term support, security updates and fixes for 3 years.
- PB (Production Branch): Main branch for data centers.
- NFB (New Feature Branch): Drivers with new features.
Version 535 belongs to the LTSB; it has been validated and is supported in the Yandex Cloud infrastructure. Drivers from other branches have not passed this compatibility validation and may work incorrectly.
CUDA update
Often it is not a new driver that you need but a CUDA Toolkit update. In most cases, you do not need to update the driver: it is enough to install the required CUDA version together with the cuda-compat package, which provides compatibility with the 535 driver (see CUDA Forward Compatibility).
Ubuntu installation example
- Connect the NVIDIA CUDA repository:

  ```bash
  sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu$(lsb_release -rs | sed -e 's/\.//')/x86_64/3bf863cc.pub
  sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
  sudo apt update
  ```

- Install `cuda-compat` (example for CUDA 12.5):

  ```bash
  sudo apt install -y cuda-compat-12-5
  echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.5/compat:$LD_LIBRARY_PATH' >> ~/.bashrc && source ~/.bashrc
  ```

- Check the current configuration:

  ```bash
  nvidia-smi
  ```
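To confirm that the forward-compatibility libraries are in place, you can check that the compat directory exists and is on the library path. This is a minimal sanity check, assuming the CUDA 12.5 paths from the example above; it only verifies installation and the search path, not that a particular application loads the libraries:

```bash
# The compat directory should contain libcuda.so.* from the newer CUDA release
ls -l /usr/local/cuda-12.5/compat/

# The directory should be on the library search path set in ~/.bashrc
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | grep compat
```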
Issue with `sudo reboot` after updating the driver to a version higher than 535 and the recommended workaround
When you reinstall the driver and then run `sudo reboot`, the driver does not get unloaded correctly before the reboot. Since Yandex Cloud prohibits software GPU resets, the card is left in an invalid state. This does not cause any hardware issues, but the VM will operate incorrectly. Use the `yc compute instance restart` command instead of `sudo reboot`.
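For reference, restarting the VM through the platform looks roughly like this, run from a machine with the YC CLI configured (the instance name is a placeholder):

```bash
# Restart the VM via the Yandex Cloud CLI instead of rebooting from inside the guest
yc compute instance restart <instance-name-or-ID>
```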
This is why we do not recommend updating the driver to a version higher than 535. If you need to install a driver version higher than 535 and reboot the user VM, use the following workaround scenario:
- Install the driver.

  Script for Ubuntu:

  ```bash
  #!/bin/bash
  set -e

  # Fixing the architecture
  arch="x86_64"

  # Figuring out the Ubuntu version (20.04 -> ubuntu2004, 22.04 -> ubuntu2204, etc.)
  . /etc/os-release
  if [[ "$ID" != "ubuntu" ]]; then
    echo "This script is for Ubuntu only!"
    exit 1
  fi
  distro="ubuntu${VERSION_ID//./}"
  echo "Using the repository: $distro/$arch"

  # 1. Downloading the package with keys
  wget https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}/cuda-keyring_1.1-1_all.deb

  # 2. Installing the keys
  sudo dpkg -i cuda-keyring_1.1-1_all.deb || {
    echo "Failed to install cuda-keyring, performing alternative steps..."

    # 2a. Downloading the GPG key manually
    wget https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}/cuda-archive-keyring.gpg

    # 2b. Putting the key in the correct location
    sudo mv cuda-archive-keyring.gpg /usr/share/keyrings/cuda-archive-keyring.gpg

    # 2c. Connecting the CUDA repository manually
    echo "deb [signed-by=/usr/share/keyrings/cuda-archive-keyring.gpg] \
    https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}/ /" \
      | sudo tee /etc/apt/sources.list.d/cuda-${distro}-${arch}.list
  }

  # 3. Updating the list of packages
  sudo apt update

  # 4. Installing NVIDIA drivers
  sudo apt install -y nvidia-open

  # 5. Installing the CUDA driver metapackage
  sudo apt install -y cuda-drivers
  ```
- Go through the next steps before you reboot the system via `sudo reboot`.

  Create a script named `/usr/libexec/manage-nvidia`:

  ```bash
  #!/bin/bash
  # Load or unload the NVIDIA kernel modules so the driver shuts down cleanly
  set -eu

  usage() {
    echo "usage: manage-nvidia (load|unload)"
    exit 1
  }

  [ $# -eq 1 ] || usage

  case "$1" in
    load)
      modprobe nvidia
      ;;
    unload)
      modprobe -r nvidia_uvm nvidia_drm nvidia_modeset nvidia
      ;;
    *)
      usage
      ;;
  esac
  ```
- Make the script executable:

  ```bash
  sudo chmod +x /usr/libexec/manage-nvidia
  ```
- Create a systemd unit named `/etc/systemd/system/manage-nvidia.service`:

  ```ini
  [Unit]
  Description=Manage NVIDIA driver
  Before=nvidia-persistenced.service

  [Service]
  Type=oneshot
  ExecStart=/usr/libexec/manage-nvidia load
  RemainAfterExit=true
  ExecStop=/usr/libexec/manage-nvidia unload
  StandardOutput=journal

  [Install]
  WantedBy=multi-user.target
  RequiredBy=nvidia-persistenced.service
  ```
- Reload the `systemd` configuration, configure `manage-nvidia` to autorun on boot, and start the service itself:

  ```bash
  sudo systemctl daemon-reload
  sudo systemctl enable --now manage-nvidia
  ```

  Expected output if the execution is successful:

  ```text
  Created symlink /etc/systemd/system/multi-user.target.wants/manage-nvidia.service → /etc/systemd/system/manage-nvidia.service.
  Created symlink /etc/systemd/system/nvidia-persistenced.service.requires/manage-nvidia.service → /etc/systemd/system/manage-nvidia.service.
  ```

  Check `nvidia-persistenced.service` for a dependency on `manage-nvidia.service`:

  ```bash
  sudo systemctl list-dependencies nvidia-persistenced | grep manage-nvidia
  ```

  Result:

  ```text
  ● ├─manage-nvidia.service
  ```

  Check the service status:

  ```bash
  sudo systemctl status manage-nvidia
  ```
With that done, during `sudo reboot` systemd will call ExecStop for manage-nvidia, the driver will be unloaded correctly, and rebooting will not invalidate the GPU.
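As an optional sanity check before relying on this at reboot time, you can verify the ordering that lets the unload happen cleanly. A small read-only sketch, using the `manage-nvidia` and `nvidia-persistenced` names configured above:

```bash
# The service should be active, so its ExecStop will run on shutdown
systemctl is-active manage-nvidia

# nvidia-persistenced should be ordered after manage-nvidia, so it stops first
# and releases the GPU before the kernel modules are unloaded
systemctl show nvidia-persistenced -p After | tr ' ' '\n' | grep manage-nvidia

# The NVIDIA kernel modules that the unload step will remove
lsmod | grep '^nvidia'
```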