Updating node group OS
Starting with Kubernetes version 1.30, nodes run Ubuntu 22.04 instead of Ubuntu 20.04. When you update a node group to one of these versions, new nodes are automatically created from an Ubuntu 22.04 VM image.
Note
The OS version update is available in the RAPID release channel. This upgrade will later become available in the REGULAR and STABLE release channels.
User resource updates
In Ubuntu 22.04, system libraries and Linux kernel headers were updated, so GPU driver compilation may not work for node groups with custom GPU drivers.
The problem can manifest itself as follows:
- You get driver build errors.
- The GPU is not detected.
- The node group update fails.
To avoid these issues, make sure your GPU Operator and driver versions are compatible when preparing for the update:
- Update GPU Operator to version 24.9.x or higher.
- Update your driver to version 550.144.03 or higher.
- Use precompiled drivers. To do this, set --driver.usePrecompiled=true when installing GPU Operator.
For more information on using a GPU with a custom driver, see Using node groups with GPUs and no pre-installed drivers.
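The version requirements above can be checked with a small shell helper before you start the update. This is a minimal sketch, assuming GNU sort with the -V (version sort) option is available. The names version_ge and install_gpu_operator_precompiled are hypothetical helpers, not part of any tool. The Helm call assumes you install GPU Operator from the NVIDIA Helm chart, where the precompiled setting is passed as --set driver.usePrecompiled=true and the driver version is given as a branch such as 550. The release name and namespace are assumptions, and your installation method may differ.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), so 550.144.03 ranks above 550.90.07.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper, not called here: install GPU Operator with
# precompiled drivers. $1 is the driver branch (for example, 550).
# Release name and namespace are assumptions; adjust them to your setup.
install_gpu_operator_precompiled() {
  helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
  helm repo update
  helm install gpu-operator nvidia/gpu-operator \
    --namespace gpu-operator --create-namespace \
    --set driver.usePrecompiled=true \
    --set driver.version="$1"
}
```

For example, on a node with the driver installed, version_ge "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)" 550.144.03 returns success when the driver meets the requirement.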
Warning
Because Ubuntu 22.04 uses the newer Linux kernel 5.15, updating the OS may break custom kernel modules compiled with DKMS.
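To see which nodes already run a 5.15 kernel, you can read the kernel version each node reports to Kubernetes. The following is a minimal sketch; kernel_series and list_node_kernels are hypothetical helper names introduced for illustration.

```shell
# Print the major.minor series of a kernel release string such as the
# output of `uname -r` (for example, 5.15.0-91-generic gives 5.15).
kernel_series() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Hypothetical helper, not called here: print each node name and the
# kernel version it reports (requires kubectl access to the cluster).
list_node_kernels() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\n"}{end}'
}
```

On a node itself, kernel_series "$(uname -r)" prints the running kernel series. If DKMS is installed there, the dkms status command lists the modules it manages, which shows what needs rebuilding for the new kernel.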
Preparation for migration
Before migrating your Kubernetes cluster to the new OS version, test the update on a new cluster:
- Create a Managed Service for Kubernetes cluster and specify the RAPID release channel for it.
- Create a node group in the new cluster.
- In the new cluster, test any apps that may depend on the OS version.
Check key load indicators:
- GPU load.
- App status monitoring.
- The functioning of monitoring agents and drivers.
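One of these checks, whether every GPU node still exposes its GPUs, can be sketched as a shell filter. The sketch assumes the NVIDIA device plugin advertises GPUs under the nvidia.com/gpu resource name. The names list_node_gpus and nodes_without_gpu are hypothetical helpers.

```shell
# Hypothetical helper, not called here: print "<node name><TAB><GPU count>"
# for every node (requires kubectl access to the cluster). The count is
# empty for nodes that do not advertise the nvidia.com/gpu resource.
list_node_gpus() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
}

# Read "<node name><TAB><GPU count>" lines from stdin and print the names
# of nodes that report no allocatable GPUs (empty or zero count).
nodes_without_gpu() {
  awk -F '\t' '$2 == "" || $2 == "0" { print $1 }'
}
```

Running list_node_gpus | nodes_without_gpu in the test cluster prints the nodes that need a closer look. Nodes without GPU hardware also appear in the output, so compare the result against the node groups you expect to have GPUs.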
How to check node OS version
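All nodes in a group use the same base OS image version. You can check the OS version using the commands below: the first lists the OS image of every node in the cluster, and the second shows it for a single node.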
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.osImage}{"\n"}{end}'
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.osImage}{"\n"}'
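To find nodes that have not been updated yet, the output of the first command can be filtered. This is a minimal sketch; nodes_not_on_ubuntu_2204 is a hypothetical helper name, and it assumes the OS image string starts with "Ubuntu 22.04" on updated nodes.

```shell
# Read "<node name><TAB><OS image>" lines from stdin and print the names
# of nodes whose OS image does not start with "Ubuntu 22.04".
nodes_not_on_ubuntu_2204() {
  awk -F '\t' '$2 !~ /^Ubuntu 22\.04/ { print $1 }'
}
```

Pipe the output of the first kubectl command above into nodes_not_on_ubuntu_2204. An empty result means every node reports an Ubuntu 22.04 image.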