NVIDIA® GPU support
Stackland enables you to provision NVIDIA® GPUs in a Stackland cluster using NVIDIA® GPU support, a component that automates the management of GPU resources and ensures their availability for workloads. NVIDIA® GPU support is an implementation of the NVIDIA® GPU Operator.
NVIDIA® GPU support use cases include:
- Auto-detection of GPUs on cluster nodes
- Provisioning GPUs as Kubernetes resources for pods
- Support for GPU virtualization technologies (multi-instance GPU or MIG)
- Support for NVLink to create GPU clusters
- GPU health monitoring and metric collection
NVIDIA® GPU support requires NVIDIA® GPU nodes to operate.
Main components
NVIDIA® driver
Version: 580.126
The NVIDIA® driver provides a low-level interface between the OS and GPU. The driver exposes the GPU hardware capabilities, manages device memory, and handles commands from applications.
NVIDIA® Container Toolkit
Version: 580.126
NVIDIA® Container Toolkit enables running GPU-accelerated containers. The toolkit integrates with the container runtime and provides GPU access to containers via the Container Device Interface (CDI). This component automatically configures the container environment, mounts the required libraries and devices, and manages GPU resource isolation across containers.
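For illustration, a CDI specification (the toolkit typically generates these under /etc/cdi/) describes each GPU as a named device whose container edits mount the required device nodes. The sketch below is representative rather than exact generated output; the device node paths shown are assumptions:

```json
{
  "cdiVersion": "0.5.0",
  "kind": "nvidia.com/gpu",
  "devices": [
    {
      "name": "0",
      "containerEdits": {
        "deviceNodes": [
          { "path": "/dev/nvidia0" }
        ]
      }
    }
  ],
  "containerEdits": {
    "deviceNodes": [
      { "path": "/dev/nvidiactl" }
    ]
  }
}
```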
NVIDIA® Fabric Manager
Version: 580.126
NVIDIA® Fabric Manager manages NVLink and NVSwitch in multi-GPU systems. This component ensures high-speed GPU interconnection, optimizes communication topology, and manages distributed memory in multi-GPU configurations.
NVIDIA® GPU Operator
Version: 25.10
The NVIDIA® GPU Operator automates GPU management in a Kubernetes cluster. It creates, configures, and manages the components required for GPU provisioning, including drivers, libraries, device plugins, and monitoring systems. The NVIDIA® GPU Operator uses custom resource definitions (CRDs) to manage the lifecycle of GPU components.
DCGM
NVIDIA® Data Center GPU Manager (DCGM) is a tool for monitoring and managing data center GPUs. DCGM collects performance, temperature, memory usage, and other GPU metrics.
DCGM Exporter
DCGM Exporter exports GPU metrics in Prometheus format. The monitoring component automatically collects metrics and exposes them for visualization in Grafana.
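The metrics are exposed in the plain Prometheus text format. As a rough sketch of what consuming that format looks like, the snippet below parses a hardcoded sample payload; DCGM_FI_DEV_GPU_UTIL is a real DCGM field name, but the label values and readings are made up:

```python
# Minimal sketch: parse Prometheus text-format lines such as those
# emitted by DCGM Exporter. The sample payload is illustrative;
# the label values and readings are invented for the example.
sample = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",UUID="GPU-aaaa"} 42
DCGM_FI_DEV_GPU_UTIL{gpu="1",UUID="GPU-bbbb"} 17
"""

def parse_metrics(text):
    """Return {metric_with_labels: float} for non-comment lines."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        name_labels, value = line.rsplit(" ", 1)
        out[name_labels] = float(value)
    return out

metrics = parse_metrics(sample)
print(metrics['DCGM_FI_DEV_GPU_UTIL{gpu="0",UUID="GPU-aaaa"}'])  # 42.0
```

In practice you would not parse this by hand: Prometheus scrapes the exporter endpoint and Grafana queries Prometheus, as described below.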
GPU monitoring
DCGM Exporter automatically collects GPU metrics and makes them available in Grafana. Stackland provides prebuilt dashboards for GPU monitoring:
- NVIDIA® DCGM Dashboard: Overview dashboard with metrics of all cluster GPUs.
- NVIDIA® DCGM Dashboard with MIG metrics: Dashboard for MIG GPU monitoring.
- NVIDIA® DCGM Dashboard w/o MIG metrics: Dashboard for non-MIG GPU monitoring.
Using GPUs in pods
To use a GPU in a pod, specify the nvidia.com/gpu resource in the container specification:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:12.0-base
      resources:
        limits:
          nvidia.com/gpu: 1
Kubernetes will automatically place the pod on a node with an available GPU.
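Conceptually, the scheduler filters out nodes whose free nvidia.com/gpu count cannot satisfy the pod's limit. The sketch below illustrates only that filtering step; the node names and GPU counts are made up:

```python
# Illustrative sketch of GPU-aware node filtering: a pod requesting
# nvidia.com/gpu can only be placed on a node with enough unallocated
# GPUs. Node names and capacities are invented for the example.
def schedulable_nodes(nodes, gpu_request):
    """nodes: {name: {"allocatable": int, "allocated": int}}
    Return the names of nodes that can fit the GPU request."""
    return [
        name
        for name, res in nodes.items()
        if res["allocatable"] - res["allocated"] >= gpu_request
    ]

nodes = {
    "cpu-node":   {"allocatable": 0, "allocated": 0},  # no GPUs at all
    "gpu-node-1": {"allocatable": 4, "allocated": 4},  # fully occupied
    "gpu-node-2": {"allocatable": 4, "allocated": 1},  # 3 GPUs free
}
print(schedulable_nodes(nodes, 1))  # ['gpu-node-2']
```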
Configuration
MIG Manager settings
migManager:
  enabled: false
  strategy: "single"
  config:
    default: "all-disabled"
- enabled: Enables multi-instance GPU support.
- strategy: MIG strategy. The possible values are single to apply the same MIG configuration to all GPUs on the node or mixed to use different MIG configurations on different GPUs.
- config.default: Default MIG configuration.
To enable MIG support, set enabled to true and configure the GPU node:
kubectl label nodes my-node nvidia.com/mig.config=all-1g.5gb --overwrite
This command applies a MIG profile to my-node that partitions each of the node's GPUs into multiple independent GPU instances, each with one compute slice and 5 GB of video memory.
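As a back-of-the-envelope check, assuming an A100 40 GB (which exposes seven compute slices; other GPU models differ), the all-1g.5gb profile yields:

```python
# Rough arithmetic sketch for the all-1g.5gb profile. Assumption:
# an A100 40 GB with 7 compute slices, where each 1g.5gb MIG
# instance consumes 1 compute slice and 5 GB of memory.
COMPUTE_SLICES = 7
MEMORY_GB = 40
INSTANCE_SLICES, INSTANCE_MEM_GB = 1, 5

# Instances per physical GPU are bounded by both slices and memory.
instances = min(COMPUTE_SLICES // INSTANCE_SLICES,
                MEMORY_GB // INSTANCE_MEM_GB)
print(instances)  # 7
```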
To view all available MIG profiles, run the following command:
kubectl -n stackland-nvidia-gpu get cm default-mig-parted-config -o jsonpath='{.data.config\.yaml}'