Intel Gaudi Device Plugin for Kubernetes
This Kubernetes device plugin registers Intel Gaudi devices with a container cluster so that they can be used for compute workloads. With the appropriate hardware and this plugin deployed in your Kubernetes cluster, you can run jobs on Gaudi devices.
The Intel Gaudi device plugin for Kubernetes is a DaemonSet that automatically:
Registers Gaudi devices in your Kubernetes cluster.
Keeps track of device health.
Note
Make sure to review the supported Kubernetes versions listed in the Support Matrix.
Make sure the Intel Gaudi software drivers are loaded on the system. To load the drivers, run:
sudo modprobe habanalabs && sudo modprobe habanalabs_cn && sudo modprobe habanalabs_ib && sudo modprobe habanalabs_en
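To confirm that the modules from the command above are actually loaded, a small helper can compare them against the output of lsmod. The following is a minimal sketch, not part of the Intel Gaudi tooling: the function name missing_gaudi_modules is hypothetical, and the module list is copied from the modprobe command above.

```shell
# Hypothetical helper (not shipped with the Intel Gaudi software).
# Reads `lsmod`-style output on stdin and prints each expected driver
# module that does not appear in it. Prints nothing if all are loaded.
missing_gaudi_modules() {
    # Skip the lsmod header row and keep only the module-name column.
    loaded_modules="$(awk 'NR > 1 { print $1 }')"
    # Module names taken from the modprobe command in this note.
    for module in habanalabs habanalabs_cn habanalabs_ib habanalabs_en; do
        if ! printf '%s\n' "$loaded_modules" | grep -qxF "$module"; then
            echo "$module"
        fi
    done
}

# Usage on a Gaudi node:
#   lsmod | missing_gaudi_modules
```

An empty result means every module named in the modprobe command is present; any printed name is a module that still needs to be loaded.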
Deploying Intel Gaudi Device Plugin for Kubernetes
Run the device plugin on all Gaudi nodes by deploying the DaemonSet using the kubectl create command. Use the habana-k8s-device-plugin.yaml file to set up the environment:
kubectl create -f https://vault.habana.ai/artifactory/docker-k8s-device-plugin/habana-k8s-device-plugin.yaml
Note
kubectl requires access to a Kubernetes cluster to run its commands. To verify that kubectl can reach the cluster, run:
kubectl get pod -A
Check the device plugin deployment status by running the following command:
kubectl get pods -n habana-system
Expected result:
NAME                                       READY   STATUS    RESTARTS   AGE
habanalabs-device-plugin-daemonset-qtpnh   1/1     Running   0          2d11h
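For scripted deployments, the same status check can be automated by parsing the table that kubectl get pods prints. The sketch below assumes the default five-column output shown above. The function name all_pods_ready is hypothetical and not part of kubectl or the device plugin.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin.
# Exits 0 only if at least one pod is listed and every pod has
# STATUS "Running" with all of its containers ready (READY is n/n).
all_pods_ready() {
    awk '
        NR == 1 { next }              # skip the header row
        {
            seen++
            split($2, ready, "/")     # READY column, for example "1/1"
            if ($3 != "Running" || ready[1] != ready[2]) bad++
        }
        END { exit (seen > 0 && bad == 0 ? 0 : 1) }
    '
}

# Usage:
#   kubectl get pods -n habana-system | all_pods_ready && echo "device plugin is running"
```

The check treats an empty pod list as a failure, since no listed pods in habana-system usually means the DaemonSet has not been scheduled on any node.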