Enabling Plugins
Before you start training, enable the Intel® Gaudi® AI accelerator device plugin, the EFA device plugin, and the Intel Gaudi MPI Operator, as explained in the following sections.
Enable Gaudi Device
To enable Gaudi devices, run the Intel Gaudi device plugin on all the nodes that are equipped with Gaudi by deploying the following DaemonSet:
kubectl create -f https://vault.habana.ai/artifactory/docker-k8s-device-plugin/habana-k8s-device-plugin.yaml
Check the device plugin deployment status by running the following command:
kubectl get pods -n habana-system
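Once the plugin pods are running, it can be useful to confirm that each node actually advertises its Gaudi devices to the scheduler. The following is a minimal sketch, not part of the official procedure. It assumes the plugin registers the extended resource name habana.ai/gaudi, and the helper name list_gaudi_nodes is hypothetical.

```shell
# Hypothetical helper: print every node with the number of Gaudi devices
# it reports as allocatable.
# Assumption: the device plugin registers the resource name habana.ai/gaudi.
# The dots inside the resource name are escaped so that kubectl treats the
# name as a single key rather than a nested path.
list_gaudi_nodes() {
  kubectl get nodes \
    -o 'custom-columns=NODE:.metadata.name,GAUDI:.status.allocatable.habana\.ai/gaudi'
}
```

Run list_gaudi_nodes after defining it. Nodes without the plugin, or without Gaudi hardware, are expected to show <none> in the GAUDI column.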
Enable EFA Device Plugin
To enable EFA, run the EFA device plugin by deploying the following DaemonSet:
kubectl apply -f https://raw.githubusercontent.com/aws-samples/aws-efa-eks/main/manifest/efa-k8s-device-plugin.yml
Check the device plugin deployment status by running the following command:
kubectl get pods -A
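Because the command above lists pods from every namespace, the EFA plugin pods can be hard to spot on a busy cluster. The sketch below narrows the output to pods whose line mentions "efa" while keeping the column header. It assumes the plugin pod names contain the string "efa", and the helper name efa_plugin_pods is hypothetical.

```shell
# Hypothetical helper: show only the EFA device plugin pods.
# Assumption: the plugin pod names contain "efa" (case-insensitive match).
# NR == 1 keeps the column header printed by kubectl.
efa_plugin_pods() {
  kubectl get pods -A -o wide | awk 'NR == 1 || tolower($0) ~ /efa/'
}
```

Run efa_plugin_pods after defining it. With -o wide, the NODE column shows where each plugin pod is scheduled, which helps confirm that every EFA-capable node is covered.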
Enable Intel Gaudi MPI Operator for MPIJob
To enable the MPIJob type for a multi-node cluster, install the MPI Operator. For further information, refer to the Kubeflow mpi-operator installation guide.
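After installing the operator, a quick way to confirm that the cluster accepts MPIJob resources is to look for the corresponding custom resource definition. This is a minimal sketch under the assumption that the Kubeflow operator registers the CRD as mpijobs.kubeflow.org. The helper name mpijob_crd_ready is hypothetical.

```shell
# Hypothetical helper: succeed only if the MPIJob CRD is registered.
# Assumption: the Kubeflow MPI Operator installs the CRD under the name
# mpijobs.kubeflow.org. Output is discarded; only the exit status matters.
mpijob_crd_ready() {
  kubectl get crd mpijobs.kubeflow.org >/dev/null 2>&1
}
```

For example, running mpijob_crd_ready && echo "MPIJob CRD is installed" prints the message only when the lookup succeeds. A failure can also mean that kubectl cannot reach the cluster, so check connectivity before reinstalling the operator.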