Deploying HabanaAI Operator¶

This section describes how to install the HabanaAI Operator on OpenShift and create a DeviceConfig instance.

You can install the HabanaAI Operator using either the Red Hat OpenShift console or the CLI, as described below.

Using Red Hat OpenShift Console¶

Go to Operators.
Click “OperatorHub”.
In the All Items field, search for Habana AI.
Click “Install”.

Using CLI¶

Create a habana-ai-operator-install.yaml file containing the following:

---
apiVersion: v1
kind: Namespace
metadata:
   name: habana-ai-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
   name: habana-ai-operator
   namespace: habana-ai-operator
spec:
   targetNamespaces:
   - habana-ai-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
   name: habana-ai-operator
   namespace: habana-ai-operator
spec:
   channel: stable
   installPlanApproval: Automatic
   name: habana-ai-operator
   source: certified-operators
   sourceNamespace: openshift-marketplace

Apply the YAML file:

oc apply -f habana-ai-operator-install.yaml
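
After the manifest is applied, it can be useful to confirm that the Subscription resolved and the operator pod started. The following is a minimal sketch that wraps three standard `oc get` queries in a function. The function name `check_operator_install` is hypothetical and not part of the operator; the resource and namespace names are taken from the manifest above.

```shell
# Hypothetical helper (not part of the HabanaAI Operator): list the
# Subscription, ClusterServiceVersion, and pods created by the install manifest.
# Optional argument: the namespace (defaults to habana-ai-operator).
check_operator_install() {
  ns="${1:-habana-ai-operator}"
  oc get subscriptions.operators.coreos.com habana-ai-operator -n "$ns" &&
    oc get csv -n "$ns" &&
    oc get pods -n "$ns"
}
```

Calling `check_operator_install` with no arguments queries the habana-ai-operator namespace. A ClusterServiceVersion in the Succeeded phase generally indicates that the installation completed.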

Creating the DeviceConfig Instance¶

The DeviceConfig is the main Custom Resource Definition (CRD) of the HabanaAI Operator.

The table below describes the required fields for creating the DeviceConfig instance:

Component	Field	Description	Scheme	Required
devicePlugin	image	The Intel Gaudi device plugin image to be used.	String	True
devicePlugin	version	The Intel Gaudi device plugin version to be used.	String	True
driver	image	The Intel Gaudi driver image to be used.	String	True
driver	version	The Intel Gaudi driver version to be used.	String	True
habanaRuntime	image	The Intel Gaudi container runtime image to be used.	String	True
habanaRuntime	version	The Intel Gaudi container runtime version to be used.	String	True
nodeMetrics	image	The Intel Gaudi node metrics image to be used.	String	True
nodeMetrics	version	The Intel Gaudi node metrics version to be used.	String	True

You can create the DeviceConfig instance using either the Red Hat OpenShift console or the CLI, as described below.

Using Red Hat OpenShift Console¶

Go to Operators.
Click “Installed Operators”.
In the Name field, name the instance habana-ai.
Under devicePlugin:
1. In the image field, add the Intel Gaudi device plugin image to use:
  vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin
2. In the version field, define the Intel Gaudi device plugin version to use.
Under driver:
1. In the image field, add the Intel Gaudi driver image to use:
  image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver
2. In the version field, define the Intel Gaudi driver version to use.
  Note: If the image does not exist, it will be created and pushed to the specified image:version.
Under habanaRuntime:
1. In the image field, add the Intel Gaudi runtime image to use:
  vault.habana.ai/habana-ocp-operator/<Version>/habana-runtime
2. In the version field, define the Intel Gaudi runtime version to use.
Under nodeMetrics:
1. In the image field, add the Intel Gaudi node metrics image to use:
  vault.habana.ai/gaudi-metric-exporter/metric-exporter
2. In the version field, define the Intel Gaudi node metrics version to use.

Using CLI¶

Create a deviceconfig.yaml file containing the following:

apiVersion: habana.ai/v1
kind: DeviceConfig
metadata:
   name: habana-ai
   namespace: habana-ai-operator
spec:
   devicePlugin:
      image: vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin
      version: [WANTED_HABANA_DEVICE_PLUGIN_VERSION]
   driver:
      image: image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver
      version: [WANTED_HABANA_DRIVER_VERSION]
   habanaRuntime:
      image: vault.habana.ai/habana-ocp-operator/[WANTED_HABANA_RUNTIME_VERSION_UNTIL_DASH]/habana-runtime
      version: [WANTED_HABANA_RUNTIME_VERSION]
   nodeMetrics:
      image: vault.habana.ai/gaudi-metric-exporter/metric-exporter
      version: [WANTED_HABANA_NODE_METRICS_VERSION]

The driver image is created inside the cluster itself and saved to OpenShift’s internal registry, image-registry.openshift-image-registry.svc:5000. To load the image from another registry, replace the registry URL in the driver image field.
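
The bracketed placeholders in the manifest can be filled in by hand or from a script. The sketch below is one way to do it, under stated assumptions: the helper names `version_until_dash` and `render_deviceconfig` are hypothetical, and "version until dash" is read as the part of the version string before the first dash. The version values are quoted so that YAML always parses them as strings, which matches the String type listed in the table.

```shell
# Hypothetical helper: print the part of a version string before the first dash.
version_until_dash() {
  printf '%s\n' "${1%%-*}"
}

# Hypothetical helper: write a DeviceConfig manifest to stdout.
# Arguments: device plugin, driver, runtime, and node metrics versions.
render_deviceconfig() {
  plugin_version="$1"
  driver_version="$2"
  runtime_version="$3"
  metrics_version="$4"
  # The runtime image path embeds the runtime version up to the first dash.
  runtime_dir="$(version_until_dash "$runtime_version")"
  cat <<EOF
apiVersion: habana.ai/v1
kind: DeviceConfig
metadata:
   name: habana-ai
   namespace: habana-ai-operator
spec:
   devicePlugin:
      image: vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin
      version: "${plugin_version}"
   driver:
      image: image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver
      version: "${driver_version}"
   habanaRuntime:
      image: vault.habana.ai/habana-ocp-operator/${runtime_dir}/habana-runtime
      version: "${runtime_version}"
   nodeMetrics:
      image: vault.habana.ai/gaudi-metric-exporter/metric-exporter
      version: "${metrics_version}"
EOF
}
```

For example, `render_deviceconfig 1.16.2-2 1.16.2-2 1.16.2-2 1.16.2-2 > deviceconfig.yaml` writes a manifest whose runtime image path contains `1.16.2`. The version strings here are illustrative only; use the versions that match your installation.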

Apply the YAML file:
```
oc apply -f deviceconfig.yaml
```
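
To inspect the result, the DeviceConfig and the pods in its namespace can be queried with `oc get`. This is a sketch only: the function name `check_deviceconfig` is hypothetical, and the resource name `deviceconfigs.habana.ai` is an assumption derived from the `kind` and `apiVersion` in the manifest, not something this page states.

```shell
# Hypothetical helper: show a DeviceConfig and the pods in its namespace.
# Assumption: the CRD's plural resource name is "deviceconfigs" in group habana.ai.
# Optional arguments: instance name and namespace.
check_deviceconfig() {
  name="${1:-habana-ai}"
  ns="${2:-habana-ai-operator}"
  oc get deviceconfigs.habana.ai "$name" -n "$ns" -o yaml &&
    oc get pods -n "$ns" -o wide
}
```

Calling `check_deviceconfig` with no arguments uses the habana-ai instance name and the habana-ai-operator namespace from the manifest.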

Apply the following patches to set up the OpenShift image registry:

oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"managementState":"Managed"}}' --type=merge
until oc get svc image-registry -n openshift-image-registry; do sleep 10; done
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"managementState":"Unmanaged"}}' --type=merge
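
The `until` loop above waits indefinitely if the image-registry service never appears. A bounded variant is sketched below; the function name `wait_for_registry_service` and its default attempt count and interval are hypothetical choices, not values from the operator documentation.

```shell
# Hypothetical helper: poll for the image-registry service, giving up after
# a fixed number of attempts instead of looping forever.
# Arguments: maximum attempts (default 30), seconds between attempts (default 10).
wait_for_registry_service() {
  max_attempts="${1:-30}"
  interval="${2:-10}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if oc get svc image-registry -n openshift-image-registry >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "image-registry service not found after ${max_attempts} attempts" >&2
  return 1
}
```

With the defaults, `wait_for_registry_service` gives up after roughly five minutes and returns a non-zero status, so a calling script can stop before it applies the final patch.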

Gaudi Documentation 1.16.2 documentation
