Deploying HabanaAI Operator

This section describes how to install the HabanaAI Operator on OpenShift and create a DeviceConfig instance.

You can install the HabanaAI Operator by using either the Red Hat OpenShift Console or the CLI. Both methods are described below.

Using the Red Hat OpenShift Console

  1. Go to Operators.

  2. Click OperatorHub.

  3. In the All Items field, search for Habana AI.

  4. Click Install.

Figure: HabanaAI Operator installation

Using the CLI

  1. Create a habana-ai-operator-install.yaml file containing the following:

---
apiVersion: v1
kind: Namespace
metadata:
   name: habana-ai-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
   name: habana-ai-operator
   namespace: habana-ai-operator
spec:
   targetNamespaces:
   - habana-ai-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
   name: habana-ai-operator
   namespace: habana-ai-operator
spec:
   channel: stable
   installPlanApproval: Automatic
   name: habana-ai-operator
   source: certified-operators
   sourceNamespace: openshift-marketplace

  2. Apply the YAML file.

oc apply -f habana-ai-operator-install.yaml
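
After the manifest is applied, it can help to confirm that the installation completed. The following is a minimal sketch and not part of the official procedure: it reads the installed ClusterServiceVersion (CSV) name from the habana-ai-operator Subscription and polls until that CSV reports the Succeeded phase. The function name wait_for_operator, the default attempt count, and the default delay are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: wait until the CSV installed by the habana-ai-operator
# Subscription reaches the "Succeeded" phase.
# Arguments: namespace, max attempts (default 30), seconds between attempts (default 10).
wait_for_operator() {
    ns="$1"
    attempts="${2:-30}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Name of the CSV that the Subscription installed (empty until resolved).
        csv=$(oc get subscriptions.operators.coreos.com habana-ai-operator -n "$ns" \
            -o jsonpath='{.status.installedCSV}' 2>/dev/null) || csv=""
        if [ -n "$csv" ]; then
            phase=$(oc get csv "$csv" -n "$ns" \
                -o jsonpath='{.status.phase}' 2>/dev/null) || phase=""
            if [ "$phase" = "Succeeded" ]; then
                echo "operator installed: $csv"
                return 0
            fi
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for operator in $ns" >&2
    return 1
}

# Example invocation (requires a logged-in oc session):
#   wait_for_operator habana-ai-operator
```

The helper returns a non-zero status on timeout, so it can gate later steps in a script.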

Creating the DeviceConfig Instance

The DeviceConfig is the main Custom Resource Definition (CRD) of the HabanaAI Operator.

The table below describes the required fields for creating the DeviceConfig instance.

Component      Field    Description                                            Scheme  Required
-------------  -------  -----------------------------------------------------  ------  --------
devicePlugin   image    The Intel Gaudi device plugin image to be used.        String  True
devicePlugin   version  The Intel Gaudi device plugin version to be used.      String  True
driver         image    The Intel Gaudi driver image to be used.               String  True
driver         version  The Intel Gaudi driver version to be used.             String  True
habanaRuntime  image    The Intel Gaudi container runtime image to be used.    String  True
habanaRuntime  version  The Intel Gaudi container runtime version to be used.  String  True
nodeMetrics    image    The Intel Gaudi node metrics image to be used.         String  True
nodeMetrics    version  The Intel Gaudi node metrics version to be used.       String  True
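
The required fields above can be checked in a manifest before it is applied. The sketch below is a rough text check, not a YAML parser and not an official tool: it assumes each component key sits on its own line, followed by indented image and version lines. The function name check_deviceconfig is hypothetical. The check only confirms that each field has some value; it does not validate the value itself.

```shell
#!/bin/sh
# Hypothetical helper: list required DeviceConfig fields missing from a YAML file.
# Assumes the simple block layout "component:" followed by "image: ..." and
# "version: ..." lines. Returns 0 when all eight fields are present.
check_deviceconfig() {
    file="$1"
    status=0
    for component in devicePlugin driver habanaRuntime nodeMetrics; do
        for field in image version; do
            if ! awk -v comp="$component" -v field="$field" '
                # Enter the component block when its key appears.
                $1 == (comp ":") { inside = 1; next }
                # Any other bare "key:" line starts a different block.
                inside && NF == 1 && $1 ~ /:$/ { inside = 0 }
                # A field counts only if it carries a value.
                inside && $1 == (field ":") && NF >= 2 { found = 1 }
                END { exit (found ? 0 : 1) }
            ' "$file"; then
                echo "missing: $component.$field"
                status=1
            fi
        done
    done
    return "$status"
}

# Example invocation:
#   check_deviceconfig deviceconfig.yaml
```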

You can create the DeviceConfig instance by using either the Red Hat OpenShift Console or the CLI. Both methods are described below.

Using the Red Hat OpenShift Console

  1. Go to Operators.

  2. Click Installed Operators.

  3. In the Name field, define the instance as habana-ai.

  4. Under devicePlugin:

    1. In the image field, add the Intel Gaudi device plugin image to use: vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin.

    2. In the version field, define the Intel Gaudi device plugin version to use.

  5. Under driver:

    1. In the image field, add the Intel Gaudi driver image to use: image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver.

    2. In the version field, define the Intel Gaudi driver version to use. Note: If the image does not exist, it will be created and pushed to the specified image:version.

  6. Under habanaRuntime:

    1. In the image field, add the Intel Gaudi runtime image to use: vault.habana.ai/habana-ocp-operator/<Version>/habana-runtime.

    2. In the version field, define the Intel Gaudi runtime version to use.

  7. Under nodeMetrics:

    1. In the image field, add the Intel Gaudi node metrics image to use: vault.habana.ai/gaudi-metric-exporter/metric-exporter.

    2. In the version field, define the Intel Gaudi node metrics version to use.

Figure: Create DeviceConfig instance
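
The <Version> segment of the habanaRuntime image path appears to be the runtime version up to its first dash. This reading is inferred from the placeholder name WANTED_HABANA_RUNTIME_VERSION_UNTIL_DASH in the CLI template in this section, so treat it as an assumption. The sketch below derives the image path from a full version string under that assumption. The function name runtime_image_for and the sample version string are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the habana-runtime image path for a given version.
# Assumption: the path segment is the version string up to, but not including,
# the first dash.
runtime_image_for() {
    version="$1"
    prefix="${version%%-*}"   # strip the first dash and everything after it
    echo "vault.habana.ai/habana-ocp-operator/${prefix}/habana-runtime"
}

# Example with a made-up version string:
#   runtime_image_for 1.2.3-45
```

With the made-up input 1.2.3-45, the function prints vault.habana.ai/habana-ocp-operator/1.2.3/habana-runtime. A version without a dash is used unchanged.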

Using the CLI

  1. Create a deviceconfig.yaml file containing the following:

apiVersion: habana.ai/v1
kind: DeviceConfig
metadata:
   name: habana-ai
   namespace: habana-ai-operator
spec:
   devicePlugin:
     image: vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin
     version: [WANTED_HABANA_DEVICE_PLUGIN_VERSION]
   driver:
     image: image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver
     version: [WANTED_HABANA_DRIVER_VERSION]
   habanaRuntime:
     image: vault.habana.ai/habana-ocp-operator/[WANTED_HABANA_RUNTIME_VERSION_UNTIL_DASH]/habana-runtime
     version: [WANTED_HABANA_RUNTIME_VERSION]
   nodeMetrics:
     image: vault.habana.ai/gaudi-metric-exporter/metric-exporter
     version: [WANTED_HABANA_NODE_METRICS_VERSION]

The driver image is built inside the cluster and saved to OpenShift's internal registry, image-registry.openshift-image-registry.svc:5000. To load the image from another registry, replace the URL.

  2. Apply the YAML file.

oc apply -f deviceconfig.yaml

  3. Apply the following patches to set up the image registry:

oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"managementState":"Managed"}}' --type=merge
until oc get svc image-registry -n openshift-image-registry; do sleep 10; done
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"managementState":"Unmanaged"}}' --type=merge
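
The until loop above retries indefinitely if the image-registry Service never appears. As a hedged alternative, the sketch below waits for the same Service but gives up after a fixed number of attempts. The function name wait_for_registry and the default limits are illustrative assumptions, not values from the Intel Gaudi documentation.

```shell
#!/bin/sh
# Hypothetical helper: wait for the image-registry Service, giving up after a limit.
# Arguments: max attempts (default 30), seconds between attempts (default 10).
wait_for_registry() {
    attempts="${1:-30}"
    delay="${2:-10}"
    i=0
    until oc get svc image-registry -n openshift-image-registry >/dev/null 2>&1; do
        i=$((i + 1))
        if [ "$i" -ge "$attempts" ]; then
            echo "image-registry service not found after $attempts attempts" >&2
            return 1
        fi
        sleep "$delay"
    done
    echo "image-registry service is available"
}

# Example invocation, in place of the unbounded until loop:
#   wait_for_registry 30 10
```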