Deploying HabanaAI Operator
This section describes how to install the HabanaAI Operator on OpenShift and create a DeviceConfig instance.
You can install the HabanaAI Operator by using either the Red Hat OpenShift Console or the CLI. Both methods are described below.
Using Red Hat OpenShift Console
Go to Operators.
Click OperatorHub.
In the All Items field, search for Habana AI.
Click Install.
Using the CLI
Create a habana-ai-operator-install.yaml file containing the following:
---
apiVersion: v1
kind: Namespace
metadata:
  name: habana-ai-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: habana-ai-operator
  namespace: habana-ai-operator
spec:
  targetNamespaces:
  - habana-ai-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: habana-ai-operator
  namespace: habana-ai-operator
spec:
  channel: stable
  installPlanApproval: Automatic
  name: habana-ai-operator
  source: certified-operators
  sourceNamespace: openshift-marketplace
Apply the YAML file:
oc apply -f habana-ai-operator-install.yaml
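After the manifest is applied, it can be useful to confirm that the Operator Lifecycle Manager picked up the Subscription and finished installing the operator. The following is a minimal sketch, not an official procedure: the function name operator_install_phase is hypothetical, and it assumes the habana-ai-operator namespace from the manifest above. A ClusterServiceVersion (CSV) in the Succeeded phase generally indicates that the installation completed.

```shell
# Hypothetical helper: print the name and phase of each ClusterServiceVersion
# (CSV) in the operator namespace. "Succeeded" usually means the install finished.
operator_install_phase() {
  oc get csv -n habana-ai-operator \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}'
}

# Usage (requires a logged-in oc session):
#   oc get subscription habana-ai-operator -n habana-ai-operator
#   operator_install_phase
#   oc get pods -n habana-ai-operator
```

If no CSV is listed, checking the Subscription status is usually the next step.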
Creating the DeviceConfig Instance
The DeviceConfig is the main Custom Resource Definition (CRD) of the HabanaAI Operator.
The table below describes the required fields for creating the DeviceConfig instance.
Component | Field | Description | Scheme | Required |
---|---|---|---|---|
devicePlugin | image | The Intel Gaudi device plugin image to be used. | String | True |
devicePlugin | version | The Intel Gaudi device plugin version to be used. | String | True |
driver | image | The Intel Gaudi driver image to be used. | String | True |
driver | version | The Intel Gaudi driver version to be used. | String | True |
habanaRuntime | image | The Intel Gaudi container runtime image to be used. | String | True |
habanaRuntime | version | The Intel Gaudi container runtime version to be used. | String | True |
nodeMetrics | image | The Intel Gaudi node metrics image to be used. | String | True |
nodeMetrics | version | The Intel Gaudi node metrics version to be used. | String | True |
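Because every component needs both an image and a version, a quick local check of a manifest can catch omissions before it reaches the cluster. The sketch below is illustrative only: check_deviceconfig_fields is a hypothetical helper, it matches keys by indentation rather than parsing YAML, and it assumes the manifest uses two-space indentation with the four components directly under spec.

```shell
# Hypothetical helper: report which required DeviceConfig fields are missing
# from a YAML file. Uses simple indentation matching, not a real YAML parser,
# and assumes two-space indentation.
check_deviceconfig_fields() {
  file="$1"
  missing=0
  for component in devicePlugin driver habanaRuntime nodeMetrics; do
    for field in image version; do
      if ! awk -v c="$component" -v f="$field" '
        $1 == (c ":") { in_block = 1; next }   # entered the component block
        in_block && (/^[A-Za-z]/ || /^  [A-Za-z]/) { in_block = 0 }  # block ended
        in_block && $1 == (f ":") && NF >= 2 { found = 1 }  # key with a value
        END { exit (found ? 0 : 1) }
      ' "$file"; then
        echo "missing: $component.$field"
        missing=1
      fi
    done
  done
  return "$missing"
}

# Usage:
#   check_deviceconfig_fields deviceconfig.yaml
```

The function prints one line per missing field and returns a non-zero status if any field is absent.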
You can create the DeviceConfig instance by using either the Red Hat OpenShift Console or the CLI. Both methods are described below.
Using Red Hat OpenShift Console
Go to Operators.
Click Installed Operators.
In the Name field, define the instance as habana-ai.
Under devicePlugin:
In the image field, add the Intel Gaudi device plugin image to use: vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin
In the version field, define the Intel Gaudi device plugin version to use.
Under driver:
In the image field, add the Intel Gaudi driver image to use: image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver
In the version field, define the Intel Gaudi driver version to use.
Note: If the image does not exist, it will be created and pushed to the specified image:version.
Under habanaRuntime:
In the image field, add the Intel Gaudi runtime image to use: vault.habana.ai/habana-ocp-operator/<Version>/habana-runtime
In the version field, define the Intel Gaudi runtime version to use.
Under nodeMetrics:
In the image field, add the Intel Gaudi node metrics image to use: vault.habana.ai/gaudi-metric-exporter/metric-exporter
In the version field, define the Intel Gaudi node metrics version to use.
Using the CLI
Create a deviceconfig.yaml file containing the following:
apiVersion: habana.ai/v1
kind: DeviceConfig
metadata:
  name: habana-ai
  namespace: habana-ai-operator
spec:
  devicePlugin:
    image: vault.habana.ai/docker-k8s-device-plugin/docker-k8s-device-plugin
    version: [WANTED_HABANA_DEVICE_PLUGIN_VERSION]
  driver:
    image: image-registry.openshift-image-registry.svc:5000/habana-ai-operator/habana-ai-driver
    version: [WANTED_HABANA_DRIVER_VERSION]
  habanaRuntime:
    image: vault.habana.ai/habana-ocp-operator/[WANTED_HABANA_RUNTIME_VERSION_UNTIL_DASH]/habana-runtime
    version: [WANTED_HABANA_RUNTIME_VERSION]
  nodeMetrics:
    image: vault.habana.ai/gaudi-metric-exporter/metric-exporter
    version: [WANTED_HABANA_NODE_METRICS_VERSION]
The driver image is built inside the cluster itself and saved to OpenShift's internal registry (image-registry.openshift-image-registry.svc:5000).
To load the image from another registry, replace the registry URL in the driver image field.
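The [WANTED_HABANA_RUNTIME_VERSION_UNTIL_DASH] placeholder in the manifest suggests that the habanaRuntime image path uses only the part of the runtime version that precedes the dash. The sketch below shows one way to derive that path with shell parameter expansion. It is an illustration under that reading of the placeholder: the function name runtime_image_for_version is hypothetical and the version string in the usage comment is made up.

```shell
# Hypothetical helper: build the habanaRuntime image path from a full version
# string by keeping only the part before the first dash (assumed meaning of
# the "UNTIL_DASH" placeholder).
runtime_image_for_version() {
  full_version="$1"
  short_version="${full_version%%-*}"   # drop the first dash and everything after it
  echo "vault.habana.ai/habana-ocp-operator/${short_version}/habana-runtime"
}

# Example with a made-up version string:
#   runtime_image_for_version "1.2.3-45"
#   prints vault.habana.ai/habana-ocp-operator/1.2.3/habana-runtime
```

A version string without a dash passes through unchanged.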
Apply the YAML file:
oc apply -f deviceconfig.yaml
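As an optional variation on the apply step, the manifest can first be validated with a server-side dry run so that schema errors surface before anything is created. This is a sketch rather than a required procedure; the function name apply_deviceconfig is hypothetical, and it assumes the operator (and therefore the DeviceConfig CRD) is already installed.

```shell
# Hypothetical helper: validate a DeviceConfig manifest with a server-side
# dry run, and apply it only if validation passes.
apply_deviceconfig() {
  manifest="${1:-deviceconfig.yaml}"
  oc apply --dry-run=server -f "$manifest" || return 1
  oc apply -f "$manifest"
}

# Usage (requires a logged-in oc session):
#   apply_deviceconfig deviceconfig.yaml
#   oc get deviceconfig habana-ai -n habana-ai-operator -o yaml
#   oc get pods -n habana-ai-operator
```

Listing the pods in the habana-ai-operator namespace afterwards is a simple way to see whether the operator has started rolling out its components.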
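Apply the following patches to set up the internal image registry. The commands configure emptyDir storage for the registry, set its management state to Managed, wait for the image-registry service to appear, and then set the management state to Unmanaged: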
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"managementState":"Managed"}}' --type=merge
until oc get svc image-registry -n openshift-image-registry; do sleep 10; done
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"managementState":"Unmanaged"}}' --type=merge
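The until loop above waits indefinitely if the image-registry service never appears. A bounded variant such as the following sketch gives up after a fixed number of attempts. The function name wait_for_registry_service and its default values are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: wait for the image-registry service, giving up after
# a maximum number of attempts instead of looping forever.
wait_for_registry_service() {
  max_attempts="${1:-30}"   # number of checks before giving up (assumed default)
  interval="${2:-10}"       # seconds to sleep between checks
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if oc get svc image-registry -n openshift-image-registry >/dev/null 2>&1; then
      echo "image-registry service is available"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "image-registry service not found after $max_attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_registry_service 30 10
#   oc get configs.imageregistry.operator.openshift.io cluster \
#     -o jsonpath='{.spec.managementState}'
```

The last command in the usage comment prints the current management state, which can help confirm that the final patch took effect.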