Driver and Software Installation

Installation Options

The following options are available for driver and software installation:

- Install driver and software: Installs all packages automatically using the habanalabs-installer.sh script. This is the recommended installation method.
- Upgrade driver and software: Upgrades an existing installation to the latest version.
- Custom driver and software installation: Lets you install each package manually for fine-grained control over the installation process.

Note

- Make sure to review the currently supported versions and operating systems listed in the Support Matrix.
- Driver and software installation is not required if you are using the Intel Gaudi Base Operator for Kubernetes or OpenShift.
- Run the installation with an internet connection available so that the package manager can download and install the dependencies required by the Intel Gaudi software package (using apt-get, yum, dnf install, pip install, and so on).

Install Driver and Software

Install the driver and software using the habanalabs-installer.sh script. For further details on the package installers included, see the Intel Gaudi Software Installers table.
wget -nv https://vault.habana.ai/artifactory/gaudi-installer/1.21.2/habanalabs-installer.sh
chmod +x habanalabs-installer.sh
./habanalabs-installer.sh install --type base
Note
- Update the Linux kernel to the latest supported version before running the habanalabs-installer.sh script. If the kernel is updated during the driver installation, the script might fail. To resolve this, reboot the server to load the updated kernel, then rerun the habanalabs-installer.sh script to install the drivers.
- For the full list of script options, run ./habanalabs-installer.sh --help.
- Adding the --skip-driver-load option to the installation command skips loading the drivers.
- The installation sets the number of huge pages automatically.
- habanalabs-container-runtime and habanalabs-qual-workloads are not installed automatically by habanalabs-installer.sh. Make sure to install them as shown in the steps below. Additionally, habanalabs-tools is not installed automatically. If you are using TPC and writing your own kernels, refer to the TPC Tools Installation Guide to install the habanalabs-tools package.
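Because the installer can fail when a kernel update is waiting for a reboot, it may help to check for that condition before running the script. The following is a minimal sketch and not part of the Intel Gaudi tooling. It assumes an Ubuntu host, where a pending reboot is conventionally signalled by the file /var/run/reboot-required; the function names reboot_pending and running_kernel are hypothetical.

```shell
#!/bin/sh
# Sketch only: detect a pending reboot on Ubuntu before running the installer.
# Assumption: Ubuntu creates /var/run/reboot-required when a package update
# (for example a new kernel) needs a reboot. Other distributions differ.

# reboot_pending [FLAG_FILE]
# Returns 0 if the flag file exists, 1 otherwise.
reboot_pending() {
    flag_file="${1:-/var/run/reboot-required}"
    [ -e "$flag_file" ]
}

# running_kernel
# Prints the running kernel release for comparison with the Support Matrix.
running_kernel() {
    uname -r
}
```

For example, `reboot_pending && echo "Reboot before installing"` prints the message only when the flag file is present.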
If needed, update the FW as described in Firmware Upgrade.
Install optional packages:
Ubuntu 24.04.2/22.04.5
Install the habanalabs-container-runtime package. This package is required for running workloads in containers. Both Docker and Kubernetes are supported:
sudo apt install -y habanalabs-container-runtime
Install the habanalabs-qual-workloads package. This package is required for running the ResNet-50 training stress test plugin:
sudo apt install -y habanalabs-qual-workloads
Install Python and MPI dependencies. This is required for running power stress and EDP tests:
habanalabs-installer.sh install -t deps -y -v
Install ethtool if you are running a multi-server scale-out and need to bring up the accelerator interfaces:
sudo apt install -y ethtool
RHEL 8.6/9.2/9.4 and TencentOS 3.1
Install the habanalabs-container-runtime package. This package is required for running workloads in containers. Both Docker and Kubernetes are supported:
sudo dnf install -y habanalabs-container-runtime
Install the habanalabs-qual-workloads package. This package is required for running the ResNet-50 training stress test plugin:
sudo dnf install -y habanalabs-qual-workloads
Install Python and MPI dependencies. This is required for running power stress and EDP tests:
habanalabs-installer.sh install -t deps -y -v
Install ethtool if you are running a multi-server scale-out and need to bring up the accelerator interfaces:
sudo dnf install -y ethtool
SUSE 15.5
Install the habanalabs-container-runtime package. This package is required for running workloads in containers. Both Docker and Kubernetes are supported:
sudo zypper install -y habanalabs-container-runtime
Install the habanalabs-qual-workloads package. This package is required for running the ResNet-50 training stress test plugin:
sudo zypper install -y habanalabs-qual-workloads
Install Python and MPI dependencies. This is required for running power stress and EDP tests:
habanalabs-installer.sh install -t deps -y -v
Install ethtool if you are running a multi-server scale-out and need to bring up the accelerator interfaces:
sudo zypper install -y ethtool
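The three operating system blocks above differ only in the package manager. As a hedged illustration, the sketch below maps an os-release ID to the matching install command. The ID values (ubuntu, rhel, tencentos, sles) are assumptions about what /etc/os-release reports on the supported systems, and the function names pkg_install_cmd and detect_os_id are hypothetical.

```shell
#!/bin/sh
# Sketch only: map an os-release ID to the install command used above.
# Assumption: the ID values below are what /etc/os-release reports on the
# supported systems; verify on your host before relying on them.

# pkg_install_cmd OS_ID
# Prints the package install command prefix, or returns 1 for an unknown ID.
pkg_install_cmd() {
    case "$1" in
        ubuntu)          echo "sudo apt install -y" ;;
        rhel|tencentos)  echo "sudo dnf install -y" ;;
        sles)            echo "sudo zypper install -y" ;;
        *)               return 1 ;;
    esac
}

# detect_os_id [OS_RELEASE_FILE]
# Prints the ID field of an os-release file (default: /etc/os-release).
detect_os_id() {
    os_release="${1:-/etc/os-release}"
    # Source in a subshell so the file's variables do not leak into the caller.
    ( . "$os_release" && printf '%s\n' "$ID" )
}
```

A possible use is `$(pkg_install_cmd "$(detect_os_id)") habanalabs-container-runtime`, which expands to the apt, dnf, or zypper command shown for the detected system.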

Bring up Accelerator Interfaces

If you are running a multi-server scale-out and have the accelerator interfaces physically connected, make sure the network interfaces are brought up. These interfaces need to be brought up every time the kernel module is loaded or reloaded. The manage_network_ifs.sh script provides a reference for bringing up the interfaces. The script is located at /opt/habanalabs/qual/[gaudi3,gaudi2,gaudi1]/bin/, where the bracketed segment stands for the directory that matches your device.
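The bracketed segment in the path stands for one of three directory names. As a minimal sketch, the helper below prints the first manage_network_ifs.sh it finds. It assumes that only the directory for the installed device is present under /opt/habanalabs/qual; the function name find_manage_script is hypothetical.

```shell
#!/bin/sh
# Sketch only: pick the qual directory that exists on this machine.
# Assumption: only the directory matching the installed device (gaudi3,
# gaudi2, or gaudi1) is present under the base directory.

# find_manage_script [BASE_DIR]
# Prints the path of the first manage_network_ifs.sh found, or returns 1.
find_manage_script() {
    base_dir="${1:-/opt/habanalabs/qual}"
    for gen in gaudi3 gaudi2 gaudi1; do
        candidate="$base_dir/$gen/bin/manage_network_ifs.sh"
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

The result can then be used in place of the bracketed path, for example `"$(find_manage_script)" --status`.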

Bring up accelerator interfaces:

# manage_network_ifs.sh requires ethtool
/opt/habanalabs/qual/[gaudi3,gaudi2,gaudi1]/bin/manage_network_ifs.sh --up

Check the accelerator interfaces status:

/opt/habanalabs/qual/[gaudi3,gaudi2,gaudi1]/bin/manage_network_ifs.sh --status

Output example:

accel0
3 ports up (2, 3, 7)
accel1
3 ports up (17, 20, 21)
accel2
3 ports up (14, 15, 19)
accel3
3 ports up (5, 8, 9)
accel4
3 ports up (17, 20, 21)
accel5
3 ports up (2, 3, 7)
accel6
3 ports up (5, 8, 9)
accel7
3 ports up (14, 15, 19)

Note

The accel[Number] label indicates the index assigned to the OAM by the OS, which may change after a reboot or driver reload. The label corresponds to the AIP, accel[Number], and index labels that the hl-smi tool outputs. For more details, see System Management Interface Tool (hl-smi).
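On a multi-server setup it can be convenient to check the status output automatically. The sketch below assumes the output has exactly the two-line shape shown in the example (a device line followed by an "N ports up" line); the real output may differ between releases. The function names summarize_ports and check_min_ports are hypothetical.

```shell
#!/bin/sh
# Sketch only: summarize "manage_network_ifs.sh --status" output.
# Assumption: each device is printed as "accelN" on one line, followed by
# a line of the form "N ports up (...)", as in the example above.

# summarize_ports
# Reads status output on stdin and prints "accelN COUNT" per device.
summarize_ports() {
    awk '
        NF == 1 && $1 ~ /^accel[0-9]+$/ { dev = $1; next }
        /ports up/                      { printf "%s %d\n", dev, $1 }
    '
}

# check_min_ports MIN
# Reads "accelN COUNT" lines on stdin; exits non-zero if any COUNT < MIN.
check_min_ports() {
    awk -v min="$1" '
        BEGIN    { bad = 0 }
        $2 < min { print "LOW: " $0; bad = 1 }
        END      { exit bad }
    '
}
```

For instance, piping the status output through `summarize_ports | check_min_ports 3` fails when any device reports fewer than three ports up.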

Set Environment Variables

To ensure proper operation, the following environment variables must be set:

export HABANALABS_HLTHUNK_TESTS_BIN_PATH=/opt/habanalabs/src/hl-thunk/tests/
export HABANA_LOGS=/var/log/habana_logs/
export RDMA_CORE_ROOT=/opt/habanalabs/rdma-core/src
export HABANA_PLUGINS_LIB_PATH=/usr/lib/habanatools/habana_plugins
export GC_KERNEL_PATH=/usr/lib/habanalabs/libtpc_kernels.so
export RDMA_CORE_LIB=/opt/habanalabs/rdma-core/src/build/lib
export HABANA_SCAL_BIN_PATH=/opt/habanalabs/engines_fw
export DATA_LOADER_AEON_LIB_PATH=/usr/lib/habanalabs/libaeon.so
export __python_cmd=python3

If these variables are already defined in the system-provided scripts, run the following command in your environment:

source /etc/profile.d/habanalabs*.sh

The command automatically exports all the variables listed above.
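To confirm that the variables are in place after sourcing the profile script, a quick check such as the following may be useful. It is a sketch, not an official tool: it only tests whether each name from the list above is present in the environment, not whether the paths exist. The function name missing_habana_vars is hypothetical.

```shell
#!/bin/sh
# Sketch only: report which of the variables listed above are not exported.

HABANA_ENV_VARS="HABANALABS_HLTHUNK_TESTS_BIN_PATH HABANA_LOGS RDMA_CORE_ROOT
HABANA_PLUGINS_LIB_PATH GC_KERNEL_PATH RDMA_CORE_LIB HABANA_SCAL_BIN_PATH
DATA_LOADER_AEON_LIB_PATH __python_cmd"

# missing_habana_vars
# Prints one name per line for each variable absent from the environment.
missing_habana_vars() {
    for name in $HABANA_ENV_VARS; do
        # printenv exits non-zero when the variable is not in the environment.
        printenv "$name" >/dev/null || printf '%s\n' "$name"
    done
}
```

An empty result from `missing_habana_vars` means all nine variables are exported in the current shell.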

Enable IOMMU Passthrough

Enabling IOMMU passthrough is required only for Ubuntu 24.04.2/22.04.5 with Linux kernel 6.8. Skip this section if a different OS or kernel version is used.

To enable IOMMU passthrough:

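Set GRUB_CMDLINE_LINUX_DEFAULT="iommu=pt intel_iommu=on" in /etc/default/grub. If the variable already has a value, append the two parameters to it instead of adding a second assignment.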
Run sudo update-grub.
Reboot the system.

For more details, see Ubuntu documentation.
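After the reboot, the kernel command line can be inspected to confirm that both parameters took effect. The following is a minimal sketch that reads /proc/cmdline; the function names has_kernel_param and iommu_passthrough_enabled are hypothetical.

```shell
#!/bin/sh
# Sketch only: confirm after reboot that both parameters reached the kernel.

# has_kernel_param PARAM [CMDLINE_FILE]
# Returns 0 if PARAM appears as a whole word in the kernel command line.
has_kernel_param() {
    cmdline_file="${2:-/proc/cmdline}"
    # Split on spaces so "iommu=pt" cannot match inside "intel_iommu=pt".
    tr -s ' ' '\n' < "$cmdline_file" | grep -qxF -- "$1"
}

# iommu_passthrough_enabled [CMDLINE_FILE]
# Returns 0 only if both iommu=pt and intel_iommu=on are present.
iommu_passthrough_enabled() {
    has_kernel_param "iommu=pt" "$1" && has_kernel_param "intel_iommu=on" "$1"
}
```

Running `iommu_passthrough_enabled && echo enabled` prints `enabled` only when both parameters are on the running kernel's command line.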

Set CPU Setting to Performance

The following example shows how to read and set the CPU scaling governor to performance on Ubuntu:

# Get the current governor of each CPU:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Set the governor of every CPU to performance:
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Note

The CPU settings must be updated on bare metal before starting the container.
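Since the setting has to be correct on the host before a container starts, a pre-flight check along these lines might be worth adding to a launch script. This is a sketch under the assumption that the cpufreq sysfs files shown above are present; the function name non_performance_cpus is hypothetical.

```shell
#!/bin/sh
# Sketch only: list CPUs whose scaling governor is not "performance".

# non_performance_cpus [CPU_SYSFS_DIR]
# Prints "cpuN GOVERNOR" for each CPU not set to performance.
non_performance_cpus() {
    cpu_dir="${1:-/sys/devices/system/cpu}"
    for gov_file in "$cpu_dir"/cpu*/cpufreq/scaling_governor; do
        # Skip when the glob did not match or the file is unreadable.
        [ -r "$gov_file" ] || continue
        governor="$(cat "$gov_file")"
        if [ "$governor" != "performance" ]; then
            cpu_name="${gov_file#"$cpu_dir"/}"   # strip the directory prefix
            printf '%s %s\n' "${cpu_name%%/*}" "$governor"
        fi
    done
}
```

An empty result means every CPU that exposes a governor file is already set to performance.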

Update CPU Settings

This section describes how to update CPU settings on Gaudi 3 systems with Sapphire Rapids or Granite Rapids CPUs to optimize performance.

Upgrade Driver and Software

Upgrade the driver and software:

wget -nv https://vault.habana.ai/artifactory/gaudi-installer/1.21.2/habanalabs-installer.sh
chmod +x habanalabs-installer.sh
./habanalabs-installer.sh upgrade --type base

Perform Steps 2 from the previous section Install Driver and Software to complete the upgrade.
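The install and upgrade flows shown on this page use the same three commands and differ only in the subcommand passed to the script. As an illustration, the sketch below prints those commands for review without running them. It assumes that other releases follow the same URL layout as 1.21.2, which this page does not confirm; the function name installer_commands is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the download-and-run commands for review (dry run).
# Assumption: other releases follow the same URL layout as 1.21.2.

# installer_commands ACTION [VERSION]
# ACTION is "install" or "upgrade"; VERSION defaults to 1.21.2.
installer_commands() {
    action="$1"
    version="${2:-1.21.2}"
    case "$action" in
        install|upgrade) ;;
        *) echo "usage: installer_commands install|upgrade [VERSION]" >&2
           return 2 ;;
    esac
    base_url="https://vault.habana.ai/artifactory/gaudi-installer"
    printf 'wget -nv %s/%s/habanalabs-installer.sh\n' "$base_url" "$version"
    printf 'chmod +x habanalabs-installer.sh\n'
    printf './habanalabs-installer.sh %s --type base\n' "$action"
}
```

Calling `installer_commands upgrade` prints the same three lines shown in this section, which makes it easy to review them before pasting into a terminal.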

Custom Driver and Software Installation

To install each package individually, refer to Custom Driver and Software Installation.

Note

While you can install each package manually, using the habanalabs-installer.sh script is the recommended method for installation. For further details, see Driver and Software Installation.

Gaudi Documentation 1.21.1 documentation
