Installing SynapseAI SW Packages Individually

Installing the packages on a machine with an internet connection allows the package manager to download and install the dependencies required by the SynapseAI packages (via apt-get, yum install, pip install, etc.). The installation consists of the following installers:

  • habanalabs-graph – installs the Graph Compiler and the run-time.

  • habanalabs-thunk – installs the thunk library.

  • habanalabs-dkms – installs the PCIe driver.

  • habanalabs-firmware – installs the Gaudi Firmware.

  • habanalabs-firmware-tools – installs various Firmware tools (hlml, hl-smi, etc.).

  • habanalabs-qual – installs the qualification application package. See Qualification Library.

  • habanalabs-container-runtime – installs the container runtime library.

Note

The commands below install the latest version. To install a different version, run the same commands with a specific build number.

Package Retrieval (Debian-based distributions, apt):

  1. Download and install the public key:

curl -X GET https://vault.habana.ai/artifactory/api/gpg/key/public | sudo apt-key add --
  2. Get the release codename of the operating system:

lsb_release -c | awk '{print $2}'
  3. Create an apt source file /etc/apt/sources.list.d/artifactory.list with the following content:

deb https://vault.habana.ai/artifactory/debian <OS name from previous step> main
  4. Update the Debian package cache:

sudo dpkg --configure -a

sudo apt-get update
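
The source-file step above can be scripted. The following is a minimal sketch, not an official installer: the helper name habana_apt_source_line is hypothetical, while the repository line and the file path are the ones given in the steps above.

```shell
# Hypothetical helper: print the apt source line for a given release codename.
habana_apt_source_line() {
  local codename="$1"
  printf 'deb https://vault.habana.ai/artifactory/debian %s main\n' "$codename"
}

# Example usage (requires root and lsb_release; uncomment to run):
# habana_apt_source_line "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list.d/artifactory.list
```

Here lsb_release -cs prints only the codename, which is equivalent to the lsb_release -c | awk pipeline shown above.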

KMD Dependencies:

  1. Install the required Deb packages:

sudo apt install dkms libelf-dev
  2. Install the kernel headers:

sudo apt install linux-headers-$(uname -r)
  3. After a kernel upgrade, reboot your machine.
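
Because DKMS builds the driver against the headers of the running kernel, it can help to confirm that the headers are present before installing the driver. The following is a minimal sketch with hypothetical helper names; it assumes the usual Linux layout in which /lib/modules/<release>/build points to the kernel build tree.

```shell
# Hypothetical helper: path where the build tree for a kernel release is expected.
# Defaults to the running kernel when no release is given.
kernel_build_dir() {
  local release="${1:-$(uname -r)}"
  printf '/lib/modules/%s/build\n' "$release"
}

# Hypothetical helper: succeed if the build tree for the given release exists.
kernel_headers_present() {
  [ -d "$(kernel_build_dir "$1")" ]
}

# Example usage:
# kernel_headers_present || echo "Kernel headers missing for $(uname -r)"
```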

Firmware Installation:

Install the Firmware:

sudo apt install -y habanalabs-firmware

Driver Installation:

The habanalabs-dkms_all package installs both the habanalabs and habanalabs_en (Ethernet) drivers. If automation scripts are used, the scripts must be modified to load/unload both drivers.

On kernels 5.12 and later, you can load/unload the two drivers in no specific order. On kernels below 5.12, the habanalabs_en driver must be loaded before the habanalabs driver and unloaded after the habanalabs driver.

  1. Run the following command to install both the habanalabs and habanalabs_en drivers:

sudo apt install -y habanalabs-dkms
  2. Load the habanalabs_en driver first and the habanalabs driver after:

sudo modprobe <driver name>
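
For automation scripts, the kernel-version rule above can be checked programmatically. The following is a minimal sketch; the function name requires_ordered_load is hypothetical, and it assumes the release string starts with <major>.<minor>, as printed by uname -r.

```shell
# Hypothetical helper: succeed (exit 0) when the kernel release is below 5.12,
# that is, when habanalabs_en must be loaded before habanalabs.
requires_ordered_load() {
  local release="${1:-$(uname -r)}"
  local major minor
  major="${release%%.*}"        # text before the first dot
  minor="${release#*.}"         # text after the first dot
  minor="${minor%%[!0-9]*}"     # keep only the leading digits
  [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 12 ]; }
}

# Example usage. Loading habanalabs_en first is valid on every kernel,
# so a script can always use this order:
# sudo modprobe habanalabs_en && sudo modprobe habanalabs
```

Always loading habanalabs_en first (and unloading it last) keeps a script correct on both older and newer kernels, so the check is mainly useful for diagnostics.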

Thunk Installation:

Install the thunk library:

sudo apt install -y habanalabs-thunk

FW Tools Installation:

Install Firmware tools:

sudo apt install -y habanalabs-firmware-tools

Graph Compiler and Run-time Installation:

Install the graph compiler and run-time:

sudo apt install -y habanalabs-graph

(Optional) Qual Installation:

Install hl_qual:

sudo apt install -y habanalabs-qual

For further details, see Gaudi Qualification Library.

Container Runtime Installation:

Install container runtime:

sudo apt install -y habanalabs-container-runtime

Update Environment Variables and More

When the installation is complete, close the shell and reopen it, or run the following:

source /etc/profile.d/habanalabs.sh

source ~/.bashrc

Package Retrieval (Amazon Linux 2):

  1. Create /etc/yum.repos.d/Habana-Vault.repo with the following content:

[vault]
name=Habana Vault
baseurl=https://vault.habana.ai/artifactory/AmazonLinux2
enabled=1
gpgcheck=0
gpgkey=https://vault.habana.ai/artifactory/AmazonLinux2/repodata/repomd.xml.key
repo_gpgcheck=0
  2. Update the YUM cache by running the following command:

sudo yum makecache
  3. Verify correct binding by running the following command:

yum search habana

This searches for and lists all packages that match the word habana.
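
The repository file can also be generated from a script. The following is a minimal sketch, assuming signature checks stay disabled as in the listing above; the helper name habana_repo_file is hypothetical, and the gpgkey line is left out because yum only consults it when gpgcheck or repo_gpgcheck is enabled.

```shell
# Hypothetical helper: print a Habana Vault repo definition for a given base URL.
habana_repo_file() {
  local baseurl="$1"
  printf '%s\n' \
    '[vault]' \
    'name=Habana Vault' \
    "baseurl=${baseurl}" \
    'enabled=1' \
    'gpgcheck=0' \
    'repo_gpgcheck=0'
}

# Example usage (requires root; uncomment to run):
# habana_repo_file https://vault.habana.ai/artifactory/AmazonLinux2 | sudo tee /etc/yum.repos.d/Habana-Vault.repo
```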

KMD Dependencies:

  1. Check your Linux kernel version:

uname -r
  2. Install the kernel headers:

sudo yum install kernel-devel
  3. After a kernel upgrade, reboot your machine.
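
Since sudo yum install kernel-devel installs the newest available headers, they may not match the running kernel until the machine is updated and rebooted. The following is a minimal sketch for checking the match; the helper name kernel_devel_pkg is hypothetical, and it assumes the common RPM naming in which the headers package is called kernel-devel-<output of uname -r>.

```shell
# Hypothetical helper: name of the kernel-devel package that matches a kernel release.
# Defaults to the running kernel when no release is given.
kernel_devel_pkg() {
  local release="${1:-$(uname -r)}"
  printf 'kernel-devel-%s\n' "$release"
}

# Example usage (uncomment to run on an RPM-based system):
# rpm -q "$(kernel_devel_pkg)" || echo "No kernel-devel package matches $(uname -r)"
```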

Additional Dependencies:

Add yum-utils:

sudo yum install -y yum-utils

Firmware Installation:

Install the Firmware:

sudo yum install -y habanalabs-firmware

Driver Installation:

The habanalabs-dkms_all package installs both the habanalabs and habanalabs_en (Ethernet) drivers. If automation scripts are used, the scripts must be modified to load/unload both drivers.

On kernels 5.12 and later, you can load/unload the two drivers in no specific order. On kernels below 5.12, the habanalabs_en driver must be loaded before the habanalabs driver and unloaded after the habanalabs driver.

The commands below install or uninstall both the habanalabs and habanalabs_en drivers.

  1. (Recommended) Remove the previous driver package:

sudo yum remove 'habanalabs*'
  2. Install the driver:

sudo yum install -y habanalabs
  3. Load the habanalabs_en driver first and the habanalabs driver after:

sudo modprobe <driver name>

Thunk Installation:

Install the thunk library:

sudo yum install -y habanalabs-thunk

FW Tools Installation:

Install Firmware tools:

sudo yum install -y habanalabs-firmware-tools

Graph Compiler and Run-time Installation:

Install the graph compiler and run-time:

sudo yum install -y habanalabs-graph

(Optional) Qual Installation:

Install hl_qual:

sudo yum install -y habanalabs-qual

For further details, see Qualification Library.

Container Runtime Installation:

Install container runtime:

sudo yum install -y habanalabs-container-runtime

Update Environment Variables and More

When the installation is complete, close the shell and reopen it, or run the following:

source /etc/profile.d/habanalabs.sh

source ~/.bashrc

Package Retrieval (RHEL 8.6):

  1. Create /etc/yum.repos.d/Habana-Vault.repo with the following content:

[vault]
name=Habana Vault
baseurl=https://vault.habana.ai/artifactory/rhel/8/8.6
enabled=1
repo_gpgcheck=0
  2. Update the YUM cache by running the following command:

sudo yum makecache
  3. Verify correct binding by running the following command:

yum search habana

This searches for and lists all packages that match the word habana.

  4. Reinstall the libarchive package by running the following command:

sudo dnf install -y 'libarchive*'

KMD Dependencies:

  1. Check your Linux kernel version:

uname -r
  2. Install the kernel headers:

sudo yum install kernel-devel
  3. After a kernel upgrade, reboot your machine.

Additional Dependencies:

Add yum-utils:

sudo yum install -y yum-utils

Firmware Installation:

Install the Firmware:

sudo yum install -y habanalabs-firmware

Driver Installation:

The habanalabs-dkms_all package installs both the habanalabs and habanalabs_en (Ethernet) drivers. If automation scripts are used, the scripts must be modified to load/unload both drivers.

On kernels 5.12 and later, you can load/unload the two drivers in no specific order. On kernels below 5.12, the habanalabs_en driver must be loaded before the habanalabs driver and unloaded after the habanalabs driver.

The commands below install or uninstall both the habanalabs and habanalabs_en drivers.

  1. (Recommended) Remove the previous driver package:

sudo yum remove 'habanalabs*'
  2. Install the driver:

sudo yum install -y habanalabs
  3. Load the habanalabs_en driver first and the habanalabs driver after:

sudo modprobe <driver name>

Thunk Installation:

Install the thunk library:

sudo yum install -y habanalabs-thunk

FW Tools Installation:

Install Firmware tools:

sudo yum install -y habanalabs-firmware-tools

Graph Compiler and Run-time Installation:

Install the graph compiler and run-time:

sudo yum install -y habanalabs-graph

(Optional) Qual Installation:

Install hl_qual:

sudo yum install -y habanalabs-qual

For further details, see Qualification Library.

Container Runtime Installation:

Install container runtime:

sudo yum install -y habanalabs-container-runtime

Update Environment Variables and More

When the installation is complete, close the shell and reopen it, or run the following:

source /etc/profile.d/habanalabs.sh

source ~/.bashrc

Set Number of Huge Pages

Some training models use huge pages. It is recommended to set the number of huge pages as provided below:

# Set the current number of huge pages
sudo sysctl -w vm.nr_hugepages=15000
# Remove the old entry from sysctl.conf, if one exists
sudo sed --in-place '/nr_hugepages/d' /etc/sysctl.conf
# Append the huge pages setting so that it persists across reboots
echo "vm.nr_hugepages=15000" | sudo tee -a /etc/sysctl.conf
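
To see how much memory this setting reserves, the values can be read back from /proc/meminfo. The following is a minimal sketch; the helper name hugepages_reserved_mib is hypothetical. As a worked example, and assuming a 2048 kB huge page size (an assumption, check Hugepagesize on your system), 15000 pages × 2048 kB = 30,720,000 kB, which is 30,000 MiB or roughly 29.3 GiB.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# memory reserved for huge pages in MiB (HugePages_Total * Hugepagesize / 1024).
hugepages_reserved_mib() {
  awk '/^HugePages_Total:/ { n = $2 }
       /^Hugepagesize:/    { kb = $2 }
       END                 { printf "%d\n", n * kb / 1024 }'
}

# Example usage:
# hugepages_reserved_mib < /proc/meminfo
```

If the kernel could not allocate all requested pages, HugePages_Total is lower than the requested value, so the reported size is also a quick check that the setting took effect.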

Bring up Network Interfaces

If you train using the Gaudi network interfaces for multi-node scale-out (external Gaudi network interfaces between servers), make sure the network interfaces are brought up. They must be brought up again every time the kernel module is loaded, including after it is unloaded and reloaded.

Note

This section is not relevant for AWS users.

A reference for bringing up the interfaces is provided in the manage_network_ifs.sh script.

Use the following commands:

# manage_network_ifs.sh requires ethtool
# (on yum-based systems, use: sudo yum install -y ethtool)
sudo apt-get install ethtool
./manage_network_ifs.sh --up
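
Because the interfaces must be brought up again after every driver reload, an automation script may want to combine both steps. The following is a minimal sketch under stated assumptions: the function names and the DRY_RUN variable are hypothetical, manage_network_ifs.sh is assumed to be in the current directory, and the driver order follows the rule for kernels below 5.12 described earlier.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: reload both drivers, then bring the interfaces up again.
# habanalabs is unloaded first and loaded last, which is valid on every kernel.
reload_drivers_and_bring_up_ifs() {
  run sudo modprobe -r habanalabs &&
  run sudo modprobe -r habanalabs_en &&
  run sudo modprobe habanalabs_en &&
  run sudo modprobe habanalabs &&
  run ./manage_network_ifs.sh --up
}

# Example usage: preview the commands without executing them.
# DRY_RUN=1 reload_drivers_and_bring_up_ifs
```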