3. Installation Guide

3.1. Overview

This document describes how to obtain and install the SynapseAI software package and the TensorFlow software package.

For additional install and setup details, refer to the Setup and Install GitHub page.

3.1.1. Release Details

This release was tested and validated on the following configurations.

Distro    Version   Kernels             CPU Type
------    -------   -------             --------
Ubuntu    18.04     4.15 and above      Intel x86_64
Ubuntu    20.04     5.4.0 and above     Intel x86_64
Amazon    Linux2    5.4.0 and above     Intel x86_64
CentOS    7.8       4.9.184 and above   Intel x86_64

3.1.2. Release Versions

Components     Version
----------     -------
Build number   0.15.0-547

3.1.3. Package Content

The installation contains the following Installers:

  • habanalabs-firmware – installs the Gaudi Firmware fit images.

  • habanalabs-graph – installs the Graph Compiler and the run-time.

  • habanalabs-thunk – installs the thunk library.

  • habanalabs-dkms – installs the PCIe driver.

  • habanalabs-firmware-tools – installs various Firmware tools (hlml, hl-smi, etc.).

  • habanalabs-qual – installs the qualification application package. See Qualification Library.

  • habanalabs-container-runtime – installs the container runtime library.

Refer to Update your Software to obtain the latest installers according to the supported Operating Systems.

3.2. Ubuntu - Package Installation

Installing the package with an internet connection available allows the installer to download and install the required dependencies for the SynapseAI package (via apt-get, pip install, etc.).

3.2.1. Package Retrieval

  1. Download and install the public key:

curl -X GET https://vault.habana.ai/artifactory/api/gpg/key/public | sudo apt-key add --
  2. Create an apt source file /etc/apt/sources.list.d/artifactory.list containing the line deb https://vault.habana.ai/artifactory/debian bionic main (see the example below).
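
For example, one way to create the file with the repository line from this step:

echo "deb https://vault.habana.ai/artifactory/debian bionic main" | sudo tee /etc/apt/sources.list.d/artifactory.list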

  3. Update the package cache:

sudo dpkg --configure -a

sudo apt-get update
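
Optionally, you can confirm that the repository is reachable by searching the apt cache for the Habana packages (this mirrors the yum search verification step used later for Amazon Linux and CentOS):

apt-cache search habana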

3.2.1.1. KMD Dependencies

  1. Install the required Deb libraries:

sudo apt install dkms libelf-dev
  2. Install headers:

sudo apt install linux-headers-$(uname -r)
  3. After a kernel upgrade, reboot your machine.

3.2.2. Firmware Installation

Install the Firmware:

sudo apt install -y habanalabs-firmware=0.15.0-547

3.2.3. Driver Installation

Install the driver:

sudo apt install -y habanalabs-dkms=0.15.0-547

3.2.4. Thunk Installation

Install the thunk library:

sudo apt install -y habanalabs-thunk=0.15.0-547

3.2.5. FW Tools Installation

Install Firmware tools:

sudo apt install -y habanalabs-firmware-tools=0.15.0-547

3.2.6. Graph Compiler and Run-time Installation

Install the graph compiler and run-time:

sudo apt install -y habanalabs-graph=0.15.0-547

3.2.7. (Optional) Qual Installation

Install hl_qual:

sudo apt install -y habanalabs-qual=0.15.0-547

For further details, see Gaudi Qualification Library.

3.2.8. Container Runtime Installation

  1. Install container runtime:

sudo apt install -y habanalabs-container-runtime=0.15.0-547
  2. Register habana runtime by adding the following to /etc/docker/daemon.json:

{
        "runtimes": {
            "habana": {
                "path": "/usr/bin/habana-container-runtime",
                "runtimeArgs": []
            }
        }
}
  3. Restart Docker:

sudo systemctl restart docker
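
To confirm that the habana runtime was registered, you can inspect the Docker daemon information after the restart; the habana entry should appear among the listed runtimes:

docker info | grep -i runtimes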

3.2.9. Update Environment Variables and More

When the installation is complete, close the shell and re-open it, or run the following:

source /etc/profile.d/habanalabs.sh

source ~/.bashrc

3.3. Centos and Amazon - Package Installation

Installing the package with an internet connection available allows the installer to download and install the required dependencies for the SynapseAI package (via yum install, pip install, etc.).

3.3.1. Amazon Package Retrieval

  1. Create /etc/yum.repos.d/Habana-Vault.repo with the following content:

[vault]
name=Habana Vault
baseurl=https://vault.habana.ai/artifactory/AmazonLinux2
enabled=1
gpgcheck=0
gpgkey=https://vault.habana.ai/artifactory/AmazonLinux2/repodata/repomod.xml.key
repo_gpgcheck=0
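
For example, the file can be created in one step with a heredoc containing the listing above:

sudo tee /etc/yum.repos.d/Habana-Vault.repo > /dev/null <<'EOF'
[vault]
name=Habana Vault
baseurl=https://vault.habana.ai/artifactory/AmazonLinux2
enabled=1
gpgcheck=0
gpgkey=https://vault.habana.ai/artifactory/AmazonLinux2/repodata/repomod.xml.key
repo_gpgcheck=0
EOF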
  2. Update the YUM cache by running the following command:

sudo yum makecache
  3. Verify correct binding by running the following command:

yum search habana

This will search for and list all packages with the word Habana.

3.3.2. Centos Package Retrieval

  1. Create /etc/yum.repos.d/Habana-Vault.repo with the following content:

[vault]
name=Habana Vault
baseurl=https://vault.habana.ai/artifactory/centos7
enabled=1
gpgcheck=0
gpgkey=https://vault.habana.ai/artifactory/centos7/repodata/repomod.xml.key
repo_gpgcheck=0
  2. Update the YUM cache:

sudo yum makecache
  3. Verify correct binding:

yum search habana

This will search for and list all packages with the word Habana.

3.3.2.1. KMD Dependencies

  1. Check your Linux kernel version:

uname -r
  2. Install headers:

sudo yum install kernel-devel
  3. After a kernel upgrade, reboot your machine.

3.3.2.2. Additional Dependencies

Add yum-utils:

sudo yum install -y yum-utils

3.3.3. Firmware Installation

Install the Firmware:

sudo yum install habanalabs-firmware-0.15.0-547* -y

3.3.4. Driver Installation

  1. Remove the previous driver package:

sudo yum remove habanalabs*
  2. Install the driver:

sudo yum install habanalabs-0.15.0-547* -y

3.3.5. Thunk Installation

Install the thunk library:

sudo yum install habanalabs-thunk-0.15.0-547* -y

3.3.6. FW Tools Installation

Install Firmware tools:

sudo yum install habanalabs-firmware-tools-0.15.0-547* -y

3.3.7. Graph Compiler and Run-time Installation

Install the graph compiler and run-time:

sudo yum install habanalabs-graph-0.15.0-547* -y

3.3.8. (Optional) Qual Installation

Install hl_qual:

sudo yum install habanalabs-qual-0.15.0-547* -y

For further details, see Qualification Library.

3.3.9. Container Runtime Installation

  1. Install container runtime:

sudo yum install habanalabs-container-runtime-0.15.0-547* -y
  2. Register habana runtime by adding the following to /etc/docker/daemon.json:

{
        "runtimes": {
            "habana": {
                "path": "/usr/bin/habana-container-runtime",
                "runtimeArgs": []
            }
        }
}
  3. Restart Docker:

sudo systemctl restart docker

3.3.10. Update Environment Variables and More

When the installation is complete, close the shell and re-open it, or run the following:

source /etc/profile.d/habanalabs.sh

source ~/.bashrc

3.4. TensorFlow Installation

This section describes how to obtain and install the TensorFlow software package. Follow these instructions if you want to install the TensorFlow packages on a Bare Metal platform without a Docker image. The package consists of two main components:

  • Base habana-tensorflow Python package - Libraries and modules needed to execute TensorFlow on a single Gaudi device.

  • Scale-out habana-horovod Python package - Libraries and modules needed to execute TensorFlow on an HLS machine.

You can install TensorFlow on the Habana Gaudi device by:

  1. Using a pre-installed Docker image containing all the necessary dependencies, or

  2. Installing all the required components from scratch.

3.4.1. Installation with Docker

The TensorFlow Docker image contains both single-node and scale-out binaries and does not require additional installation steps. Installation with Docker is the recommended installation method.

Before setting up the Docker environment, make sure the Firmware and Driver are set up on the host machine as previously outlined in this document. The following lists the prerequisites needed to install with Docker:

  • Docker must be installed on the target machine.

  • Minimum Docker CE version required is 18.09.0.

For Docker install and setup details, refer to the Setup and Install GitHub page.
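
As a quick check of these prerequisites, you can query the installed Docker version on the target machine before proceeding:

docker --version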

3.4.2. Setting Up the Environment from Scratch

To set up the environment, the SynapseAI software package must be installed first. Manually install the components listed in Package Content before installing the TensorFlow package.

The TensorFlow software package consists of two Python packages. Installing both packages guarantees the same functionality delivered with the TensorFlow Docker image:

  • habana-tensorflow

  • habana-horovod

To execute TensorFlow on a single Gaudi device, install the habana-tensorflow package. To execute TensorFlow on an HLS machine, install the habana-horovod package.

3.4.2.1. Base Installation (Single Node)

The habana-tensorflow package contains all the binaries and scripts to run topologies on a single node.

1. Before installing habana-tensorflow, install one of the versions listed in the TensorFlow section. If no TensorFlow package is already installed, PIP will automatically fetch the latest supported version.

# Either
python3 -m pip install tensorflow-cpu==2.5.0
# Or
python3 -m pip install tensorflow-cpu==2.4.1

2. habana-tensorflow is available in the Habana Vault. To allow PIP to search for the habana-tensorflow package, configure PIP:

python3 -m pip config --user set global.index https://vault.habana.ai/artifactory/api/pypi/gaudi-python
python3 -m pip config --user set global.index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
python3 -m pip config --user set global.trusted-host vault.habana.ai

Note

URLs to Habana Vault require credentials.

3. Once PIP is configured, run:

python3 -m pip install habana-tensorflow
4. Run the command below to make sure the habana-tensorflow package is properly installed:

python3 -c "import habana_frameworks.tensorflow as htf; print(htf.__version__)"

If everything is set up properly, the above command will print the currently installed package version.

Note

The habana-tensorflow wheel contains libraries for all supported TensorFlow versions. Three packages are delivered, one for each of the following Python versions: 3.6, 3.7, and 3.8 (same as TensorFlow). The wheel is delivered under the linux tag, but the package is compatible with the manylinux2010 tag (same as TensorFlow); a custom internal library structure currently prevents tagging it as manylinux2010. Re-tagging as manylinux2010 will be supported in a subsequent release.

3.4.2.2. Scale-out Installation

Install the habana-horovod package to get multi-node support. The following lists the prerequisites for installing this package:

  • OpenMPI 4.0.5.

  • Stock horovod package must not be installed.

1. Set up OpenMPI 4.0.5 as shown below:

wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.5.tar.gz
gunzip -c openmpi-4.0.5.tar.gz | tar xf -
cd openmpi-4.0.5/ && ./configure --prefix=/usr/local/share/openmpi
make -j 8 && make install && touch ~root/openmpi-4.0.5_installed
cp LICENSE /usr/local/share/openmpi/

# Necessary env flags to install habana-horovod module
export MPI_ROOT=/usr/local/share/openmpi
export LD_LIBRARY_PATH=$MPI_ROOT/lib:$LD_LIBRARY_PATH
export OPAL_PREFIX=$MPI_ROOT
export PATH=$MPI_ROOT/bin:$PATH
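
With the environment variables above exported, you can confirm that the newly built OpenMPI is the one resolved from PATH:

which mpirun
mpirun --version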

2. habana-horovod is also stored in the Habana Vault. To allow PIP to search for the habana-horovod package, configure PIP:

python3 -m pip config --user set global.index https://vault.habana.ai/artifactory/api/pypi/gaudi-python
python3 -m pip config --user set global.index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
python3 -m pip config --user set global.trusted-host vault.habana.ai

Note

URLs to Habana Vault require credentials.

3. Install habana-horovod:

python3 -m pip install habana-horovod
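
As an optional sanity check (assuming habana-horovod exposes the standard horovod module layout), you can verify that Horovod's TensorFlow bindings import cleanly:

python3 -c "import horovod.tensorflow; import horovod; print(horovod.__version__)"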

See also

To learn more about the TensorFlow distributed training on Gaudi, see Distributed Training with TensorFlow.

3.4.3. Test for a Successful Installation

1. To test that the installation ran successfully, use the following command:

lsmod | grep habana

After running the command, a driver called habanalabs should be displayed.

habanalabs 1204224 2
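
If habanalabs-firmware-tools is installed, you can also run hl-smi to confirm that the Gaudi devices are enumerated (the exact output format may vary between releases):

hl-smi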

2. To ensure best performance of HLS-1 machines, check CPU mode on the host:

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
3. If you see powersave, execute the following:

echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
4. Then, verify the CPU mode again:

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor