1. TPC Tools Installation Guide

1.1. Introduction

Habana Tools is a .deb/.rpm package containing the tools required to build Tensor Processor Core™ (TPC™) kernel plugins for the Graph Compiler. The package includes a TPC-C compiler, an assembler, a dis-assembler, and all necessary headers. It can be installed on any development machine and does not require Habana hardware.

For more details on the TPC SDK, refer to the following:

1.1.1. Package Prerequisites

The package depends on the following libraries:

  • gcc

  • gcc-c++

  • cmake3 > 3.5

  • boost-devel
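The prerequisites above can be checked before installing. The following is a minimal sketch, not part of the package: the helper name version_gt is hypothetical, the tool names (gcc, g++, cmake) are assumed to be on the PATH under those names, and the boost development package is not checked because its query command depends on the distribution.

```shell
#!/bin/sh
# Sketch of a prerequisite check; tool names are assumptions for a typical setup.

# version_gt A B: succeeds when version A is strictly greater than version B.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Report whether each build tool is on the PATH.
for tool in gcc g++ cmake; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# The guide requires a cmake version greater than 3.5.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_gt "$cmake_version" 3.5; then
        echo "cmake $cmake_version satisfies > 3.5"
    else
        echo "cmake $cmake_version is too old (need > 3.5)"
    fi
fi
```

The comparison relies on sort -V (GNU coreutils), which orders 3.10 after 3.5, unlike a plain string comparison.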


Review the Habana TPC Tools Debugger document for details about the IDE plugin for TPC code development and debugging.

1.2. Installation

1.2.1. RPM Installation for CentOS 7.5 (Desktop Configuration)

Launching the Eclipse GUI IDE requires a CentOS 7.5 Desktop configuration.

Type the following on the CentOS bash terminal:

sudo yum install epel-release
sudo yum install --enablerepo=epel cmake3
sudo ln -s /usr/bin/cmake3 /usr/bin/cmake
sudo yum install ./habanatools-<version>.x86_64.rpm

1.2.2. Deb Installation for Ubuntu 18.04 (Desktop Configuration)

Type the following on the Ubuntu bash terminal:

sudo dpkg -i ./habanatools_<version>_amd64.deb
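After either installation, a quick check can confirm that the tools are in place. This is a sketch only: check_tool is a hypothetical helper, and the two paths are the ones this guide uses when invoking the compiler and the dis-assembler.

```shell
#!/bin/sh
# check_tool PATH: prints whether PATH exists and is executable.
check_tool() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Paths taken from the command lines shown in this guide.
check_tool /usr/bin/tpc-clang
check_tool /usr/bin/llvm-objdump
```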

1.3. Habana Tools Content

Once installed, the following files are added to your machine:

Table 1: Content of Package

  • TPC-C compiler and assembler

  • TPC dis-assembler

  • TPC simulator

  • Test core library

  • Simulator headers

  • Glue code interface header

  • Available TPC-C intrinsics

  • Test core API

1.4. TPC-C Compiler Command Line Arguments

The TPC compiler is LLVM based and accepts standard LLVM command line arguments along with the following additions:

  • Optimization levels: -O0, -O1 and -O2 are supported. -O2 is the highest level and the default. -O1 turns off HW loops and a few other optimizations. -O0 turns off instruction scheduling and bundling, and pads every instruction with 6 NOPs to ensure the results queue is committed to the register file before the next instruction executes.

  • -march=<name> - Architecture switch. The currently supported name is “dali” (AKA Goya).

  • -max-tensors <n> - Tensor limit. n is a number in range 0..8. Default is 8.

  • -vlm <n> - Vector local memory limit. n is the size of vector local memory in KB. Default is 80.

  • -main-function <main_entry_name> - Name of entry function. Default is “main”.

  • -all-loops-taken - Enables global elimination of loop end padding on the assumption that loops are always taken. This can improve performance when the developer guarantees that every loop in the program is executed at least once.

  • -reg-mem-count - Prints to console usage of registers and local memory at compile time.

  • -disable-lut-warn - Suppresses the performance warning issued when the LUT size in use exceeds the LUT cache.

  • -x c++ - Enables static c++ in TPC-C.

  • -o <file name> - Sets name of output file. (Standard LLVM argument).
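To illustrate how several of the options above fit together on one command line, here is a minimal sketch. The helper tpc_compile_cmd and the file names are hypothetical, and combining these particular flags in one invocation is an assumption. The helper only prints the command, so it can be inspected without the compiler installed.

```shell
#!/bin/sh
# tpc_compile_cmd SRC OBJ N: prints (does not run) a tpc-clang command line.
# N is the -max-tensors value, which the guide limits to the range 0..8.
tpc_compile_cmd() {
    src=$1
    obj=$2
    tensors=$3
    case $tensors in
        [0-8]) ;;
        *)
            echo "error: -max-tensors must be in 0..8, got '$tensors'" >&2
            return 1
            ;;
    esac
    echo "/usr/bin/tpc-clang $src -c -O1 -march=dali -max-tensors $tensors -reg-mem-count -o $obj"
}

# Hypothetical file names, for illustration only.
tpc_compile_cmd my_kernel.c my_kernel.o 4
```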

1.4.1. Compiler Usage Example

The compiler supports a single translation unit, hence the -c argument must be specified.

/usr/bin/tpc-clang reduction.c -c -x c++ -o reduction.o

The output of the compilation session is an ELF file named reduction.o. To extract raw binary from the ELF file, use the following command:

objcopy -O binary --only-section=.text reduction.o reduction.bin
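The compile and extract steps can be chained in a small script. This is a sketch under stated assumptions: bin_name is a hypothetical helper, reduction.c is assumed to be in the current directory, and the script only reports what it would produce when the compiler or the source file is absent.

```shell
#!/bin/sh
# bin_name OBJ: derives the raw-binary file name from an object file name.
bin_name() {
    echo "${1%.o}.bin"
}

obj=reduction.o
bin=$(bin_name "$obj")

# Run the two steps from the guide only when the compiler and source exist.
if [ -x /usr/bin/tpc-clang ] && [ -f reduction.c ]; then
    if /usr/bin/tpc-clang reduction.c -c -x c++ -o "$obj" &&
        objcopy -O binary --only-section=.text "$obj" "$bin"; then
        echo "wrote $bin ($(wc -c < "$bin") bytes)"
    else
        echo "build failed"
    fi
else
    echo "tpc-clang or reduction.c not available; would produce $bin"
fi
```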

1.5. Assembler Command Line Arguments

The assembler and compiler are merged into a single binary. The binary uses the file suffix to decide whether to apply the C-language front-end or the TPC assembler front-end: the .tpcasm suffix invokes the assembler.

/usr/bin/tpc-clang <input text file name>.tpcasm -c -o <output object file name>.o
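The suffix-based selection described above can be pictured with a short sketch. It only mimics the documented rule and is not how tpc-clang is implemented: frontend_for is a hypothetical helper, and treating every suffix other than .tpcasm as C-language input is an assumption.

```shell
#!/bin/sh
# frontend_for FILE: names the front-end the documented suffix rule would pick.
frontend_for() {
    case $1 in
        *.tpcasm) echo "assembler" ;;
        *)        echo "C front-end" ;;  # assumption: all other suffixes
    esac
}

for f in reduction.c reduction.tpcasm; do
    echo "$f -> $(frontend_for "$f")"
done
# prints:
#   reduction.c -> C front-end
#   reduction.tpcasm -> assembler
```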

1.6. Dis-assembler Command Line Argument

/usr/bin/llvm-objdump --triple tpc -d -j .text -no-show-raw-insn -no-leading-addr -mcpu=<goya|gaudi> <input object file to dis-assemble>.o

The dis-assembly is printed to standard output.