Supported Data Types

The language supports the scalar and vector data types described in the sections below.

Built-in Scalar

The table below lists the supported scalar data types.




_Bool (bool in C++)

A conditional data type that is either true or false. The value true expands to the integer constant 1 and the value false expands to the integer constant 0.

> Gen2


char

A signed two’s complement 8-bit integer.

> Gen2

unsigned char

An unsigned 8-bit integer.

> Gen2


short

A signed two’s complement 16-bit integer.

> Gen2

unsigned short

An unsigned 16-bit integer.

> Gen2


int

A signed two’s complement 32-bit integer.

> Gen2

unsigned int

An unsigned 32-bit integer.

> Gen2


float

A 32-bit floating-point number. The float data type must conform to the IEEE 754 single-precision storage format.

> Gen2

_Bfloat16, bfloat, bf16

A 16-bit floating-point number: 1 sign bit, 8 exponent bits, and 7 mantissa bits. The sign, exponent, and mantissa are interpreted analogously to IEEE 754 floating-point numbers.

> Gen2

int x = 5; // allowed

float a = 1.0f; // allowed

_Bool b0, b1 = true; // allowed

b0 = x; // allowed

b1 = x > 0 ? a : a - 1.0f; // allowed

TPC floating-point operations comply with IEEE 754-2008, with the following exceptions:

  • Any subnormal number is treated as zero.

  • Output NaN value is always 0x7FFFFFFF regardless of the input NaN value.
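The two exceptions above can be emulated on a host in plain C, which may help when comparing reference results against TPC output. This is a minimal sketch, not TPC code: the helper names (f32_bits, bits_f32, tpc_like_fixup) are hypothetical, and preserving the sign of a flushed subnormal is an assumption that the text above does not confirm.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern (no numeric conversion). */
uint32_t f32_bits(float f)
{
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit pattern as a float (no numeric conversion). */
float bits_f32(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical helper: apply the two documented deviations to one value. */
float tpc_like_fixup(float f)
{
    uint32_t u = f32_bits(f);
    uint32_t exponent = (u >> 23) & 0xFFu;
    uint32_t mantissa = u & 0x007FFFFFu;

    if (exponent == 0u && mantissa != 0u) {
        /* Subnormal: treated as zero. Keeping the sign bit is an assumption. */
        return bits_f32(u & 0x80000000u);
    }
    if (exponent == 0xFFu && mantissa != 0u) {
        /* Any NaN: the output pattern is always 0x7FFFFFFF. */
        return bits_f32(0x7FFFFFFFu);
    }
    return f; /* Normal numbers, zeros, and infinities pass through. */
}
```

Applying tpc_like_fixup to the result of each host-side reference operation gives values that can be compared bit for bit with the two rules listed above.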


The _Bfloat16 data type, as defined for the Gaudi architecture, represents 16-bit floating-point numbers with 1 sign bit, 8 exponent bits, and 7 mantissa bits. The sign, exponent, and mantissa are interpreted analogously to IEEE 754 floating-point numbers, and the exponent bias is 127. The _Bfloat16 data type must represent finite and normal numbers, denormalized numbers, infinities, and NaN. Conversions from _Bfloat16 to float are lossless; all _Bfloat16 numbers are exactly representable as float values. By default, conversions from float to _Bfloat16 either correctly round the mantissa to 8 bits of precision (the 7 stored bits plus the implicit leading bit) or reduce the float value to 8 bits, using the corresponding conversion intrinsics. Rounding or reduction can be controlled by setting the required rounding mode, either before running the program or through the appropriate intrinsic in the program. See details in Main ABI.
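The bit layout described above means a _Bfloat16 value is the upper 16 bits of the corresponding float pattern. The following host-side C sketch illustrates the lossless widening and two narrowing strategies: truncation and round-to-nearest with ties to even. It is an illustration under stated assumptions, not the TPC conversion intrinsics: the function names are hypothetical, and treating "reduce" as truncation of the low 16 bits is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* _Bfloat16 -> float: the 16 bits become the upper half of the float
   pattern, so every _Bfloat16 value is represented exactly. */
float bf16_to_f32(uint16_t h)
{
    uint32_t u = (uint32_t)h << 16;
    float f;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&f, &u, sizeof f);
    return f;
}

/* float -> _Bfloat16 by dropping the low 16 bits (assumed "reduce" mode). */
uint16_t f32_to_bf16_truncate(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return (uint16_t)(u >> 16);
}

/* float -> _Bfloat16 with round-to-nearest, ties-to-even. */
uint16_t f32_to_bf16_rne(float f)
{
    uint32_t u;
    uint32_t lsb;
    memcpy(&u, &f, sizeof u);
    if ((u & 0x7F800000u) == 0x7F800000u && (u & 0x007FFFFFu) != 0u) {
        /* NaN: force a mantissa bit so the result stays a NaN. */
        return (uint16_t)((u >> 16) | 0x0040u);
    }
    lsb = (u >> 16) & 1u;   /* lowest bit that survives the narrowing */
    u += 0x7FFFu + lsb;     /* rounds up past the midpoint, ties go to even */
    return (uint16_t)(u >> 16);
}
```

For example, 1.01171875f has the pattern 0x3F818000, which lies exactly halfway between two _Bfloat16 values: truncation yields 0x3F81, while round-to-nearest-even yields 0x3F82.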

_Bfloat16 variables can be used in all arithmetic and logical expressions, like any other floating-point data type. When an expression mixes different data types, the usual arithmetic conversions apply, taking the above comments into consideration. See details in Conversions and Type Casting and the examples below:

Built-in Vector

The TPC platform supports only fixed-size, 256-byte vector types. It supports the _Bool, char, unsigned char, short, unsigned short, int, unsigned int, float, and _Bfloat16 vector data types. A vector data type is named by the type name (char, uchar, short, ushort, int, uint, float, or bfloat) followed by a literal value n that defines the number of elements in the vector. The table below lists the built-in vector data types.





bool256

A type used to represent 256-bit values. This type supports only logical operations.

> Gen2


char256

256-element vector; each element is a 1-byte signed integer value.

> Gen2


uchar256

256-element vector; each element is a 1-byte unsigned integer value.

> Gen2


short128

128-element vector; each element is a 2-byte signed integer value.

> Gen2


ushort128

128-element vector; each element is a 2-byte unsigned integer value.

> Gen2


int64

64-element vector; each element is a 4-byte signed integer value.

> Gen2


uint64

64-element vector; each element is a 4-byte unsigned integer value.

> Gen2


float64

64-element vector; each element is a 4-byte single-precision floating-point value.

> Gen2


bfloat128

128-element vector; each element is a 2-byte _Bfloat16 value.

> Gen2

bfloat128 v0,v1,v2,v3; // allowed

int64 x0 = v_i32_ld_tnsr_i(coords, src); // allowed

char64 a; // not allowed, no such type
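The element counts in the vector type names follow from the fixed 256-byte vector size divided by the element size, which is why a name such as char64 does not exist. A small host-side C sketch of that arithmetic follows; the constant and function names are hypothetical and are not part of TPC-C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TPC_VECTOR_BYTES = 256 }; /* fixed vector size stated in this section */

/* Number of elements of the given byte size that fill one vector. */
size_t lanes_per_vector(size_t element_bytes)
{
    assert(element_bytes != 0 && TPC_VECTOR_BYTES % element_bytes == 0);
    return TPC_VECTOR_BYTES / element_bytes;
}
```

With 1-byte, 2-byte, and 4-byte elements this gives 256, 128, and 64 lanes, matching the suffixes of char256, short128, and int64.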

Built-in Aggregate

The TPC-C programming language predefines a set of built-in aggregate data types, which are defined as structures. These data types mostly represent a pair of vectors of the same size and are defined as follows:


Aggregate data types mainly describe the output of intrinsics: tuples of two or four vectors that may combine multiple data types. The table below lists all the available aggregate data types:




float64_float64_pair_t, float64_pair_t

float64 v1, float64 v2

> Gen2

float128 (Gaudi only)

float64 v1, float64 v2

> Gen2


float64 v1, int64 v2

> Gen2


float64 v1, uint64 v2

> Gen2

int128, int64_pair_t, int64_int64_pair_t

int64 v1, int64 v2

> Gen2


int64 v1, uint64 v2

> Gen2


uint64_float64_pair_t

uint64 v1, float64 v2

> Gen2


uint64 v1, int64 v2

> Gen2

uint128, uint64_pair_t, uint64_uint64_pair_t

uint64 v1, uint64 v2

> Gen2

bfloat128_bfloat128_pair_t, bfloat128_pair_t (Gaudi only)

bfloat128 v1, bfloat128 v2

> Gen2

bfloat128_short128_pair_t (Gaudi only)

bfloat128 v1, short128 v2

> Gen2

bfloat128_ushort128_pair_t (Gaudi only)

bfloat128 v1, ushort128 v2

> Gen2

short128_bfloat128_pair_t (Gaudi only)

short128 v1, bfloat128 v2

> Gen2

ushort128_bfloat128_pair_t (Gaudi only)

ushort128 v1, bfloat128 v2

> Gen2


ushort128 v1, ushort128 v2

> Gen2

short128_short128_pair_t, short128_pair_t

short128 v1, short128 v2

> Gen2


ushort128 v1, short128 v2

> Gen2

ushort128_ushort128_pair_t, ushort128_pair_t

ushort128 v1, ushort128 v2

> Gen2

char256_char256_pair_t, char256_pair_t

char256 v1, char256 v2

> Gen2


char256 v1, uchar256 v2

> Gen2


uchar256 v1, char256 v2

> Gen2

uchar256_uchar256_pair_t, uchar256_pair_t

uchar256 v1, uchar256 v2

> Gen2


unsigned int v1, unsigned int v2

> Gen2


unsigned short v1, unsigned short v2

> Gen2


unsigned char v1, unsigned char v2

> Gen2


int256

int64 v1, int64 v2, int64 v3, int64 v4

> Gen2


uint64 v1, uint64 v2, uint64 v3, uint64 v4

> Gen2

uint64_float64_pair_t sin_coeffs_tab =

v_f32_get_lut_entry_and_interval_start_v(x, coeffs,

char256 x0,x1;

int256 t;

t = av_i8_mac_v_v(x0,x1,t,0); // t is already declared above

Vector Literals and Vector Initialization

TPC-C does not allow vector literals, either in expressions or for initialization. A vector can be initialized only by a scalar or vector expression. According to the C standard rules for function evaluation, the value of the expression is calculated first, then converted to the basic type of the vector (see Usual Arithmetic Conversions), and the result is assigned (broadcast) to all vector components. Initializing a vector with a vector literal written as a parenthesized list of parameters is not allowed.

When initializing a bool256 vector, the vector components are set to -1 (i.e. all bits set) if the initializing _Bool value is true, and set to 0 otherwise. Any other initializing value is illegal. See the examples below:
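The evaluate, convert, and broadcast sequence described above can be emulated in host-side C. This is only a sketch of the semantics and is not TPC-C: the struct types float64_emul and bool256_emul and all function names are hypothetical stand-ins, and real TPC-C vectors are opaque and cannot be inspected lane by lane.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { F32_LANES = 64, BOOL256_BYTES = 32 };

typedef struct { float lane[F32_LANES]; } float64_emul;       /* stands in for float64 */
typedef struct { uint8_t bits[BOOL256_BYTES]; } bool256_emul; /* stands in for bool256 */

/* Evaluate once, convert to the element type, then copy to every lane. */
float64_emul broadcast_f32_from_int(int value)
{
    float64_emul v;
    float converted = (float)value; /* conversion to the vector's basic type */
    for (int i = 0; i < F32_LANES; ++i) {
        v.lane[i] = converted;
    }
    return v;
}

/* true sets all 256 bits, false clears them. */
bool256_emul broadcast_bool(bool value)
{
    bool256_emul v;
    memset(v.bits, value ? 0xFF : 0x00, sizeof v.bits);
    return v;
}

/* Emulation-only accessor with a bounds precondition. */
float lane_at(float64_emul v, int i)
{
    assert(i >= 0 && i < F32_LANES);
    return v.lane[i];
}

/* Emulation-only helper: count the bits that are set. */
int count_set_bits(bool256_emul v)
{
    int count = 0;
    for (int i = 0; i < BOOL256_BYTES; ++i) {
        for (int b = 0; b < 8; ++b) {
            count += (v.bits[i] >> b) & 1;
        }
    }
    return count;
}
```

Broadcasting the integer 3 leaves 3.0f in every lane, and broadcasting true sets all 256 bits of the emulated bool256.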

Vector Components

A vector is treated as an opaque object; its individual elements cannot be accessed. It is not possible to set or get a vector component by its index in the vector. See the examples below:

Other Built-in Data Types

Built-in Vector int5

The int5 data type defines 5-element tuples representing Cartesian coordinates pointing into a tensor. Each element of an int5 vector is a 4-byte signed integer value. int5 vectors can be initialized like ordinary C arrays, and their elements can be set or read by index. Elements of int5 vectors can be used in expressions as signed int values. All arithmetic operations are supported for whole int5 vectors. See the examples below:

Swizzle expressions can permute the elements of an int5 vector. See the examples below:
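The int5 behaviour described above (array-style initialization, indexed access, whole-vector arithmetic, and element permutation) can be modelled in host-side C as shown here. This is an emulation for illustration only: int5_emul and its helper functions are hypothetical, and the actual TPC-C swizzle syntax is not reproduced.

```c
#include <assert.h>
#include <stdint.h>

enum { INT5_DIMS = 5 };

/* Stand-in for int5: five 4-byte signed coordinates. */
typedef struct { int32_t c[INT5_DIMS]; } int5_emul;

int5_emul int5_make(int32_t a, int32_t b, int32_t c, int32_t d, int32_t e)
{
    int5_emul v = { { a, b, c, d, e } }; /* initialized like a C array */
    return v;
}

/* Read one coordinate by index. */
int32_t int5_get(int5_emul v, int i)
{
    assert(i >= 0 && i < INT5_DIMS);
    return v.c[i];
}

/* Whole-vector arithmetic: element-wise addition. */
int5_emul int5_add(int5_emul a, int5_emul b)
{
    int5_emul r;
    for (int i = 0; i < INT5_DIMS; ++i) {
        r.c[i] = a.c[i] + b.c[i];
    }
    return r;
}

/* Permutation: result element i takes source element order.c[i]. */
int5_emul int5_permute(int5_emul v, int5_emul order)
{
    int5_emul r;
    for (int i = 0; i < INT5_DIMS; ++i) {
        assert(order.c[i] >= 0 && order.c[i] < INT5_DIMS);
        r.c[i] = v.c[order.c[i]];
    }
    return r;
}
```

For instance, permuting the coordinates (1, 2, 3, 4, 5) with the order (4, 3, 2, 1, 0) reverses them.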

Structures and Unions

You can define structures and unions according to the standard C rules. However, the combination of field data types is restricted: fields cannot come from different memory spaces (see Explicit Conversion Examples). It is illegal to define a structure or union with scalar and vector fields, vector and global pointer fields, or vector and int5 fields. See more examples below:

Alignment of Types

Every object in memory is aligned. Objects of scalar types, int5, and derived types are allocated in scalar local memory and are therefore aligned on a 4-byte boundary. Built-in vector types are aligned on a 256-byte boundary. Due to this limitation, arrays of scalar short types are not allowed. The host environment determines alignment in global memory.

The TPC compiler aligns data items as required by their data type. The behavior of an unaligned load or store is undefined. The vector load and store intrinsics can read a vector from, or write a vector to, an address aligned to the element type of the vector.
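The alignment rules above reduce to simple address arithmetic. The host-side C sketch below checks and rounds addresses against the 4-byte scalar and 256-byte vector boundaries; the constant and function names are hypothetical and are not part of TPC-C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { SCALAR_ALIGN = 4, VECTOR_ALIGN = 256 }; /* boundaries stated above */

/* True if address is a multiple of alignment (a power of two). */
bool is_aligned(uintptr_t address, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

/* Round address up to the next multiple of alignment (a power of two). */
uintptr_t align_up(uintptr_t address, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address + alignment - 1) & ~(alignment - 1);
}
```

An offset of 260 is acceptable for a scalar (a multiple of 4) but not for a vector; align_up(260, VECTOR_ALIGN) gives 512, the next valid vector boundary.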


The following names are reserved for use as keywords in TPC-C and cannot be used otherwise.

  • Names reserved as keywords by C99.

  • TPC-C data types defined in Table 1, Table 2, Table 3 and Other Built-in Data Types.

  • Address space qualifiers: __global, __global__, __local and __local__.

  • The tensor keyword, which represents an opaque handle pointing to a tensor object. It can be used only to declare parameters of the main program entry point. ‘tensor’ objects are essentially integer identifiers, and no ‘tensor’ objects can be defined within the program. An example of using this type is found in Main ABI.