Supported Data Types
Supported Data Types¶
The language supports the scalar and vector data types outlined in the sections below.
Built-in Scalar¶
The table below lists the supported scalar data types.
Type | Description | Generation |
_Bool (bool in C++) | A conditional data type that is either true or false. The value true expands to the integer constant 1 and the value false expands to the integer constant 0. | > Gen2 |
char | A signed two’s complement 8-bit integer. | > Gen2 |
unsigned char | An unsigned 8-bit integer. | > Gen2 |
short | A signed two’s complement 16-bit integer. | > Gen2 |
unsigned short | An unsigned 16-bit integer. | > Gen2 |
int | A signed two’s complement 32-bit integer. | > Gen2 |
unsigned int | An unsigned 32-bit integer. | > Gen2 |
float | A 32-bit floating-point number. The float data type must conform to the IEEE 754 single-precision storage format. | > Gen2 |
_Bfloat16, bfloat, bf16 | A 16-bit floating-point number with 1 sign bit, 8 exponent bits and 7 mantissa bits. The interpretation of the sign, exponent and mantissa is analogous to IEEE 754 floating-point numbers. | > Gen2 |
int x = 5; // allowed
float a = 1.0f; // allowed
_Bool b0, b1 = true; // allowed
b0 = x; // allowed
b1 = x > 0 ? a : a - 1.0f; // allowed
TPC floating-point operations comply with IEEE 754-2008, except for the following:
Any sub-normal number is treated as zero.
Output NaN value is always 0x7FFFFFFF regardless of the input NaN value.
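The two deviations above can be illustrated on raw 32-bit patterns. The following is a minimal host-side sketch in plain C, not TPC-C and not a TPC API: the function name `tpc_model_float_bits` is hypothetical, and preserving the sign bit when a subnormal is flushed to zero is an assumption, since the text only says the value is treated as zero.

```c
#include <stdint.h>

/* Hypothetical host-side model, not a TPC API: apply the two documented
 * deviations to the raw bit pattern of a 32-bit float. */
uint32_t tpc_model_float_bits(uint32_t bits) {
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t mantissa = bits & 0x007FFFFFu;

    if (exponent == 0xFFu && mantissa != 0u) {
        return 0x7FFFFFFFu;        /* any NaN maps to the single output NaN pattern */
    }
    if (exponent == 0u && mantissa != 0u) {
        return bits & 0x80000000u; /* subnormal -> zero; keeping the sign bit is an assumption */
    }
    return bits;                   /* normal numbers, zeros and infinities are unchanged */
}
```

For example, under this model the smallest positive subnormal `0x00000001` maps to `0x00000000`, and a quiet NaN such as `0x7FC00000` maps to `0x7FFFFFFF`.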
_Bfloat16¶
The _Bfloat16 data type, as defined for the Gaudi architecture, represents 16-bit floating-point numbers with 1 sign bit, 8 exponent bits, and 7 mantissa bits. The sign, exponent and mantissa are interpreted analogously to IEEE 754 floating-point numbers, and the exponent bias is 127. The _Bfloat16 data type must represent finite normal numbers, denormalized numbers, infinities and NaN. Conversions from _Bfloat16 to float are lossless: every _Bfloat16 number is exactly representable as a float value. By default, conversions from float to _Bfloat16 either correctly round the mantissa to 8 bits of precision or reduce the float value to 8 bits. These conversions use the corresponding conversion intrinsics. The rounding or reduction behavior can be selected by setting the required rounding mode, either before running the program or with the appropriate intrinsic in the program. See Main ABI for details.
_Bfloat16 variables can be used in all arithmetic and logical expressions, like any other floating-point data type. When the operands of an expression have different data types, the usual arithmetic conversions apply, subject to the conversion rules described above. See details in Conversions and Type Casting and the examples below:
Examples:
float eps32 = 1.1920928955e-07f; // eps32 =
_Bfloat16 eps16 = 1.1920928955e-07f; // eps16 =
_Bfloat16 x = 2 * 0.5f; // allowed
_Bfloat16 y1 = x + eps32; // y1 =
_Bfloat16 y2 = x + eps16; // y2 =
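The conversions described in this section can be modeled on the host with bit operations. The sketch below is plain C, not TPC-C, and does not use the TPC conversion intrinsics. All names (`bf16_bits`, `bf16_to_float`, `float_to_bf16_truncate`, `float_to_bf16_round`) are hypothetical. It assumes that "correctly round" means round-to-nearest-even and that "reduce" means dropping the low 16 bits; the actual behavior depends on the selected rounding mode.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_bits;   /* hypothetical host-side storage for one _Bfloat16 */

/* _Bfloat16 -> float: the 16 bits become the upper half of the float.
 * No information is lost, which is why this direction is lossless. */
float bf16_to_float(bf16_bits h) {
    uint32_t u = (uint32_t)h << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* float -> _Bfloat16 by dropping the low 16 mantissa bits (assumed "reduce"). */
bf16_bits float_to_bf16_truncate(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return (bf16_bits)(u >> 16);
}

/* float -> _Bfloat16 with round-to-nearest-even (assumed "correctly round"). */
bf16_bits float_to_bf16_round(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    if ((u & 0x7F800000u) == 0x7F800000u && (u & 0x007FFFFFu) != 0u) {
        return (bf16_bits)((u >> 16) | 0x0040u);   /* keep a NaN a NaN */
    }
    uint32_t bias = 0x7FFFu + ((u >> 16) & 1u);    /* ties go to the even result */
    return (bf16_bits)((u + bias) >> 16);
}
```

In this model the two float-to-_Bfloat16 variants differ only when the discarded 16 bits are more than halfway to the next representable value, or exactly halfway with an odd kept mantissa.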
Built-in Vector¶
The TPC platform supports only fixed-size 256-byte vector types. Vectors are available for the _Bool, char, unsigned char, short, unsigned short, int, unsigned int, float and _Bfloat16 element types. A vector type name consists of the element type name (bool, char, uchar, short, ushort, int, uint, float or bfloat) followed by a literal value n that defines the number of elements in the vector. The table below lists the built-in vector data types.
Type | Description | Generation |
bool256 | Type used to represent 256-bit values. This type supports only logical operations. | > Gen2 |
char256 | 256-element vector; each element is a 1-byte signed integer value. | > Gen2 |
uchar256 | 256-element vector; each element is a 1-byte unsigned integer value. | > Gen2 |
short128 | 128-element vector; each element is a 2-byte signed integer value. | > Gen2 |
ushort128 | 128-element vector; each element is a 2-byte unsigned integer value. | > Gen2 |
int64 | 64-element vector; each element is a 4-byte signed integer value. | > Gen2 |
uint64 | 64-element vector; each element is a 4-byte unsigned integer value. | > Gen2 |
float64 | 64-element vector; each element is a 4-byte single precision value. | > Gen2 |
bfloat128 | 128-element vector; each element is a 2-byte _Bfloat16 value. | > Gen2 |
bfloat128 v0,v1,v2,v3; // allowed
int64 x0 = v_i32_ld_tnsr_i(coords, src); // allowed
char64 a; // not allowed, no such type
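The element counts in the vector type names follow from the fixed 256-byte vector size: the count is 256 divided by the element size in bytes. The host-side C sketch below illustrates that arithmetic. The helper `vector_lanes` is hypothetical and not part of TPC-C. Note that bool256 is described above as a 256-bit value, so it does not follow this byte-based rule.

```c
#include <stddef.h>

#define TPC_VECTOR_BYTES 256u   /* fixed vector size stated in the text */

/* Number of elements in a vector whose elements are elem_bytes wide.
 * Returns 0 if elem_bytes is 0 or does not divide the vector size. */
size_t vector_lanes(size_t elem_bytes) {
    if (elem_bytes == 0 || TPC_VECTOR_BYTES % elem_bytes != 0) {
        return 0;
    }
    return TPC_VECTOR_BYTES / elem_bytes;
}
```

With 1-byte elements this gives 256 (char256, uchar256), with 2-byte elements 128 (short128, ushort128, bfloat128), and with 4-byte elements 64 (int64, uint64, float64). A type such as char64 would cover only 64 bytes, which is consistent with it not being defined.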
Built-in Aggregate¶
The TPC-C programming language pre-defines a set of built-in aggregate data types, defined as structures. Most of these data types represent a pair of vectors of the same size and are named as follows:
<type_1>_<type_2>_pair_t
Aggregate data types are mainly used as the output of intrinsics that return tuples of two or four vectors, possibly combining multiple data types. The table below lists all the available aggregate data types:
Type | Fields | Generation |
float64_float64_pair_t, float64_pair_t | float64 v1, float64 v2 | > Gen2 |
float128 (Gaudi only) | float64 v1, float64 v2 | > Gen2 |
float64_int64_pair_t | float64 v1, int64 v2 | > Gen2 |
float64_uint64_pair_t | float64 v1, uint64 v2 | > Gen2 |
int128, int64_pair_t, int64_int64_pair_t | int64 v1, int64 v2 | > Gen2 |
int64_uint64_pair_t | int64 v1, uint64 v2 | > Gen2 |
uint64_float64_pair_t | uint64 v1, float64 v2 | > Gen2 |
uint64_int64_pair_t | uint64 v1, int64 v2 | > Gen2 |
uint128, uint64_pair_t, uint64_uint64_pair_t | uint64 v1, uint64 v2 | > Gen2 |
bfloat128_bfloat128_pair_t, bfloat128_pair_t (Gaudi only) | bfloat128 v1, bfloat128 v2 | > Gen2 |
bfloat128_short128_pair_t (Gaudi only) | bfloat128 v1, short128 v2 | > Gen2 |
bfloat128_ushort128_pair_t (Gaudi only) | bfloat128 v1, ushort128 v2 | > Gen2 |
short128_bfloat128_pair_t (Gaudi only) | short128 v1, bfloat128 v2 | > Gen2 |
ushort128_bfloat128_pair_t (Gaudi only) | ushort128 v1, bfloat128 v2 | > Gen2 |
short128_ushort128_pair_t | short128 v1, ushort128 v2 | > Gen2 |
short128_short128_pair_t, short128_pair_t | short128 v1, short128 v2 | > Gen2 |
ushort128_short128_pair_t | ushort128 v1, short128 v2 | > Gen2 |
ushort128_ushort128_pair_t, ushort128_pair_t | ushort128 v1, ushort128 v2 | > Gen2 |
char256_char256_pair_t, char256_pair_t | char256 v1, char256 v2 | > Gen2 |
char256_uchar256_pair_t | char256 v1, uchar256 v2 | > Gen2 |
uchar256_char256_pair_t | uchar256 v1, char256 v2 | > Gen2 |
uchar256_uchar256_pair_t, uchar256_pair_t | uchar256 v1, uchar256 v2 | > Gen2 |
uint32_t_pair_t | unsigned int v1, unsigned int v2 | > Gen2 |
uint16_t_pair_t | unsigned short v1, unsigned short v2 | > Gen2 |
uint8_t_pair_t | unsigned char v1, unsigned char v2 | > Gen2 |
int256 | int64 v1, int64 v2, int64 v3, int64 v4 | > Gen2 |
uint256 | uint64 v1, uint64 v2, uint64 v3, uint64 v4 | > Gen2 |
uint64_float64_pair_t sin_coeffs_tab =
    v_f32_get_lut_entry_and_interval_start_v(x, coeffs,
                                             e_func_variant_sin_cos);
char256 x0, x1;
int256 t;
t = av_i8_mac_v_v(x0, x1, t, 0);
Vector Literals and Vector Initialization¶
TPC-C does not allow vector literals, either in expressions or for initialization. A vector can be initialized only by a scalar or vector expression. Following the C standard rules for function evaluation, the value of the expression is calculated first, then converted to the basic type of the vector (see Usual Arithmetic Conversions), and the result is assigned (broadcast) to all vector components. Initializing a vector with a vector literal written as a parenthesized list of parameters is not allowed.
When initializing a bool256 vector, the vector components are set to -1 (i.e. all bits set) if the initializing _Bool value is true, and set to 0 otherwise. Any other initializing value is illegal. See the examples below:
Examples:
int64 int_vec = {0, 1, 2}; // not allowed
int64 a = 1; // allowed
float64 unit_vec = 1.0f; // allowed
float64 y = {1.f, 2.f}; // not allowed
short128 x = 67 + 1.f/31; // not allowed
bool256 p = 1; // allowed, all bits are 1
bool256 p = 0; // allowed, all bits are 0
bool256 p = 7; // not allowed
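The broadcast behavior of scalar initialization can be pictured with a host-side model. The sketch below is plain C, not TPC-C. The types `host_int64_vec` and `host_bool256` and the functions `broadcast_int` and `broadcast_bool` are hypothetical stand-ins, since real TPC vectors are opaque. Representing bool256 as 32 bytes is an assumption based on its description as a 256-bit value.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side models; real TPC vectors are opaque objects. */
typedef struct { int32_t lane[64]; } host_int64_vec;   /* 64 elements x 4 bytes */
typedef struct { uint8_t byte[32]; } host_bool256;     /* 256 bits (assumed layout) */

/* Models `int64 a = value;`: the scalar is copied to every element. */
host_int64_vec broadcast_int(int32_t value) {
    host_int64_vec v;
    for (int i = 0; i < 64; ++i) {
        v.lane[i] = value;
    }
    return v;
}

/* Models `bool256 p = flag;`: all bits set for true, all bits clear for false. */
host_bool256 broadcast_bool(bool flag) {
    host_bool256 p;
    memset(p.byte, flag ? 0xFF : 0x00, sizeof p.byte);
    return p;
}
```

Taking a `bool` parameter in `broadcast_bool` mirrors the rule that only true and false are legal initializers for a bool256 vector.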
Vector Components¶
A vector is treated as an opaque object whose individual elements cannot be accessed: a vector component cannot be set or read by its index in the vector. The int5 type is an exception and supports indexing. See the examples below:
Examples:
char256 x;
char x0 = x[0]; // not allowed
int5 ind;
int width = ind[1]; // allowed
Other Built-in Data Types¶
Built-in Vector int5¶
The int5 data type defines 5-element tuples that represent Cartesian coordinates pointing into a tensor. Each element of an int5 vector is a 4-byte signed integer value. An int5 vector can be initialized like an ordinary C array, and its elements can be set or read by index. Elements of int5 vectors can be used in expressions as signed int values. All arithmetic operations are supported for whole int5 vectors. See the examples below:
Examples:
int5 start = get_index_space_offset();
int5 end = start + get_index_space_size();
int start_0 = start[0]; // allowed
int5 ind = {3, 3, 3, 2, 1}; // allowed
int width = ind[1]; // allowed
Swizzle expressions can permute the elements of an int5 vector. See the examples below:
Examples:
int5 v1 = {0, 1, 2, 3, 4};
int5 v2 = v1.xyzwq; // Identical to v1
int5 v3 = v1.qwxxy; // Produces {4, 3, 0, 0, 1}
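The swizzle examples above imply that the letters x, y, z, w and q select elements 0 through 4. The host-side C sketch below emulates that selection. It is not TPC-C: `host_int5` and `swizzle_int5` are hypothetical names, and the letter-to-index mapping is inferred from the two examples rather than quoted from a specification.

```c
#include <stddef.h>
#include <string.h>

typedef struct { int v[5]; } host_int5;   /* hypothetical host-side model of int5 */

/* Applies a five-letter swizzle pattern such as "qwxxy" to src.
 * Returns 0 on success, or -1 if the pattern is not exactly five
 * letters drawn from x, y, z, w, q (dst is left untouched then). */
int swizzle_int5(const host_int5 *src, const char *pattern, host_int5 *dst) {
    static const char names[] = "xyzwq";   /* inferred mapping: x=0 ... q=4 */
    host_int5 tmp;

    if (strlen(pattern) != 5) {
        return -1;
    }
    for (size_t i = 0; i < 5; ++i) {
        const char *p = strchr(names, pattern[i]);
        if (p == NULL) {
            return -1;
        }
        tmp.v[i] = src->v[p - names];
    }
    *dst = tmp;
    return 0;
}
```

Applied to `{0, 1, 2, 3, 4}`, the pattern `"xyzwq"` returns the input unchanged and `"qwxxy"` yields `{4, 3, 0, 0, 1}`, matching the examples above.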
Structures and Unions¶
You can define structures and unions according to the standard C rules. However, the field data types cannot be combined freely: all fields must belong to the same memory space (see Explicit Conversion Examples). It is illegal to define a structure or union that mixes scalar and vector fields, vector and global pointer fields, or vector and int5 fields. See more examples below:
Examples:
struct Vec_struct { // allowed
    int64 int_vec;
    float64 float_vec;
};
struct Vec_struct { // not allowed
    int vec_length;
    float64 float_vec;
};
struct Index_struct { // allowed
    int5 start;
    int5 end;
};
struct Index_struct { // not allowed
    int5 start;
    int64 int_vec;
};
struct Index_struct { // allowed
    int5 start;
    int start_0;
};
struct Vec_point_struct { // not allowed
    __global__ float *global_vec_ptr;
    float64 float_vec;
};
Alignment of Types¶
Every object in memory is aligned. Objects of scalar types, int5 and derived types are allocated in scalar local memory and are therefore aligned on a 4-byte boundary. Built-in vector types are aligned on a 256-byte boundary. Because of the 4-byte alignment of scalar objects, arrays of scalar short types are not allowed. The host environment determines alignment in global memory.
The TPC compiler aligns each data item as required by its data type. The behavior of an unaligned load or store is undefined. The vector load and store intrinsics can read or write a vector at an address that is aligned to the element type of the vector.
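The alignment rules above reduce to simple divisibility checks on addresses. The host-side C sketch below shows them. It is not TPC-C, and the helpers `is_aligned` and `vector_access_ok` as well as the two macros are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCALAR_ALIGN 4u     /* scalar local memory alignment stated in the text */
#define VECTOR_ALIGN 256u   /* alignment of built-in vector objects stated in the text */

/* True if address is a multiple of alignment (alignment must be non-zero). */
bool is_aligned(uintptr_t address, size_t alignment) {
    return alignment != 0 && address % alignment == 0;
}

/* True if a vector load or store intrinsic could use this address, following
 * the rule that the address must be aligned to the vector's element type. */
bool vector_access_ok(uintptr_t address, size_t elem_bytes) {
    return is_aligned(address, elem_bytes);
}
```

For instance, address 260 is not a valid location for a vector object because it is not a multiple of 256, but under the stated rule it would still be acceptable for a load of 4-byte elements, since 260 is a multiple of 4.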
Keywords¶
The following names are reserved for use as keywords in TPC-C and cannot be used otherwise.
Names reserved as keywords by C99.
TPC-C data types defined in Table 1, Table 2, Table 3 and Other Built-in Data Types.
Address space qualifiers: __global, __global__, __local and __local__.
The tensor keyword represents an opaque handle pointing to a tensor object. It can only be used to declare parameters of the main program entry point. tensor objects are essentially integer identifiers, and no tensor objects can be defined within the program. An example of using this type is found in Main ABI.