C APIs

All APIs, structures and enums used are defined in the public header file. APIs that provide information about a specific device require a handle as a parameter. This handle can be retrieved using the device’s index or PCI address.

All APIs return a value of the following enum type:

typedef enum hlml_return {

    HLML_SUCCESS = 0,

    HLML_ERROR_UNINITIALIZED = 1,

    HLML_ERROR_INVALID_ARGUMENT = 2,

    HLML_ERROR_NOT_SUPPORTED = 3,

    HLML_ERROR_ALREADY_INITIALIZED = 5,

    HLML_ERROR_NOT_FOUND = 6,

    HLML_ERROR_INSUFFICIENT_SIZE = 7,

    HLML_ERROR_DRIVER_NOT_LOADED = 9,

    HLML_ERROR_TIMEOUT = 10,

    HLML_ERROR_AIP_IS_LOST = 15,

    HLML_ERROR_MEMORY = 20,

    HLML_ERROR_NO_DATA = 21,

    HLML_ERROR_UNKNOWN = 49,

} hlml_return_t;
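Since every API returns an hlml_return_t, callers usually compare the result with HLML_SUCCESS and log a readable name on failure. The following is a minimal sketch of that pattern, not part of the library: hlml_return_name and hlml_check are hypothetical helpers, and the enum is repeated from above only so the snippet compiles on its own. Real code would include the public header instead.

```c
#include <stdio.h>

/* Repeated from the enum above so this sketch is self-contained;
 * real code would include the public header instead. */
typedef enum hlml_return {
    HLML_SUCCESS = 0,
    HLML_ERROR_UNINITIALIZED = 1,
    HLML_ERROR_INVALID_ARGUMENT = 2,
    HLML_ERROR_NOT_SUPPORTED = 3,
    HLML_ERROR_ALREADY_INITIALIZED = 5,
    HLML_ERROR_NOT_FOUND = 6,
    HLML_ERROR_INSUFFICIENT_SIZE = 7,
    HLML_ERROR_DRIVER_NOT_LOADED = 9,
    HLML_ERROR_TIMEOUT = 10,
    HLML_ERROR_AIP_IS_LOST = 15,
    HLML_ERROR_MEMORY = 20,
    HLML_ERROR_NO_DATA = 21,
    HLML_ERROR_UNKNOWN = 49,
} hlml_return_t;

/* Hypothetical helper (not part of HLML): map a return code to its name. */
static const char *hlml_return_name(hlml_return_t rc)
{
    switch (rc) {
    case HLML_SUCCESS:                   return "HLML_SUCCESS";
    case HLML_ERROR_UNINITIALIZED:       return "HLML_ERROR_UNINITIALIZED";
    case HLML_ERROR_INVALID_ARGUMENT:    return "HLML_ERROR_INVALID_ARGUMENT";
    case HLML_ERROR_NOT_SUPPORTED:       return "HLML_ERROR_NOT_SUPPORTED";
    case HLML_ERROR_ALREADY_INITIALIZED: return "HLML_ERROR_ALREADY_INITIALIZED";
    case HLML_ERROR_NOT_FOUND:           return "HLML_ERROR_NOT_FOUND";
    case HLML_ERROR_INSUFFICIENT_SIZE:   return "HLML_ERROR_INSUFFICIENT_SIZE";
    case HLML_ERROR_DRIVER_NOT_LOADED:   return "HLML_ERROR_DRIVER_NOT_LOADED";
    case HLML_ERROR_TIMEOUT:             return "HLML_ERROR_TIMEOUT";
    case HLML_ERROR_AIP_IS_LOST:         return "HLML_ERROR_AIP_IS_LOST";
    case HLML_ERROR_MEMORY:              return "HLML_ERROR_MEMORY";
    case HLML_ERROR_NO_DATA:             return "HLML_ERROR_NO_DATA";
    case HLML_ERROR_UNKNOWN:             return "HLML_ERROR_UNKNOWN";
    }
    /* Values not listed in the enum (for example 4 or 8) end up here. */
    return "unrecognized hlml_return_t value";
}

/* Hypothetical helper: returns 1 on success; otherwise reports the
 * failure on stderr and returns 0. */
static int hlml_check(const char *what, hlml_return_t rc)
{
    if (rc == HLML_SUCCESS)
        return 1;
    fprintf(stderr, "%s failed: %s (%d)\n", what, hlml_return_name(rc), (int)rc);
    return 0;
}
```

A caller would wrap each API call's result, for example `if (!hlml_check("init", rc)) return 1;`, where `rc` holds the value returned by the call.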

typedef struct hlml_pci_cap

Description:

Capabilities of the PCI device.

Members:

Member

Description

link_speed

Current PCI link speed, encoded as an enumeration of GT/s (gigatransfers per second) rates, where:

  • 1 - 2.5 GT/s

  • 2 - 5 GT/s

  • 3 - 8 GT/s

  • 4 - 16 GT/s

  • 5 - 32 GT/s

link_width

Current PCI link width, in number of lanes: x1, x2, x4, x8, x12, x16 or x32.

link_max_speed

Maximum available PCI link speed, using the same GT/s enumeration as link_speed.

link_max_width

Maximum available PCI link width, in number of lanes.
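The speed members hold an enumeration rather than a rate, so a consumer has to decode them before printing or comparing against a threshold. The sketch below shows one way to do that and to detect a link that trained below its maximum. Both helpers are hypothetical and take plain unsigned values, because the members' C types are not stated here.

```c
/* Hypothetical helper (not part of HLML): convert the link_speed or
 * link_max_speed enumeration into tenths of GT/s, so that 2.5 GT/s can
 * be represented without floating point. Returns 0 for values outside
 * the documented range 1..5. */
static unsigned link_speed_to_tenths_gts(unsigned link_speed)
{
    static const unsigned tenths[] = { 0, 25, 50, 80, 160, 320 };

    if (link_speed >= sizeof tenths / sizeof tenths[0])
        return 0;
    return tenths[link_speed];
}

/* Hypothetical helper: returns 1 when the current link runs below the
 * maximum speed or width reported for the device, 0 otherwise. */
static int link_is_degraded(unsigned link_speed, unsigned link_width,
                            unsigned link_max_speed, unsigned link_max_width)
{
    return link_speed < link_max_speed || link_width < link_max_width;
}
```

For example, a link_speed of 4 decodes to 160 tenths, that is 16 GT/s, and a device reporting link_speed 3 with link_max_speed 4 would be flagged as degraded.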

typedef struct hlml_pci_info

Description:

Information about the PCI device.

Members:

Member

Description

bus

The bus on which the device resides, 0 to 0xff

bus_id

The tuple domain:bus:device.function

device

The device’s ID on the bus, 0 to 31

domain

The PCI domain on which the device’s bus resides

pci_device_id

The combined 16-bit device ID and 16-bit vendor ID

pci_rev

The PCI revision, low byte of class word

pci_subsys_id

The combined 16-bit subsystem ID and 16-bit subsystem vendor ID
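The bus_id tuple can be rebuilt from the individual members, which is useful when matching a device against output from other PCI tools. The sketch below assumes the conventional zero-padded hexadecimal layout for domain:bus:device.function. Whether bus_id uses exactly this padding is an assumption, and format_bus_id is a hypothetical helper, not an HLML API.

```c
#include <stdio.h>

/* Hypothetical helper (not part of HLML): format a PCI address as
 * domain:bus:device.function in zero-padded hex, for example
 * 0000:3b:00.0. Returns the snprintf result, i.e. the number of
 * characters needed, excluding the terminating NUL. */
static int format_bus_id(char *buf, size_t len, unsigned domain,
                         unsigned bus, unsigned device, unsigned function)
{
    return snprintf(buf, len, "%04x:%02x:%02x.%x",
                    domain, bus, device, function);
}
```

A 16-byte buffer is enough for a 16-bit domain; a return value greater than or equal to the buffer length means the output was truncated.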

typedef struct hlml_utilization

Description:

Utilization of the PCI device.

Members:

Member

Description

aip

OAM compute utilization. Can be more than 100%. (On Gaudi3, the granularity is 5%.)

memory

The occupied memory as reported by the Synapse framework. If the framework is not responsive, all of the OAM memory is reported as occupied, even if only part of it is in use. When the OAM is not in use, only a small part of its memory, reserved for internal OAM use, is reported as occupied.

typedef struct hlml_memory

Description:

OAM Memory Usage (in bytes). Memory usage information is retrieved using two methods:

  • Recommended method: Reading memory consumption data provided by the Synapse framework.

  • Fallback method: Using the driver if Synapse data is unavailable. Note that the driver’s information is very limited. It only reports two states:

    • Free: A small reserved amount may still be in use.

    • Total: All memory is occupied.

Members:

Member

Description

free

Free memory, calculated as (total - used). If the framework is not responsive, this is the OAM’s free memory as reported by the driver.

total

Total installed memory.

used

OAM’s used memory as reported by the framework. If the framework is not responsive, it is calculated as (total - free).
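Whichever source supplies the data, the three members are tied together by free + used = total. The sketch below checks that relationship and derives a usage percentage. The struct is a hypothetical mirror of the documented members with assumed 64-bit fields, and both functions are illustrative helpers rather than HLML APIs.

```c
#include <stdint.h>

/* Hypothetical mirror of the three documented members. The field types
 * are an assumption; the real definition lives in the public header. */
struct mem_sample {
    uint64_t free;   /* bytes */
    uint64_t total;  /* bytes */
    uint64_t used;   /* bytes */
};

/* Returns 1 when the sample satisfies free + used == total. */
static int mem_sample_is_consistent(const struct mem_sample *m)
{
    return m->used <= m->total && m->total - m->used == m->free;
}

/* Used memory as a percentage of total, truncated to a whole number.
 * Returns 0 when total is 0 to avoid dividing by zero. */
static unsigned mem_used_percent(const struct mem_sample *m)
{
    if (m->total == 0)
        return 0;
    return (unsigned)((double)m->used * 100.0 / (double)m->total);
}
```

Note that in the driver fallback case described above, the percentage can only reflect the two coarse states, so a reading near 100% does not necessarily mean the memory is fully used.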

typedef struct hlml_pcb_info

Description:

Printed Circuit Board (PCB) information of the device.

Members:

Member

Description

pcb_ver

The device’s PCB version (string).

pcb_assembly_ver

The device’s PCB assembly version (string).

typedef struct hlml_event_data

Description:

Describes an event that occurred on an OAM.

Members:

Member

Description

hlml_device_t

A specific device where the event occurred.

event_type

The specific event that occurred. For the full list of HLML_EVENT_* values, refer to the hlml.h file.

typedef struct hlml_mac_info

Description:

Information about the MAC device. Each OAM contains multiple MAC devices.

Members:

Member

Description

addr

MAC address (XX:XX:XX:XX:XX:XX).

id

MAC index in the array. The ID starts from 1.

typedef struct hlml_nic_stats_info

Description:

Information about the NIC (Network Interface Card) statistics.

Members:

Member

Description

port

Port number

str_buf

Internal use

val_buf

Reallocated list containing the values of the NIC port’s attributes.

num_of_counters_out

Number of attribute values received. The attributes are not fixed and depend on the NIC driver’s output.

typedef struct hlml_violation_time

Description:

Information about clock throttling duration.

Members:

Member

Description

reference_time

Clock throttle start timestamp (us).

violation_time

Clock throttle duration (ns).

typedef struct hlml_row_address

Description:

Address of a row in HBM. The members refer to the following terms:

  • HBM Stack: A physical unit consisting of multiple pseudo channels, formed by vertically stacking several DRAM dies.

  • Pseudo Channel: Logical divisions within the stack that provide independent access paths.

  • Bank: Segments within each pseudo channel that enable parallel access.

  • Row: The smallest unit of data organization within a bank, accessed during memory operations.

Members:

Member

Description

hbm_idx

HBM device ID

pc

HBM pseudo channel

sid

HBM stack ID

bank_idx

HBM bank ID

row_addr

HBM bank’s row address

typedef struct hlml_aip_error_counters

Description:

Error counters for the AIP device.

Members:

Member

Description

err_counters

Counters for different types of information. The offset of each counter in err_counters is given by hlml_err_counter_idx.

index

OAM index number

typedef struct hlml_process_utilization_sample

Description:

Power utilization values of an AIP device. Used as the output of hlml_device_get_process_utilization.

Members:

Member

Description

aip_util

The compute utilization power, in watts.