hl_qual Report Structure

Overview

hl_qual generates a test report composed of several sub-reports. The name of the report file includes:

  • Tested server name.

  • The string: hl_qual_report.

  • Timestamp, including date and time.

For example, k501-u18-001-dev_hl_qual_report_Sat_Dec_4_09-15-16_2021.log.
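As an illustration of the naming scheme above, the following minimal sketch splits a report file name into the server name and the timestamp. The helper name parse_report_name is hypothetical, and the timestamp layout (weekday_month_day_HH-MM-SS_year, with English day and month abbreviations) is an assumption inferred from the single example above, not a documented format.

```python
from datetime import datetime

# Fixed string that separates the server name from the timestamp.
MARKER = "_hl_qual_report_"

# Assumed timestamp layout, inferred from the example file name only.
STAMP_FORMAT = "%a_%b_%d_%H-%M-%S_%Y"


def parse_report_name(filename):
    """Split a report file name into (server_name, datetime)."""
    stem = filename[:-4] if filename.endswith(".log") else filename
    server, sep, stamp = stem.partition(MARKER)
    if not sep:
        raise ValueError(f"not an hl_qual report name: {filename!r}")
    return server, datetime.strptime(stamp, STAMP_FORMAT)


if __name__ == "__main__":
    name = "k501-u18-001-dev_hl_qual_report_Sat_Dec_4_09-15-16_2021.log"
    server, when = parse_report_name(name)
    print(server, when.isoformat())
```

Splitting on the fixed string rather than on underscores keeps the parse correct for server names that themselves contain underscores.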

Device Identification Report

This report contains the PCI bus IDs of all devices identified according to the device switch entered, for example: -gaudi. It also contains device status reports that verify whether each device is in an operational state. If hl_qual finds that a certain device is not in an operational state, the test will not be executed.

../../_images/device_indentification_report.PNG

Figure 28 Device Identification Report

HL-SMI Short Report

This report contains idle power and utilization information, and maps each PCI bus ID to its index/module_id/serial/driver_version. See the figure below:

../../_images/hl_smi_short_report.JPG

Figure 29 HL-SMI Short Report

If the test reports device utilization issues or bad device memory information, the device is removed from the test device list and a corresponding message is added to this section.

NUMA Node Report

The report contains the identified NUMA nodes, their CPU sets, and the allocation of Habana devices per NUMA node. If the tested server contains a single NUMA node, NUMA considerations do not apply to the CPU-to-device allocation.

Note

When running on a virtual machine, the NUMA node data of the bare-metal machine is usually not reflected correctly in the VM.

../../_images/numa_node_device_allocation.PNG

Figure 30 NUMA Node Report
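The device-to-NUMA-node allocation shown in this report can be cross-checked on a Linux host through the kernel's sysfs numa_node attribute of each PCI device. The sketch below is not how hl_qual gathers its data; the function names and the example PCI bus IDs are hypothetical, and the sysfs_root parameter exists only so the code can be exercised against a test directory.

```python
from pathlib import Path


def numa_node_of(bdf, sysfs_root="/sys"):
    """Return the NUMA node the kernel reports for a PCI device, or None."""
    path = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # device not present or attribute unreadable
    # The kernel reports -1 when it has no NUMA affinity for the device.
    return node if node >= 0 else None


def group_by_numa_node(bdfs, sysfs_root="/sys"):
    """Group PCI bus IDs by NUMA node; devices with no known node map to None."""
    groups = {}
    for bdf in bdfs:
        groups.setdefault(numa_node_of(bdf, sysfs_root), []).append(bdf)
    return groups


if __name__ == "__main__":
    # Hypothetical PCI bus IDs; substitute the IDs listed in the report.
    print(group_by_numa_node(["0000:19:00.0", "0000:b3:00.0"]))
```

Devices grouped under None have no NUMA node known to the kernel, so their placement cannot be checked this way.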

HL-QUAL Version and Command Line Report

Reports the hl_qual package version and the command line that was used.

../../_images/command_line_report.JPG

Figure 31 Command Line Report

Tested Device Report

The report contains the following information:

  • The specific data of the device: serial number, PCB assembly version, device name.

  • The time the test starts and stops.

  • Internal test plugin data accumulated during the test run, such as pass/fail data and general test stages.

../../_images/device_test_data.JPG

Figure 32 Device Test Report

Closing Report

The report contains the following items:

  • General statistics and metrics gathered during the test, such as power usage, clocks, and temperature.

  • Pass/fail report per tested device.

  • General pass/fail report.

../../_images/closing_report.JPG

Figure 33 Closing Report