hl_qual Report Structure
Overview¶
hl_qual generates a test report composed of sub-reports. The name of the report file includes:
Tested server name
The string: hl_qual_report
Timestamp including date and time
For example, k501-u18-001-dev_hl_qual_report_Sat_Dec_4_09-15-16_2021.log.
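As an illustration of the naming scheme above, the following minimal sketch composes a file name with the same shape as the example. The function name `report_file_name` is hypothetical, and the timestamp layout is inferred from the single example shown here; it is not taken from the hl_qual implementation.

```python
from datetime import datetime

# Fixed English abbreviations so the result does not depend on the process locale.
_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def report_file_name(server: str, when: datetime) -> str:
    """Compose a name shaped like the documented example (layout is an assumption)."""
    stamp = (
        f"{_DAYS[when.weekday()]}_{_MONTHS[when.month - 1]}_{when.day}_"
        f"{when:%H-%M-%S}_{when.year}"
    )
    return f"{server}_hl_qual_report_{stamp}.log"


if __name__ == "__main__":
    print(report_file_name("k501-u18-001-dev", datetime(2021, 12, 4, 9, 15, 16)))
```

A helper like this can be useful when searching a log directory for the report of a specific server and run time.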
hl-qual reports and log files are printed to a directory that is determined by the $HABANA_LOGS environment variable, using the $HABANA_LOGS/qual path.
If HABANA_LOGS is not defined, hl-qual will set it locally to /var/log/habana_logs and redirect the file printout to /var/log/habana_logs/qual.
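The directory lookup described above can be mirrored in a script that needs to locate the reports. This is a sketch only: `qual_log_dir` is a hypothetical helper, and treating an empty HABANA_LOGS value the same as an undefined one is an assumption, not documented hl-qual behavior.

```python
import os
from pathlib import Path

# Fallback location named in the text above.
DEFAULT_HABANA_LOGS = "/var/log/habana_logs"


def qual_log_dir(env=None) -> Path:
    """Return the directory where hl-qual reports are expected to be written."""
    env = os.environ if env is None else env
    # Assumption: an empty value is handled like an undefined variable.
    base = env.get("HABANA_LOGS") or DEFAULT_HABANA_LOGS
    return Path(base) / "qual"


if __name__ == "__main__":
    print(qual_log_dir())
```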
Device Identification Report¶
This report contains the PCI bus ID of all identified devices according to the device switch entered (for example, -gaudi).
It also contains device status reports that verify whether each device is in an operational state. If hl_qual finds that a device is not in an operational state, the test is not executed.
Figure 12 Device Identification Report¶
hl-smi Short Report¶
The hl-smi report provides an identification card for all available devices including their bus_id, serial number, device index, module ID and device type.
Figure 13 HL-SMI short report¶
Operational Status Report¶
The operational status report contains the results of the operational test conducted on all detected Habana devices within the system. A device fails the test if either of the following conditions is true:
Memory usage exceeds the idle-time memory usage threshold.
The operational indication, as set by the Habana Linux kernel driver, is either unavailable or indicates that the device is not operational.
Figure 14 Operational Status Report¶
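The two failure conditions listed in this section can be sketched as a small check. Everything named here is hypothetical: the `DeviceSample` fields, the `operational_failures` function, and the threshold value are illustrative assumptions, since this page does not give the actual threshold or the way hl_qual reads the driver indication.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeviceSample:
    """Hypothetical snapshot of one device; not an hl_qual data structure."""
    bus_id: str
    memory_used_mb: float
    # True/False as reported by the driver, None when the indication is unavailable.
    driver_operational: Optional[bool]


def operational_failures(sample: DeviceSample, idle_threshold_mb: float) -> List[str]:
    """Return the reasons a device would fail; an empty list means it passes."""
    reasons = []
    if sample.memory_used_mb > idle_threshold_mb:
        reasons.append("memory usage exceeds idle threshold")
    if sample.driver_operational is None:
        reasons.append("operational indication unavailable")
    elif not sample.driver_operational:
        reasons.append("driver reports device not operational")
    return reasons


if __name__ == "__main__":
    busy = DeviceSample("0000:19:00.0", memory_used_mb=2048.0, driver_operational=True)
    # The 512 MB threshold is an arbitrary illustrative value.
    print(operational_failures(busy, idle_threshold_mb=512.0))
```

Returning a list of reasons rather than a single boolean makes it clear which condition caused a device to be excluded from the test.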
NUMA Node Report¶
The report contains the identified NUMA nodes, the CPU sets, and the allocation of Habana devices per NUMA node. If the tested server contains a single NUMA node, NUMA considerations do not apply to the CPU-to-device allocation.
Note
When running on a virtual machine, the NUMA node data is usually not reflected correctly between the bare-metal machine and the VM.
Figure 15 NUMA Node Report¶
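To illustrate the per-node allocation shown in this report, the sketch below groups devices by NUMA node. The input mapping from bus ID to node number and the function name `group_by_numa_node` are assumptions for illustration; how hl_qual collects this data is not described here.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_numa_node(device_nodes: Dict[str, int]) -> Dict[int, List[str]]:
    """Group device bus IDs by the NUMA node each one is attached to."""
    groups = defaultdict(list)
    # Sort by bus ID so the per-node lists are stable between runs.
    for bus_id, node in sorted(device_nodes.items()):
        groups[node].append(bus_id)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical bus IDs and node numbers.
    sample = {"0000:19:00.0": 0, "0000:b3:00.0": 1, "0000:1a:00.0": 0}
    for node, devices in sorted(group_by_numa_node(sample).items()):
        print(f"NUMA node {node}: {', '.join(devices)}")
```

When every device maps to the same node, the grouping has a single entry, which corresponds to the single-node case mentioned above.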
Hl-qual Version and Command Line Report¶
Reports the hl_qual package version and specifies the command line used:
Figure 16 Command Line Report¶
Tested Device Report¶
The report contains the following information:
The specific data of the device: serial number, PCB assembly version, device name.
The time the test starts and stops.
Internal test plugin data accumulated during the test run, such as pass/fail data and general test stages.
Figure 17 Device Test Report¶