hl_qual Common Plugin Switches and Parameters

hl_qual is a command-line tool: each test variant is run from a terminal. Parameters are passed to hl_qual and to the test plugins through command-line switches and parameters.

The switches and parameters of hl_qual and the test plugins are divided into two groups:

  • Common switches and parameters - These parameters are identical for all test plugins.

  • Plugin-specific switches and parameters - These parameters are unique to each test plugin.

The switches and parameters listed below apply to all test plugins. The sections that follow describe the remaining switches and parameters for each of the test plugins outlined in hl_qual Design.

Note

After installing hl_qual, complete the following steps:

  • Run: sudo chmod 777 /opt/habanalabs/qual -R

  • Set the environment variable: __python_cmd=python3
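
The two post-install steps can be checked before a run. The following is a minimal sketch and is not part of hl_qual: the function name check_qual_env is hypothetical, and it only confirms that the hl_qual directory is writable by the current user and that __python_cmd is set to python3.

```shell
# Hypothetical helper (not shipped with hl_qual): verify the post-install steps.
# $1 is the hl_qual directory; it defaults to /opt/habanalabs/qual.
check_qual_env() {
    qual_dir="${1:-/opt/habanalabs/qual}"
    if [ ! -w "$qual_dir" ]; then
        echo "not writable: $qual_dir (run: sudo chmod 777 $qual_dir -R)" >&2
        return 1
    fi
    if [ "${__python_cmd:-}" != "python3" ]; then
        echo "__python_cmd is not python3 (run: export __python_cmd=python3)" >&2
        return 1
    fi
    return 0
}
```

For example, `export __python_cmd=python3` followed by `check_qual_env` returns success once both steps are done.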

-gaudi | -gaudi2 -c <all or PCI bus id list> -rmod <serial | parallel>  [-dis_mon] [-mon_cfg <monitor INI path>]
      [<plugin INI config path>] [-enable_serr] [-dmesg] [-disable_pipe_red] [-skip_aer_detection]

Note

All applicable switches are shown in the synopsis above; optional switches and parameters are enclosed in square brackets.

Each switch below is listed by switch type, switch name, and description.

Device identification switch

-gaudi or -gaudi2

Indicates that a first-gen Gaudi or Gaudi 2 device should be detected and used for testing.

./hl_qual -gaudi|-gaudi2 -c all -rmod parallel -t 20 -f2 -l extreme

Notes:

  • Choosing -gaudi or -gaudi2 is mandatory.

  • You can specify only one device switch in a single hl_qual command line.

  • If a test is not supported by a device, it will be clearly indicated in both the command line description and the corresponding section title.
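
The "exactly one device switch" rule can be checked in a wrapper script before hl_qual is invoked. This is a minimal sketch under that assumption; device_switch is a hypothetical helper name, not an hl_qual feature.

```shell
# Hypothetical helper: print the single device switch found in the arguments.
# Fails when none, or more than one, of -gaudi / -gaudi2 is present.
device_switch() {
    found=""
    count=0
    for arg in "$@"; do
        case "$arg" in
            -gaudi|-gaudi2)
                found="$arg"
                count=$((count + 1))
                ;;
        esac
    done
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one of -gaudi or -gaudi2, got $count" >&2
        return 1
    fi
    printf '%s\n' "$found"
}
```

A wrapper could call `device_switch "$@" >/dev/null || exit 1` before passing its arguments on to ./hl_qual.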

Disable Monitor Screen Printout Switch

-dis_mon

Stops the monitor printout to the screen. This option is useful when running hl_qual inside a script.

./hl_qual -gaudi -dis_mon -c all -rmod parallel -t 20 -f2 -l extreme

Note: Various data values, including power, clock, and temperature, are still sampled, but the results are not displayed on the screen.

Changing Monitor Configuration INI File Switch

-mon_cfg <path to monitor config file>

Enables using a different monitor configuration file instead of the default monitor.ini.

./hl_qual -gaudi -c all -mon_cfg my_mon.ini -rmod serial -t 20 -p -b

Device PCI Bus ID Identification Switch

-c <pci bus id>

Specifies the PCI bus IDs of the Gaudi devices under test. There are two applicable formats:

  • all - All operational devices on the tested server. The devices must match the type selected by the Device Identification Switch, for example:

    ./hl_qual  -gaudi -c all -rmod serial -t 20 -b -p
    
  • comma-delimited bus ID list - Only the listed devices. All bus IDs in the list must match the type selected by the Device Identification Switch, for example:

    ./hl_qual -gaudi -c 0000:07:00.0,0000:08:00.0 -rmod serial -t 20 -p -b
    

Note: Avoid using spaces between the comma and the bus ID strings in the comma delimited bus ID list.
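
To avoid stray spaces in the list, the value for -c can be assembled by a small helper. The sketch below is illustrative only: join_bus_ids is a hypothetical name, and the validation pattern (domain:bus:device.function in hexadecimal) is an assumption inferred from the example IDs above, not a format documented by hl_qual.

```shell
# Hypothetical helper: join PCI bus IDs into the comma-delimited form for -c.
# Assumed ID shape, based on the examples above: 0000:07:00.0
join_bus_ids() {
    list=""
    for id in "$@"; do
        if ! printf '%s\n' "$id" |
            grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'; then
            echo "invalid PCI bus ID: $id" >&2
            return 1
        fi
        # Append with a comma only when the list is not empty; never a space.
        list="${list:+$list,}$id"
    done
    if [ -z "$list" ]; then
        echo "no bus IDs given" >&2
        return 1
    fi
    printf '%s\n' "$list"
}
```

It could be used as `./hl_qual -gaudi -c "$(join_bus_ids 0000:07:00.0 0000:08:00.0)" -rmod serial -t 20 -p -b`.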

Test Running Mode

-rmod <running mode>

Specifies the running mode on the available Gaudi devices. There are two applicable modes:

  • parallel - The plugin under test runs on all available devices at the same time, for example:

    ./hl_qual -gaudi -c all -rmod parallel -t 20 -f2 -l extreme
    
  • serial - The plugin under test runs on one device at a time, for example:

    ./hl_qual -gaudi -c all -rmod serial -t 20 -f2 -l extreme
    

Enable Memory Error Monitoring Switch

-enable_serr

Enables the hl_qual SERR/DERR counter check, which verifies that no single-error (SERR) or double-error (DERR) ECC event occurs while the plugins run. hl_qual reads the memory error indications through the HLML library.

./hl_qual -gaudi -dis_mon -c all -rmod parallel -t 20 -f2 -l extreme -enable_serr

Disable Standard Output Pipelining Switch

-disable_pipe_red

Disables redirection of the runner's standard output to the message pipe. This matters when logs or other debug printouts are directed to the screen. The redirection is disabled automatically when console logs are enabled. For further information, refer to hl_qual Failure Debug.

./hl_qual -gaudi -dis_mon -c all -rmod parallel -t 20 -f2 -l extreme -disable_pipe_red

Enable Concatenation of dmesg Log

-dmesg

Appends the dmesg report to the hl_qual report. The dmesg report is accumulated for the duration of the test.

./hl_qual -gaudi -dis_mon -c all -rmod parallel -t 20 -f2 -l extreme -dmesg

Disable AER test runs

-skip_aer_detection

Skips the AER test. When not skipped, the AER test runs an AER readout application that reads all error-bit indications that occurred during the last test.

./hl_qual -gaudi -dis_mon -c all -rmod parallel -t 20 -f2 -l extreme -skip_aer_detection
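
When hl_qual is driven from a script, the common part of the command line can be assembled and checked in one place. The following is a sketch, not an hl_qual facility: common_switches is a hypothetical helper that accepts only the value-less common switches described above. -mon_cfg and the plugin INI path are left out because they take a value.

```shell
# Hypothetical helper: print the common part of an hl_qual command line.
# $1 device switch, $2 value for -c, $3 running mode,
# remaining arguments: optional value-less common switches.
common_switches() {
    if [ "$#" -lt 3 ]; then
        echo "usage: common_switches <device> <bus ids> <rmod> [switch...]" >&2
        return 1
    fi
    device="$1"; devices="$2"; rmod="$3"
    shift 3
    case "$device" in
        -gaudi|-gaudi2) ;;
        *) echo "bad device switch: $device" >&2; return 1 ;;
    esac
    case "$rmod" in
        serial|parallel) ;;
        *) echo "bad running mode: $rmod" >&2; return 1 ;;
    esac
    line="$device -c $devices -rmod $rmod"
    for opt in "$@"; do
        case "$opt" in
            -dis_mon|-enable_serr|-dmesg|-disable_pipe_red|-skip_aer_detection)
                line="$line $opt" ;;
            *) echo "not a common switch: $opt" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$line"
}
```

For example, `./hl_qual $(common_switches -gaudi all parallel -dis_mon -dmesg) -t 20 -f2 -l extreme` expands to the same form as the examples above; the command substitution is intentionally unquoted so that the shell splits it into separate arguments.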

Getting Help

The following command line prints out a usage help message on screen:

./hl_qual -h

The message includes specific hl_qual switches as well as the switches of all available and loaded plugins.
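
Because the help message lists the switches of all loaded plugins, a script can search it to check whether a given switch is available. This is a rough sketch: help_mentions is a hypothetical helper, and it assumes the help text prints switch names literally; the exact layout of the help message is not specified here.

```shell
# Hypothetical helper: succeed when the help text on stdin mentions switch $1.
# Assumes switch names appear literally in the help message (e.g. -dmesg).
help_mentions() {
    # -F fixed string, -w whole word, -q quiet, -e lets the pattern start with "-"
    grep -Fwq -e "$1"
}
```

It could be used as `./hl_qual -h 2>&1 | help_mentions -enable_serr && echo "switch listed"`.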

Figure 7 hl_qual and Plugin Usage Printout