Intel Gaudi ReportNCheck Tool

This document provides guidelines for installing and running the Intel® Gaudi® ReportNCheck tool on Gaudi accelerators. The tool collects system, card, and NIC information, and verifies connectivity for both scale-up and scale-out configurations.

Note

  • The ReportNCheck tool is included with, and located under, the RDMA PerfTest tool. Installing PerfTest automatically installs ReportNCheck.

  • The tool is supported on Gaudi 3 and Gaudi 2, and can be used with their integrated NICs as well as host NICs.

  • The tool can be run in a container, on a VM, or directly on a bare-metal machine.

Prerequisites

Make sure you have the following installed:

  • pandas Python package

  • Intel® Gaudi® RDMA PerfTest tool. Refer to the RDMA PerfTest tool section.
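
The Python prerequisite above can be verified before running the tool. The following is a minimal sketch, not part of ReportNCheck itself: the helper name missing_packages is hypothetical, and it only checks that a package can be found by the current interpreter, not that its version is suitable.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the package names that the current interpreter cannot locate."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pandas is the only Python package listed in the prerequisites.
    missing = missing_packages(["pandas"])
    if missing:
        print("Missing Python packages:", ", ".join(missing))
        print("Install with: python3 -m pip install " + " ".join(missing))
    else:
        print("All required Python packages are available.")
```

Run the check with the same python3 interpreter that will run reportNcheck.py, since packages installed for a different interpreter or virtual environment will not be visible to it.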

Options and Usage

Use the -h argument to view all options. The table below describes the options listed by python3 reportNcheck.py -h.

| Option | Description |
|---|---|
| -h, --help | Shows the help message and exits. |
| -c, --check | Checks Gaudi/host NIC connectivity. |
| -v | Reports detailed information about the server, Gaudi devices, and Gaudi/host NICs. |
| -q QUAD, --quad QUAD | Specifies the QUAD (1 or 2) within which Gaudi NIC connectivity will be tested on a PCIe server. By default, the tool tests Gaudi NICs in both QUADs. |
| -d DEVICE, --device DEVICE | Specifies the device and port (e.g., hbl_0:24 or mlx5_1:1) used for internal Gaudi NIC or external Gaudi/host NIC connectivity checks. |
| -d2 DEVICE2, --device2 DEVICE2 | Specifies the second device and port (e.g., hbl_1:20) used for internal Gaudi NIC connectivity checks. |
| -s SERVER, --server SERVER | Specifies the server (e.g., sc09wynn01-hls2) used for external Gaudi/host NIC connectivity checks. |
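
When the tool is driven from a script, the options above can be assembled programmatically. The sketch below is illustrative only: build_command is a hypothetical helper, reportNcheck.py is assumed to be in the current working directory, and combining -c with the device or server options is an assumption based on the option descriptions rather than confirmed behavior. It only builds and prints the command line; it does not run the tool.

```python
import shlex
from typing import List, Optional

# Assumption: the script is in the current working directory.
SCRIPT = "reportNcheck.py"


def _check_device_spec(spec: str) -> None:
    """Validate the <device>:<port> form used in the table, e.g. hbl_0:24."""
    name, sep, port = spec.partition(":")
    if not (name and sep and port.isdigit()):
        raise ValueError(f"expected <device>:<port>, got {spec!r}")


def build_command(
    check: bool = False,
    verbose: bool = False,
    quad: Optional[int] = None,
    device: Optional[str] = None,
    device2: Optional[str] = None,
    server: Optional[str] = None,
) -> List[str]:
    """Build an argument list using only the options documented in the table."""
    cmd = ["python3", SCRIPT]
    if check:
        cmd.append("-c")
    if verbose:
        cmd.append("-v")
    if quad is not None:
        # The table documents QUAD values 1 and 2 only.
        if quad not in (1, 2):
            raise ValueError("quad must be 1 or 2")
        cmd += ["-q", str(quad)]
    for flag, spec in (("-d", device), ("-d2", device2)):
        if spec is not None:
            _check_device_spec(spec)
            cmd += [flag, spec]
    if server is not None:
        cmd += ["-s", server]
    return cmd


if __name__ == "__main__":
    # Internal check between two Gaudi NIC ports (values taken from the table).
    print(shlex.join(build_command(check=True, device="hbl_0:24", device2="hbl_1:20")))
    # External check against another server (values taken from the table).
    print(shlex.join(build_command(check=True, device="mlx5_1:1", server="sc09wynn01-hls2")))
```

The returned list can be passed to subprocess.run without a shell, which avoids quoting problems in device or server names.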

Usage Examples