Intel Gaudi ReportNCheck Tool
This document provides guidelines for installing and running the Intel® Gaudi® ReportNCheck tool on Gaudi accelerators. The tool collects system, card, and NIC information, and verifies connectivity for both scale-up and scale-out configurations.
Note
The ReportNCheck tool is included with, and located under, the RDMA PerfTest tool. Installing PerfTest automatically installs the ReportNCheck tool.
The tool is supported on Gaudi 3 and Gaudi 2, and can be used with their integrated NICs as well as host NICs.
The tool can be used in a container, on a VM, or directly on a bare metal machine.
Prerequisites¶
Make sure you have the following installed:
- pandas Python package
- Intel® Gaudi® RDMA PerfTest tool. Refer to the RDMA PerfTest tool section.
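Before running the tool, it can help to confirm that pandas is importable by the same interpreter that will run reportNcheck.py. The following is a minimal sketch; the helper name `has_module` is hypothetical and is not part of the tool.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be imported by the current interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # pandas is listed as a prerequisite of the ReportNCheck tool.
    if has_module("pandas"):
        print("pandas is available")
    else:
        print("pandas is missing; install it, e.g. python3 -m pip install pandas")
```

Run the check with the same `python3` binary that is used to launch the tool, since packages installed for one interpreter are not necessarily visible to another.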
Options and Usage¶
Use the -h argument to view all options. The table below describes the options listed by python3 reportNcheck.py -h.
| Option | Description |
|---|---|
| -h, --help | Shows the help message and exits. |
| -c, --check | Checks Gaudi/host NIC connectivity. |
| -v | Reports detailed information about the server, Gaudi devices, and Gaudi/host NICs. |
| -q QUAD, --quad QUAD | Specifies the QUAD (1 or 2) within which Gaudi NIC connectivity will be tested on a PCIe server. By default, the tool tests Gaudi NICs in both QUADs. |
| -d DEVICE, --device DEVICE | Specifies the device and port (e.g., hbl_0:24 or mlx5_1:1) used for internal Gaudi NIC or external Gaudi/host NIC connectivity checks. |
| -d2 DEVICE2, --device2 DEVICE2 | Specifies the second device and port (e.g., hbl_1:20) used for internal Gaudi NIC connectivity checks. |
| -s SERVER, --server SERVER | Specifies the server (e.g., sc09wynn01-hls2) used for external Gaudi/host NIC connectivity checks. |
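When the tool is driven from a script, the options in the table can be assembled into an argument list. The sketch below is only an illustration: `build_command` is a hypothetical helper, it assumes the script is invoked as `python3 reportNcheck.py` from the current directory, and it does not reproduce any validation the tool itself may perform.

```python
import shlex
from typing import List, Optional


def build_command(check: bool = False, verbose: bool = False,
                  quad: Optional[int] = None, device: Optional[str] = None,
                  device2: Optional[str] = None,
                  server: Optional[str] = None) -> List[str]:
    """Assemble an argv list from the documented options (hypothetical helper)."""
    if quad is not None and quad not in (1, 2):
        raise ValueError("quad must be 1 or 2")
    argv = ["python3", "reportNcheck.py"]
    if check:
        argv.append("-c")
    if verbose:
        argv.append("-v")
    if quad is not None:
        argv += ["-q", str(quad)]
    if device is not None:
        argv += ["-d", device]
    if device2 is not None:
        argv += ["-d2", device2]
    if server is not None:
        argv += ["-s", server]
    return argv


# Print the command line for a quad 2 connectivity check.
print(shlex.join(build_command(check=True, quad=2)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.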
Usage Examples¶
Example 1: python3 reportNcheck.py -h
python reportNcheck.py                                  report brief info
python reportNcheck.py -v                               report detailed info
python reportNcheck.py -c                               automatically check internal Gaudi NIC connectivity
python reportNcheck.py -c -q 2                          check Gaudi NIC connectivity of quad 2 on PCIe server
python reportNcheck.py -c -d hbl_1:10 -d2 hbl_6:10      manually specify device and port for internal Gaudi NIC connectivity check
manually specify device and port for external Gaudi NIC connectivity check:
    Server: python reportNcheck.py -c -d hbl_3:23
    Client: python reportNcheck.py -c -d hbl_7:9 -s SERVER_NAME
manually specify device and port for external Host NIC connectivity check:
    Server: python reportNcheck.py -c -d mlx5_1:1
    Client: python reportNcheck.py -c -d mlx5_0:1 -s SERVER_NAME
Example 2: python3 reportNcheck.py
Reports brief information about the server, Gaudi devices, and Gaudi/host NICs.
sc09wynn02-hls2
index device(NICs) module_id bus_addr
0 hbl_0 2 0000:33:00.0
Port 9 ACTIVE enp51s0d8
Port 23 ACTIVE enp51s0d22 169.254.80.75/16 UP
Port 24 ACTIVE enp51s0d23
1 hbl_1 0 0000:4d:00.0
Port 9 ACTIVE enp77s0d8
Port 23 ACTIVE enp77s0d22
Port 24 ACTIVE enp77s0d23
2 hbl_2 1 0000:4e:00.0
Port 9 ACTIVE enp78s0d8
Port 23 ACTIVE enp78s0d22
Port 24 ACTIVE enp78s0d23
3 hbl_3 3 0000:34:00.0
Port 9 ACTIVE enp52s0d8
Port 23 ACTIVE enp52s0d22
Port 24 ACTIVE enp52s0d23
4 hbl_4 6 0000:9a:00.0
Port 9 ACTIVE enp154s0d8
Port 23 ACTIVE enp154s0d22
Port 24 ACTIVE enp154s0d23
5 hbl_5 7 0000:9b:00.0
Port 9 ACTIVE enp155s0d8
Port 23 ACTIVE enp155s0d22
Port 24 ACTIVE enp155s0d23
6 hbl_6 4 0000:b3:00.0
Port 9 ACTIVE enp179s0d8
Port 23 ACTIVE enp179s0d22
Port 24 ACTIVE enp179s0d23
7 hbl_7 5 0000:b4:00.0
Port 9 ACTIVE enp180s0d8
Port 23 ACTIVE enp180s0d22
Port 24 ACTIVE enp180s0d23
HostNIC mlx5_0 0000:65:00.0
Port 1 ens7f0np0 192.168.100.2/24 UP
HostNIC mlx5_1 0000:65:00.1
Port 1 ens7f1np1 172.26.47.90/22 UP
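The brief report can be post-processed with a few lines of Python. The sketch below assumes the layout shown in the sample above (a device line followed by its `Port` lines) and splits on whitespace only; `parse_brief_report` and `SAMPLE` are hypothetical names, and the real output may contain additional lines that this parser simply ignores.

```python
from typing import Dict, List

# A shortened copy of the sample report, used only for illustration.
SAMPLE = """\
index device(NICs) module_id bus_addr
0 hbl_0 2 0000:33:00.0
Port 9 ACTIVE enp51s0d8
Port 23 ACTIVE enp51s0d22 169.254.80.75/16 UP
HostNIC mlx5_0 0000:65:00.0
Port 1 ens7f0np0 192.168.100.2/24 UP
"""


def parse_brief_report(text: str) -> Dict[str, List[dict]]:
    """Map each device name to the list of ports reported under it."""
    devices: Dict[str, List[dict]] = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        if tokens[0] == "HostNIC" or tokens[0].isdigit():
            # Device line: "<index> <device> ..." or "HostNIC <device> ...".
            current = tokens[1]
            devices[current] = []
        elif tokens[0] == "Port" and current is not None:
            # Port line: keep the port number and the remaining fields as-is.
            devices[current].append({"port": int(tokens[1]), "fields": tokens[2:]})
    return devices


for name, ports in parse_brief_report(SAMPLE).items():
    print(name, [p["port"] for p in ports])
```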
Example 3: python3 reportNcheck.py -v
Reports details of the server, cards, and NICs, including GIDs, MAC addresses, and IP addresses.
Collecting info...
Gaudi2 OAM: sc09wynn02-hls2
index device(NICs) module_id bus_addr
0 hbl_0 2 0000:33:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp51s0d8 b0:fd:0b:d6:12:c1 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:12c1, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:12c1, RoCE v2
GID 2: fe80:0000:0000:0000:6d4d:aa05:438c:338d, IB/RoCE v1
GID 3: fe80:0000:0000:0000:6d4d:aa05:438c:338d, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp51s0d22 b0:fd:0b:d6:12:cf UP 169.254.80.75/16 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:12cf, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:12cf, RoCE v2
GID 2: 0000:0000:0000:0000:0000:ffff:a9fe:504b, IB/RoCE v1
GID 3: 0000:0000:0000:0000:0000:ffff:a9fe:504b, RoCE v2
Port 24 ACTIVE enp51s0d23 b0:fd:0b:d6:12:d0 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:12d0, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:12d0, RoCE v2
GID 2: fe80:0000:0000:0000:ee61:05cf:2003:2622, IB/RoCE v1
GID 3: fe80:0000:0000:0000:ee61:05cf:2003:2622, RoCE v2
1 hbl_1 0 0000:4d:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp77s0d8 b0:fd:0b:d6:11:d1 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:11d1, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:11d1, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp77s0d22 b0:fd:0b:d6:11:df UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:11df, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:11df, RoCE v2
Port 24 ACTIVE enp77s0d23 b0:fd:0b:d6:11:e0 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:11e0, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:11e0, RoCE v2
2 hbl_2 1 0000:4e:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp78s0d8 b0:fd:0b:d6:11:41 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1141, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1141, RoCE v2
GID 2: fe80:0000:0000:0000:f50b:fab9:0701:0995, IB/RoCE v1
GID 3: fe80:0000:0000:0000:f50b:fab9:0701:0995, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp78s0d22 b0:fd:0b:d6:11:4f UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:114f, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:114f, RoCE v2
GID 2: fe80:0000:0000:0000:6cea:4b01:4d3a:8521, IB/RoCE v1
GID 3: fe80:0000:0000:0000:6cea:4b01:4d3a:8521, RoCE v2
Port 24 ACTIVE enp78s0d23 b0:fd:0b:d6:11:50 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1150, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1150, RoCE v2
GID 2: fe80:0000:0000:0000:1e6f:849d:cff8:4dbf, IB/RoCE v1
GID 3: fe80:0000:0000:0000:1e6f:849d:cff8:4dbf, RoCE v2
3 hbl_3 3 0000:34:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp52s0d8 b0:fd:0b:d6:0f:91 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:0f91, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:0f91, RoCE v2
GID 2: fe80:0000:0000:0000:9a0b:a21e:a76c:7959, IB/RoCE v1
GID 3: fe80:0000:0000:0000:9a0b:a21e:a76c:7959, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp52s0d22 b0:fd:0b:d6:0f:9f UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:0f9f, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:0f9f, RoCE v2
GID 2: fe80:0000:0000:0000:6bcf:2298:9bdc:d227, IB/RoCE v1
GID 3: fe80:0000:0000:0000:6bcf:2298:9bdc:d227, RoCE v2
Port 24 ACTIVE enp52s0d23 b0:fd:0b:d6:0f:a0 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:0fa0, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:0fa0, RoCE v2
GID 2: fe80:0000:0000:0000:bf34:15a4:474b:9d8d, IB/RoCE v1
GID 3: fe80:0000:0000:0000:bf34:15a4:474b:9d8d, RoCE v2
4 hbl_4 6 0000:9a:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp154s0d8 b0:fd:0b:d6:fc:f9 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:fcf9, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:fcf9, RoCE v2
GID 2: fe80:0000:0000:0000:3c53:1f65:1ab6:332b, IB/RoCE v1
GID 3: fe80:0000:0000:0000:3c53:1f65:1ab6:332b, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp154s0d22 b0:fd:0b:d6:fd:07 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:fd07, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:fd07, RoCE v2
GID 2: fe80:0000:0000:0000:cefa:1cfe:dc06:67f6, IB/RoCE v1
GID 3: fe80:0000:0000:0000:cefa:1cfe:dc06:67f6, RoCE v2
Port 24 ACTIVE enp154s0d23 b0:fd:0b:d6:fd:08 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:fd08, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:fd08, RoCE v2
GID 2: fe80:0000:0000:0000:c711:6ac6:2147:2c50, IB/RoCE v1
GID 3: fe80:0000:0000:0000:c711:6ac6:2147:2c50, RoCE v2
5 hbl_5 7 0000:9b:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp155s0d8 b0:fd:0b:d6:1c:e1 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1ce1, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1ce1, RoCE v2
GID 2: fe80:0000:0000:0000:3e42:9f41:8ef4:80eb, IB/RoCE v1
GID 3: fe80:0000:0000:0000:3e42:9f41:8ef4:80eb, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp155s0d22 b0:fd:0b:d6:1c:ef UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1cef, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1cef, RoCE v2
GID 2: fe80:0000:0000:0000:c828:82c6:f129:9bab, IB/RoCE v1
GID 3: fe80:0000:0000:0000:c828:82c6:f129:9bab, RoCE v2
Port 24 ACTIVE enp155s0d23 b0:fd:0b:d6:1c:f0 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1cf0, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1cf0, RoCE v2
GID 2: fe80:0000:0000:0000:6b0e:e95b:b9db:e6de, IB/RoCE v1
GID 3: fe80:0000:0000:0000:6b0e:e95b:b9db:e6de, RoCE v2
6 hbl_6 4 0000:b3:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp179s0d8 b0:fd:0b:d6:10:21 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1021, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1021, RoCE v2
GID 2: fe80:0000:0000:0000:4c6a:f15e:2585:c7a2, IB/RoCE v1
GID 3: fe80:0000:0000:0000:4c6a:f15e:2585:c7a2, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp179s0d22 b0:fd:0b:d6:10:2f UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:102f, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:102f, RoCE v2
GID 2: fe80:0000:0000:0000:1ce1:349c:9609:8235, IB/RoCE v1
GID 3: fe80:0000:0000:0000:1ce1:349c:9609:8235, RoCE v2
Port 24 ACTIVE enp179s0d23 b0:fd:0b:d6:10:30 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:1030, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:1030, RoCE v2
GID 2: fe80:0000:0000:0000:c8ac:1510:e4cb:da89, IB/RoCE v1
GID 3: fe80:0000:0000:0000:c8ac:1510:e4cb:da89, RoCE v2
7 hbl_7 5 0000:b4:00.0
Port 1 ACTIVE
Port 2 ACTIVE
Port 3 ACTIVE
Port 4 ACTIVE
Port 5 ACTIVE
Port 6 ACTIVE
Port 7 ACTIVE
Port 8 ACTIVE
Port 9 ACTIVE enp180s0d8 b0:fd:0b:d6:0b:41 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:0b41, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:0b41, RoCE v2
GID 2: fe80:0000:0000:0000:0f0d:3fe8:f1ba:17de, IB/RoCE v1
GID 3: fe80:0000:0000:0000:0f0d:3fe8:f1ba:17de, RoCE v2
Port 10 ACTIVE
Port 11 ACTIVE
Port 12 ACTIVE
Port 13 ACTIVE
Port 14 ACTIVE
Port 15 ACTIVE
Port 16 ACTIVE
Port 17 ACTIVE
Port 18 ACTIVE
Port 19 ACTIVE
Port 20 ACTIVE
Port 21 ACTIVE
Port 22 ACTIVE
Port 23 ACTIVE enp180s0d22 b0:fd:0b:d6:0b:4f UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:0b4f, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:0b4f, RoCE v2
GID 2: fe80:0000:0000:0000:9ac0:5abc:1cfc:6b75, IB/RoCE v1
GID 3: fe80:0000:0000:0000:9ac0:5abc:1cfc:6b75, RoCE v2
Port 24 ACTIVE enp180s0d23 b0:fd:0b:d6:0b:50 UP
GID 0: fe80:0000:0000:0000:b2fd:0bff:fed6:0b50, IB/RoCE v1
GID 1: fe80:0000:0000:0000:b2fd:0bff:fed6:0b50, RoCE v2
GID 2: fe80:0000:0000:0000:939f:f569:928f:fe08, IB/RoCE v1
GID 3: fe80:0000:0000:0000:939f:f569:928f:fe08, RoCE v2
HostNIC mlx5_0 0000:65:00.0
Port 1 ens7f0np0 b8:ce:f6:92:6b:8a UP 192.168.100.2/24 UP
GID 0: fe80:0000:0000:0000:bace:f6ff:fe92:6b8a, IB/RoCE v1
GID 1: fe80:0000:0000:0000:bace:f6ff:fe92:6b8a, RoCE v2
GID 2: 0000:0000:0000:0000:0000:ffff:c0a8:6402, IB/RoCE v1
GID 3: 0000:0000:0000:0000:0000:ffff:c0a8:6402, RoCE v2
HostNIC mlx5_1 0000:65:00.1
Port 1 ens7f1np1 b8:ce:f6:92:6b:8b UP 172.26.47.90/22 UP
GID 0: fe80:0000:0000:0000:bace:f6ff:fe92:6b8b, IB/RoCE v1
GID 1: fe80:0000:0000:0000:bace:f6ff:fe92:6b8b, RoCE v2
GID 2: fe80:0000:0000:0000:6762:eef8:f20b:e87c, IB/RoCE v1
GID 3: fe80:0000:0000:0000:6762:eef8:f20b:e87c, RoCE v2
GID 4: 0000:0000:0000:0000:0000:ffff:ac1a:2f5a, IB/RoCE v1
GID 5: 0000:0000:0000:0000:0000:ffff:ac1a:2f5a, RoCE v2
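Two patterns are visible in the sample GID values above. GIDs 0 and 1 match the link-local address obtained from the port MAC address with the modified EUI-64 rule, and GIDs that begin with `0000:...:ffff:` embed the IPv4 address of the interface. The sketch below reproduces both relationships as an aid for reading the report. It is an observation about this sample, not a description of how the tool or driver fills the GID table, and both function names are hypothetical.

```python
import ipaddress


def link_local_gid_from_mac(mac: str) -> str:
    """Build the fe80::/64 GID from a MAC using the modified EUI-64 rule."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 6-octet MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    groups = [0xFE80, 0, 0, 0] + [
        (eui64[i] << 8) | eui64[i + 1] for i in range(0, 8, 2)
    ]
    return ":".join(f"{g:04x}" for g in groups)


def ipv4_from_mapped_gid(gid: str):
    """Return the IPv4 address embedded in an IPv4-mapped GID, else None."""
    return ipaddress.IPv6Address(gid).ipv4_mapped


# MAC of hbl_0 port 9 and GID 2 of hbl_0 port 23, copied from the sample.
print(link_local_gid_from_mac("b0:fd:0b:d6:12:c1"))
print(ipv4_from_mapped_gid("0000:0000:0000:0000:0000:ffff:a9fe:504b"))
```

For the sample values, the first call gives the GID 0 string listed for port 9 of hbl_0, and the second gives 169.254.80.75, the IP address listed for port 23 of hbl_0.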
Example 4: python3 reportNcheck.py -c
Checks internal Gaudi NIC connectivity.
Collecting info...
Testing internal connectivity
0: (hbl_1, port 8)<------ Connected ------>(hbl_6, port 8)
1: (hbl_1, port 7)<------ Connected ------>(hbl_6, port 7)
2: (hbl_1, port 10)<------ Connected ------>(hbl_6, port 10)
3: (hbl_1, port 3)<------ Connected ------>(hbl_5, port 3)
4: (hbl_1, port 6)<------ Connected ------>(hbl_5, port 6)
5: (hbl_1, port 5)<------ Connected ------>(hbl_5, port 5)
6: (hbl_1, port 2)<------ Connected ------>(hbl_3, port 2)
7: (hbl_1, port 1)<------ Connected ------>(hbl_3, port 1)
8: (hbl_1, port 4)<------ Connected ------>(hbl_3, port 4)
9: (hbl_1, port 13)<------ Connected ------>(hbl_0, port 19)
10: (hbl_1, port 12)<------ Connected ------>(hbl_0, port 18)
11: (hbl_1, port 11)<------ Connected ------>(hbl_0, port 17)
12: (hbl_1, port 16)<------ Connected ------>(hbl_2, port 16)
13: (hbl_1, port 15)<------ Connected ------>(hbl_2, port 15)
14: (hbl_1, port 14)<------ Connected ------>(hbl_2, port 14)
15: (hbl_1, port 19)<------ Connected ------>(hbl_4, port 19)
16: (hbl_1, port 18)<------ Connected ------>(hbl_4, port 18)
17: (hbl_1, port 17)<------ Connected ------>(hbl_4, port 17)
18: (hbl_1, port 22)<------ Connected ------>(hbl_7, port 22)
19: (hbl_1, port 21)<------ Connected ------>(hbl_7, port 21)
20: (hbl_1, port 20)<------ Connected ------>(hbl_7, port 20)
21: (hbl_2, port 8)<------ Connected ------>(hbl_7, port 8)
22: (hbl_2, port 7)<------ Connected ------>(hbl_7, port 7)
23: (hbl_2, port 10)<------ Connected ------>(hbl_7, port 10)
24: (hbl_2, port 3)<------ Connected ------>(hbl_4, port 3)
25: (hbl_2, port 6)<------ Connected ------>(hbl_4, port 6)
26: (hbl_2, port 5)<------ Connected ------>(hbl_4, port 5)
27: (hbl_2, port 2)<------ Connected ------>(hbl_0, port 2)
28: (hbl_2, port 1)<------ Connected ------>(hbl_0, port 1)
29: (hbl_2, port 4)<------ Connected ------>(hbl_0, port 4)
30: (hbl_2, port 13)<------ Connected ------>(hbl_5, port 13)
31: (hbl_2, port 12)<------ Connected ------>(hbl_5, port 12)
32: (hbl_2, port 11)<------ Connected ------>(hbl_5, port 11)
33: (hbl_2, port 19)<------ Connected ------>(hbl_3, port 19)
34: (hbl_2, port 18)<------ Connected ------>(hbl_3, port 18)
35: (hbl_2, port 17)<------ Connected ------>(hbl_3, port 17)
36: (hbl_2, port 22)<------ Connected ------>(hbl_6, port 22)
37: (hbl_2, port 21)<------ Connected ------>(hbl_6, port 21)
38: (hbl_2, port 20)<------ Connected ------>(hbl_6, port 20)
39: (hbl_0, port 8)<------ Connected ------>(hbl_4, port 8)
40: (hbl_0, port 7)<------ Connected ------>(hbl_4, port 7)
41: (hbl_0, port 10)<------ Connected ------>(hbl_4, port 10)
42: (hbl_0, port 3)<------ Connected ------>(hbl_7, port 3)
43: (hbl_0, port 6)<------ Connected ------>(hbl_7, port 6)
44: (hbl_0, port 5)<------ Connected ------>(hbl_7, port 5)
45: (hbl_0, port 13)<------ Connected ------>(hbl_3, port 13)
46: (hbl_0, port 12)<------ Connected ------>(hbl_3, port 12)
47: (hbl_0, port 11)<------ Connected ------>(hbl_3, port 11)
48: (hbl_0, port 16)<------ Connected ------>(hbl_6, port 16)
49: (hbl_0, port 15)<------ Connected ------>(hbl_6, port 15)
50: (hbl_0, port 14)<------ Connected ------>(hbl_6, port 14)
51: (hbl_0, port 22)<------ Connected ------>(hbl_5, port 22)
52: (hbl_0, port 21)<------ Connected ------>(hbl_5, port 21)
53: (hbl_0, port 20)<------ Connected ------>(hbl_5, port 20)
54: (hbl_3, port 8)<------ Connected ------>(hbl_5, port 8)
55: (hbl_3, port 7)<------ Connected ------>(hbl_5, port 7)
56: (hbl_3, port 10)<------ Connected ------>(hbl_5, port 10)
57: (hbl_3, port 3)<------ Connected ------>(hbl_6, port 3)
58: (hbl_3, port 6)<------ Connected ------>(hbl_6, port 6)
59: (hbl_3, port 5)<------ Connected ------>(hbl_6, port 5)
60: (hbl_3, port 16)<------ Connected ------>(hbl_7, port 16)
61: (hbl_3, port 15)<------ Connected ------>(hbl_7, port 15)
62: (hbl_3, port 14)<------ Connected ------>(hbl_7, port 14)
63: (hbl_3, port 22)<------ Connected ------>(hbl_4, port 22)
64: (hbl_3, port 21)<------ Connected ------>(hbl_4, port 21)
65: (hbl_3, port 20)<------ Connected ------>(hbl_4, port 20)
66: (hbl_6, port 2)<------ Connected ------>(hbl_5, port 2)
67: (hbl_6, port 1)<------ Connected ------>(hbl_5, port 1)
68: (hbl_6, port 4)<------ Connected ------>(hbl_5, port 4)
69: (hbl_6, port 13)<------ Connected ------>(hbl_7, port 13)
70: (hbl_6, port 12)<------ Connected ------>(hbl_7, port 12)
71: (hbl_6, port 11)<------ Connected ------>(hbl_7, port 11)
72: (hbl_6, port 19)<------ Connected ------>(hbl_4, port 13)
73: (hbl_6, port 18)<------ Connected ------>(hbl_4, port 12)
74: (hbl_6, port 17)<------ Connected ------>(hbl_4, port 11)
75: (hbl_7, port 2)<------ Connected ------>(hbl_4, port 2)
76: (hbl_7, port 1)<------ Connected ------>(hbl_4, port 1)
77: (hbl_7, port 4)<------ Connected ------>(hbl_4, port 4)
78: (hbl_7, port 19)<------ Connected ------>(hbl_5, port 19)
79: (hbl_7, port 18)<------ Connected ------>(hbl_5, port 18)
80: (hbl_7, port 17)<------ Connected ------>(hbl_5, port 17)
81: (hbl_4, port 16)<------ Connected ------>(hbl_5, port 16)
82: (hbl_4, port 15)<------ Connected ------>(hbl_5, port 15)
83: (hbl_4, port 14)<------ Connected ------>(hbl_5, port 14)
84 connected.
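A script that consumes this output can cross-check the number of `Connected` lines against the summary line. The sketch below assumes the line format shown in the sample above; the regular expressions and the name `parse_sections` are hypothetical, and lines in any other format (for example, a line reporting a failed link, whose wording is not shown in this document) are ignored.

```python
import re
from typing import List, Tuple

Link = Tuple[str, int, str, int]

LINK_RE = re.compile(
    r"^\s*\d+:\s*\((\w+), port (\d+)\)<-+ Connected -+>\((\w+), port (\d+)\)\s*$"
)
SUMMARY_RE = re.compile(r"^\s*(\d+) connected\.\s*$")

# Two links copied from the sample, with a matching summary line.
SAMPLE = """\
Collecting info...
Testing internal connectivity
0: (hbl_1, port 8)<------ Connected ------>(hbl_6, port 8)
1: (hbl_1, port 7)<------ Connected ------>(hbl_6, port 7)
2 connected.
"""


def parse_sections(text: str) -> List[Tuple[List[Link], int]]:
    """Return one (links, reported_count) tuple per summary line."""
    sections = []
    links: List[Link] = []
    for line in text.splitlines():
        link = LINK_RE.match(line)
        if link:
            dev_a, port_a, dev_b, port_b = link.groups()
            links.append((dev_a, int(port_a), dev_b, int(port_b)))
            continue
        summary = SUMMARY_RE.match(line)
        if summary:
            sections.append((links, int(summary.group(1))))
            links = []  # start collecting the next section, if any
    return sections


for links, reported in parse_sections(SAMPLE):
    print(len(links), "links parsed,", reported, "reported")
```

Because links are grouped per summary line, the same function also handles output that contains more than one section.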
Example 5: python3 reportNcheck.py -c on PCIe Server
Checks Gaudi NIC connectivity on a PCIe server. When neither -q nor --quad is given on the command line, the tool tests Gaudi NICs in both quad 1 and quad 2 by default:
Collecting info...
Test internal connectivity on PCIe server quad 1:
0: (hbl_3, port 1)<------ Connected ------>(hbl_1, port 1)
1: (hbl_3, port 2)<------ Connected ------>(hbl_1, port 2)
2: (hbl_3, port 3)<------ Connected ------>(hbl_1, port 3)
3: (hbl_3, port 4)<------ Connected ------>(hbl_1, port 4)
4: (hbl_3, port 5)<------ Connected ------>(hbl_1, port 5)
5: (hbl_3, port 6)<------ Connected ------>(hbl_1, port 6)
6: (hbl_3, port 7)<------ Connected ------>(hbl_0, port 7)
7: (hbl_3, port 8)<------ Connected ------>(hbl_0, port 8)
8: (hbl_3, port 9)<------ Connected ------>(hbl_0, port 9)
9: (hbl_3, port 10)<------ Connected ------>(hbl_0, port 10)
10: (hbl_3, port 11)<------ Connected ------>(hbl_2, port 11)
11: (hbl_3, port 12)<------ Connected ------>(hbl_2, port 12)
12: (hbl_3, port 13)<------ Connected ------>(hbl_0, port 13)
13: (hbl_3, port 14)<------ Connected ------>(hbl_0, port 14)
14: (hbl_3, port 15)<------ Connected ------>(hbl_2, port 15)
15: (hbl_3, port 16)<------ Connected ------>(hbl_2, port 16)
16: (hbl_3, port 17)<------ Connected ------>(hbl_2, port 17)
17: (hbl_3, port 18)<------ Connected ------>(hbl_2, port 18)
18: (hbl_1, port 7)<------ Connected ------>(hbl_2, port 7)
19: (hbl_1, port 8)<------ Connected ------>(hbl_2, port 8)
20: (hbl_1, port 9)<------ Connected ------>(hbl_2, port 9)
21: (hbl_1, port 10)<------ Connected ------>(hbl_2, port 10)
22: (hbl_1, port 11)<------ Connected ------>(hbl_0, port 11)
23: (hbl_1, port 12)<------ Connected ------>(hbl_0, port 12)
24: (hbl_1, port 13)<------ Connected ------>(hbl_2, port 13)
25: (hbl_1, port 14)<------ Connected ------>(hbl_2, port 14)
26: (hbl_1, port 15)<------ Connected ------>(hbl_0, port 15)
27: (hbl_1, port 16)<------ Connected ------>(hbl_0, port 16)
28: (hbl_1, port 17)<------ Connected ------>(hbl_0, port 17)
29: (hbl_1, port 18)<------ Connected ------>(hbl_0, port 18)
30: (hbl_2, port 1)<------ Connected ------>(hbl_0, port 1)
31: (hbl_2, port 2)<------ Connected ------>(hbl_0, port 2)
32: (hbl_2, port 3)<------ Connected ------>(hbl_0, port 3)
33: (hbl_2, port 4)<------ Connected ------>(hbl_0, port 4)
34: (hbl_2, port 5)<------ Connected ------>(hbl_0, port 5)
35: (hbl_2, port 6)<------ Connected ------>(hbl_0, port 6)
36 connected.
Test internal connectivity on PCIe server quad 2:
0: (hbl_5, port 1)<------ Connected ------>(hbl_7, port 1)
1: (hbl_5, port 2)<------ Connected ------>(hbl_7, port 2)
2: (hbl_5, port 3)<------ Connected ------>(hbl_7, port 3)
3: (hbl_5, port 4)<------ Connected ------>(hbl_7, port 4)
4: (hbl_5, port 5)<------ Connected ------>(hbl_7, port 5)
5: (hbl_5, port 6)<------ Connected ------>(hbl_7, port 6)
6: (hbl_5, port 7)<------ Connected ------>(hbl_4, port 7)
7: (hbl_5, port 8)<------ Connected ------>(hbl_4, port 8)
8: (hbl_5, port 9)<------ Connected ------>(hbl_4, port 9)
9: (hbl_5, port 10)<------ Connected ------>(hbl_4, port 10)
10: (hbl_5, port 11)<------ Connected ------>(hbl_6, port 11)
11: (hbl_5, port 12)<------ Connected ------>(hbl_6, port 12)
12: (hbl_5, port 13)<------ Connected ------>(hbl_4, port 13)
13: (hbl_5, port 14)<------ Connected ------>(hbl_4, port 14)
14: (hbl_5, port 15)<------ Connected ------>(hbl_6, port 15)
15: (hbl_5, port 16)<------ Connected ------>(hbl_6, port 16)
16: (hbl_5, port 17)<------ Connected ------>(hbl_6, port 17)
17: (hbl_5, port 18)<------ Connected ------>(hbl_6, port 18)
18: (hbl_7, port 7)<------ Connected ------>(hbl_6, port 7)
19: (hbl_7, port 8)<------ Connected ------>(hbl_6, port 8)
20: (hbl_7, port 9)<------ Connected ------>(hbl_6, port 9)
21: (hbl_7, port 10)<------ Connected ------>(hbl_6, port 10)
22: (hbl_7, port 11)<------ Connected ------>(hbl_4, port 11)
23: (hbl_7, port 12)<------ Connected ------>(hbl_4, port 12)
24: (hbl_7, port 13)<------ Connected ------>(hbl_6, port 13)
25: (hbl_7, port 14)<------ Connected ------>(hbl_6, port 14)
26: (hbl_7, port 15)<------ Connected ------>(hbl_4, port 15)
27: (hbl_7, port 16)<------ Connected ------>(hbl_4, port 16)
28: (hbl_7, port 17)<------ Connected ------>(hbl_4, port 17)
29: (hbl_7, port 18)<------ Connected ------>(hbl_4, port 18)
30: (hbl_6, port 1)<------ Connected ------>(hbl_4, port 1)
31: (hbl_6, port 2)<------ Connected ------>(hbl_4, port 2)
32: (hbl_6, port 3)<------ Connected ------>(hbl_4, port 3)
33: (hbl_6, port 4)<------ Connected ------>(hbl_4, port 4)
34: (hbl_6, port 5)<------ Connected ------>(hbl_4, port 5)
35: (hbl_6, port 6)<------ Connected ------>(hbl_4, port 6)
36 connected.
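To see how the links in a quad are distributed, the connections can be counted per device pair. The sketch below assumes the links have already been extracted as `(device, port, device, port)` tuples; `links_per_pair` is a hypothetical helper and the input is a small excerpt of the quad 1 sample above.

```python
from collections import Counter
from typing import Iterable, Tuple

Link = Tuple[str, int, str, int]


def links_per_pair(links: Iterable[Link]) -> Counter:
    """Count links between each unordered pair of devices."""
    counts: Counter = Counter()
    for dev_a, _port_a, dev_b, _port_b in links:
        # Sort the names so (hbl_3, hbl_1) and (hbl_1, hbl_3) share one key.
        counts[tuple(sorted((dev_a, dev_b)))] += 1
    return counts


# A few links copied from the quad 1 sample output.
sample = [
    ("hbl_3", 1, "hbl_1", 1),
    ("hbl_3", 2, "hbl_1", 2),
    ("hbl_3", 7, "hbl_0", 7),
]
for pair, count in sorted(links_per_pair(sample).items()):
    print(pair, count)
```

In the full quad 1 sample, each of the six device pairs among hbl_0 through hbl_3 has six links, which accounts for the 36 reported connections.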
Example 6: python3 reportNcheck.py -c -q 2 on PCIe Server
Tests Gaudi NICs in quad 2 only, as selected by -q 2 (or --quad 2) on the command line:
Collecting info...
Test internal connectivity on PCIe server quad 2:
0: (hbl_5, port 1)<------ Connected ------>(hbl_7, port 1)
1: (hbl_5, port 2)<------ Connected ------>(hbl_7, port 2)
2: (hbl_5, port 3)<------ Connected ------>(hbl_7, port 3)
3: (hbl_5, port 4)<------ Connected ------>(hbl_7, port 4)
4: (hbl_5, port 5)<------ Connected ------>(hbl_7, port 5)
5: (hbl_5, port 6)<------ Connected ------>(hbl_7, port 6)
6: (hbl_5, port 7)<------ Connected ------>(hbl_4, port 7)
7: (hbl_5, port 8)<------ Connected ------>(hbl_4, port 8)
8: (hbl_5, port 9)<------ Connected ------>(hbl_4, port 9)
9: (hbl_5, port 10)<------ Connected ------>(hbl_4, port 10)
10: (hbl_5, port 11)<------ Connected ------>(hbl_6, port 11)
11: (hbl_5, port 12)<------ Connected ------>(hbl_6, port 12)
12: (hbl_5, port 13)<------ Connected ------>(hbl_4, port 13)
13: (hbl_5, port 14)<------ Connected ------>(hbl_4, port 14)
14: (hbl_5, port 15)<------ Connected ------>(hbl_6, port 15)
15: (hbl_5, port 16)<------ Connected ------>(hbl_6, port 16)
16: (hbl_5, port 17)<------ Connected ------>(hbl_6, port 17)
17: (hbl_5, port 18)<------ Connected ------>(hbl_6, port 18)
18: (hbl_7, port 7)<------ Connected ------>(hbl_6, port 7)
19: (hbl_7, port 8)<------ Connected ------>(hbl_6, port 8)
20: (hbl_7, port 9)<------ Connected ------>(hbl_6, port 9)
21: (hbl_7, port 10)<------ Connected ------>(hbl_6, port 10)
22: (hbl_7, port 11)<------ Connected ------>(hbl_4, port 11)
23: (hbl_7, port 12)<------ Connected ------>(hbl_4, port 12)
24: (hbl_7, port 13)<------ Connected ------>(hbl_6, port 13)
25: (hbl_7, port 14)<------ Connected ------>(hbl_6, port 14)
26: (hbl_7, port 15)<------ Connected ------>(hbl_4, port 15)
27: (hbl_7, port 16)<------ Connected ------>(hbl_4, port 16)
28: (hbl_7, port 17)<------ Connected ------>(hbl_4, port 17)
29: (hbl_7, port 18)<------ Connected ------>(hbl_4, port 18)
30: (hbl_6, port 1)<------ Connected ------>(hbl_4, port 1)
31: (hbl_6, port 2)<------ Connected ------>(hbl_4, port 2)
32: (hbl_6, port 3)<------ Connected ------>(hbl_4, port 3)
33: (hbl_6, port 4)<------ Connected ------>(hbl_4, port 4)
34: (hbl_6, port 5)<------ Connected ------>(hbl_4, port 5)
35: (hbl_6, port 6)<------ Connected ------>(hbl_4, port 6)
36 connected.
Example 7: python3 reportNcheck.py -c -d <device_name>:<port_number> -d2 <device2_name>:<port2_number>
Manually specify the device and port for internal Gaudi NIC connectivity checks.
When used with the -c option, -d and -d2 define the Gaudi devices and ports in the format <device_name>:<port_number>.
Note that the port number provided by -d or -d2 starts from 1 and not 0. For example:
python3 reportNcheck.py -c -d hbl_1:10 -d2 hbl_6:10
Collecting info...
Testing internal connectivity (manually specify)
0: (hbl_1, port 10)<------ Connected ------>(hbl_6, port 10)
1 connected.
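A wrapper script can validate a `<device_name>:<port_number>` value before passing it to -d or -d2. The following is a minimal sketch of such a check; `parse_device_port` is a hypothetical helper, and the only rule it enforces beyond the format is the one stated above, that port numbers start from 1.

```python
from typing import Tuple


def parse_device_port(spec: str) -> Tuple[str, int]:
    """Split '<device_name>:<port_number>' and check that the port is 1-based."""
    device, sep, port_text = spec.rpartition(":")
    if not sep or not device or not port_text.isdigit():
        raise ValueError(f"expected <device_name>:<port_number>, got {spec!r}")
    port = int(port_text)
    if port < 1:
        raise ValueError("port numbers start from 1, not 0")
    return device, port


print(parse_device_port("hbl_1:10"))
print(parse_device_port("mlx5_1:1"))
```

The function does not check that the device exists or that the port number is within range for that device; those checks are left to the tool.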
Example 8: python3 reportNcheck.py -c -d <device_name>:<port_number> [-s SERVER_NAME] - External Gaudi NIC Connectivity
Manually specify the device and port for an external Gaudi NIC connectivity check between (SERVER_NAME, hbl_3, port 23) and (CLIENT_NAME, hbl_7, port 9).
On the server, run python3 reportNcheck.py -c -d <device_name>:<port_number>. For example, python3 reportNcheck.py -c -d hbl_3:23:
Collecting info...
Testing external connectivity
This is server...
1: (SERVER_NAME, hbl_3, port 23)<------ Connected ------>(device and port on client)
On the client, run python3 reportNcheck.py -c -d <device_name>:<port_number> -s SERVER_NAME. For example, python3 reportNcheck.py -c -d hbl_7:9 -s SERVER_NAME:
Collecting info...
Testing external connectivity
This is client...
1: (CLIENT_NAME, hbl_7, port 9)<------ Connected ------>(device and port on server SERVER_NAME)
Example 9: python3 reportNcheck.py -c -d <device_name>:<port_number> [-s SERVER_NAME] - External Host NIC Connectivity
Manually specify the device and port for an external host NIC connectivity check between (SERVER_NAME, mlx5_1, port 1) and (CLIENT_NAME, mlx5_1, port 1).
On the server, run python3 reportNcheck.py -c -d <device_name>:<port_number>. For example, python3 reportNcheck.py -c -d mlx5_1:1:
Collecting info...
Testing external connectivity with Host NIC mlx5_1:1
This is server...
1: (SERVER_NAME, mlx5_1, port 1)<------ Connected ------>(device and port on client)
On the client, run python3 reportNcheck.py -c -d <device_name>:<port_number> -s SERVER_NAME. For example, python3 reportNcheck.py -c -d mlx5_1:1 -s SERVER_NAME:
Collecting info...
Testing external connectivity with Host NIC mlx5_1:1
This is client...
1: (CLIENT_NAME, mlx5_1, port 1)<------ Connected ------>(device and port on server SERVER_NAME)