Intel Gaudi RDMA PerfTest Tool

This document provides guidelines for installing and running the Intel® Gaudi® RDMA PerfTest tool, habanalabs-perf-test, on Gaudi accelerators. The tool performs low-level, high-performance connectivity testing through ping-pong, bandwidth, and latency tests. It uses the Reliable Connection (RC) method and RDMA Write operations to take its measurements.

Note

  • The tool is supported only with Gaudi 3.

  • The tool can only be used in a container and is not supported on a VM.

  • The tool can only be used with Gaudi 3 NICs.

Prerequisites

Make sure you have the following packages installed:

  • habanalabs-dkms

  • habanalabs-rdma-core

For more information about package installation, see Custom Driver and Software Installation.

Note

If you have upgraded to the 1.20.0 software version, the above packages are already included, and no additional installation is required.

Installation

Ubuntu 24.04

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/debian/noble/pool/main/h/habanalabs-perf-test/habanalabs-perf-test_1.20.0-543_amd64.deb

  2. Install the package file:

    sudo apt install ./habanalabs-perf-test_1.20.0-543_amd64.deb -y

Ubuntu 22.04

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/debian/jammy/pool/main/h/habanalabs-perf-test/habanalabs-perf-test_1.20.0-543_amd64.deb

  2. Install the package file:

    sudo apt install ./habanalabs-perf-test_1.20.0-543_amd64.deb -y

RHEL 8.6

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/rhel/8/8.6/habanalabs-perf-test-1.20.0-543.el8.x86_64.rpm

  2. Install the package file:

    sudo dnf install ./habanalabs-perf-test-1.20.0-543.el8.x86_64.rpm -y

RHEL 9.2

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/rhel/9/9.2/habanalabs-perf-test-1.20.0-543.el9.x86_64.rpm

  2. Install the package file:

    sudo dnf install ./habanalabs-perf-test-1.20.0-543.el9.x86_64.rpm -y

RHEL 9.4

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/rhel/9/9.4/habanalabs-perf-test-1.20.0-543.el9.x86_64.rpm

  2. Install the package file:

    sudo dnf install ./habanalabs-perf-test-1.20.0-543.el9.x86_64.rpm -y

TencentOS 3.1

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/tencentos/3/3.1/habanalabs-perf-test-1.20.0-543.tl3.x86_64.rpm

  2. Install the package file:

    sudo dnf install ./habanalabs-perf-test-1.20.0-543.tl3.x86_64.rpm -y

SLES 15.5

  1. Download the package file:

    wget https://vault.habana.ai/artifactory/sles/15/15.5/habanalabs-perf-test-1.20.0-543.x86_64.rpm

  2. Install the package file:

    sudo zypper install ./habanalabs-perf-test-1.20.0-543.x86_64.rpm -y
    

Options and Usage

The RDMA PerfTest tool is executed using the cloud_run.py Python wrapper script. The script runs the tests across an entire data center, covering all pairwise permutations of the tested nodes for thorough evaluation.
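
To give a feel for how the number of runs grows with the number of nodes, the sketch below enumerates ordered (server, client) pairs from a host list. It assumes that each ordered pair of distinct nodes is tested once, which is consistent with the <server_host_name>_<client_host_name> output folder naming but is not confirmed here; the function name ordered_pairs is hypothetical and is not part of cloud_run.py.

```python
from itertools import permutations


def ordered_pairs(hosts):
    """Return every ordered (server, client) pair of distinct hosts."""
    return list(permutations(hosts, 2))


# Host names taken from the host file example in this document.
hosts = ["kuku-kvm12-lake:22", "kuku-kvm13-lake:22", "kuku-kvm14-lake:22"]
pairs = ordered_pairs(hosts)

# For n hosts there are n * (n - 1) ordered pairs.
print(len(pairs))
for server, client in pairs:
    print(server, "->", client)
```

Under this assumption, the number of pairs grows quadratically with the node count, which is worth keeping in mind when choosing the iteration count for large clusters.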

The following port connectivity options are supported for testing on multiple nodes:

[Figure: connectivity topology]

Prerequisites

The following prerequisites are required to run the cloud_run.py Python wrapper script:

  • Install the requirements file:

    pip install -r requirements.txt
    
  • Make sure all tested nodes have SSH keys configured for seamless access. The script relies on SSH sessions established via SSH keys.

    Note

    Starting from the 1.18.0 release, SSH host keys have been removed from the Docker images. To add them, run /usr/bin/ssh-keygen -A inside the Docker container. If you are running on Kubernetes, make sure the SSH host keys are identical across all Docker containers. To achieve this, either build a new Docker image on top of the Intel Gaudi Docker image by adding a new layer, RUN /usr/bin/ssh-keygen -A, or mount the SSH host keys externally.

  • Verify that all external NIC ports are active and accessible on each node. For more details, see Disable/Enable NICs.

  • Prepare a host file listing all tested nodes with their SSH connection details (IP and port). Use the following format: SSH-IP:SSH-PORT. For example:

    kuku-kvm12-lake:22
    kuku-kvm13-lake:22
    kuku-kvm14-lake:22
    kuku-kvm15-lake:22
    
  • Configure the config.sh script for all tested nodes as follows:

    1. Write the LD_LIBRARY_PATH environment variable to a file. This ensures it can be accessed in remote SSH sessions during testing:

    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" | tee ~/.ENV_SCALEUP
    
    2. Create the gaudinet.json file and configure it as described in Configuring gaudinet.json.

    Once done, set the file path by running the following:

    echo "GAUDINET_PATH=<PATH/TO/gaudinet.json>" | tee -a ~/.ENV_SCALEUP
    
    3. (ping-pong and write_bw tests only) Create the server_internal_connectivity.csv file and configure it as described in Internal Ports Testing.

    Once done, set the file path by running the following:

    echo "SERVER_INTERNAL_CONNECTIVITY_PATH=<PATH/TO/server_internal_connectivity.csv>" | tee -a ~/.ENV_SCALEUP
    
    4. Apply the script on all tested nodes:

    bash ./config.sh
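
Because a malformed host file is an easy mistake to make, it can help to check the SSH-IP:SSH-PORT format described above before starting a run. The following is a minimal sketch of such a check, not the parser used by cloud_run.py; the function name parse_host_file and the decision to skip blank lines are assumptions.

```python
def parse_host_file(text):
    """Parse lines of the form SSH-IP:SSH-PORT into (host, port) tuples."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split on the last colon so the port is always the final field.
        host, sep, port = line.rpartition(":")
        if not sep or not host or not port.isdecimal():
            raise ValueError(f"line {lineno}: expected SSH-IP:SSH-PORT, got {line!r}")
        port_num = int(port)
        if not 1 <= port_num <= 65535:
            raise ValueError(f"line {lineno}: port {port_num} is out of range")
        entries.append((host, port_num))
    return entries


sample = "kuku-kvm12-lake:22\nkuku-kvm13-lake:22\n"
print(parse_host_file(sample))
```

A line without a port, or with a non-numeric port, raises ValueError with the offending line number.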
    

Configuring gaudinet.json

The gaudinet.json file configures network settings for Layer 3 (L3) routes. For each NIC, it lists the NIC MAC address, IP address, subnet mask, and associated gateway MAC address in the following format:

{
   "NIC_NET_CONFIG": [
      {
         "NIC_MAC": "00:1A:2B:3C:4D:5E",
         "NIC_IP": "192.168.1.10",
         "SUBNET_MASK": "255.255.255.0",
         "GATEWAY_MAC": "00:1A:2B:3C:4D:5F"
      }
   ]
}

To obtain the required data, refer to the instructions in the Intel Gaudi Network Configuration guide.

Each object inside the NIC_NET_CONFIG array corresponds to the configuration of a single NIC. The following table describes each field used in gaudinet.json:

| Field | Type | Description | Format Example |
|---|---|---|---|
| NIC_MAC | String | NIC MAC address. This field is required and must follow the standard MAC address format. | 00:1A:2B:3C:4D:5E |
| NIC_IP | String | IP address assigned to the NIC. Must be in a valid IPv4 or IPv6 format. | 192.168.1.10 |
| SUBNET_MASK | String | Subnet mask defining the network's address range. | 255.255.255.0 |
| GATEWAY_MAC | String | MAC address of the gateway through which the NIC routes its traffic. This field must follow the standard MAC address format. | 00:1A:2B:3C:4D:5F |
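
The field rules above can be checked mechanically before the file is deployed. The sketch below is one possible validation, not a tool shipped with the package: the function name check_gaudinet is hypothetical, and it assumes colon-separated MAC addresses, string values, and an IPv4 dotted-decimal subnet mask as in the example.

```python
import ipaddress
import json
import re

# Six colon-separated hexadecimal octets, as in the format examples above.
MAC_RE = re.compile(r"^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def check_gaudinet(config):
    """Return a list of problems found in a parsed gaudinet.json dict."""
    nics = config.get("NIC_NET_CONFIG")
    if not isinstance(nics, list):
        return ["NIC_NET_CONFIG must be a list"]
    problems = []
    for i, nic in enumerate(nics):
        for key in ("NIC_MAC", "GATEWAY_MAC"):
            if not MAC_RE.match(nic.get(key, "")):
                problems.append(f"entry {i}: {key} is not a valid MAC address")
        try:
            ipaddress.ip_address(nic.get("NIC_IP", ""))  # accepts IPv4 and IPv6
        except ValueError:
            problems.append(f"entry {i}: NIC_IP is not a valid IP address")
        try:
            # Parsing the mask against 0.0.0.0 rejects non-contiguous masks.
            ipaddress.ip_network("0.0.0.0/" + nic.get("SUBNET_MASK", ""))
        except ValueError:
            problems.append(f"entry {i}: SUBNET_MASK is not a valid IPv4 mask")
    return problems


sample_text = """
{
   "NIC_NET_CONFIG": [
      {
         "NIC_MAC": "00:1A:2B:3C:4D:5E",
         "NIC_IP": "192.168.1.10",
         "SUBNET_MASK": "255.255.255.0",
         "GATEWAY_MAC": "00:1A:2B:3C:4D:5F"
      }
   ]
}
"""
problems = check_gaudinet(json.loads(sample_text))
print(problems or "no problems found")
```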

Internal Ports Testing

Note

The internal ports testing is supported for ping-pong and write_bw tests only.

To perform testing on the internal ports, follow the steps below:

  1. Review the example configurations of the server_internal_connectivity.csv file in the internal_data folder. The examples show internal NIC connectivity maps. Check whether any of them matches your server setup. If one does, use it for your setup.

  2. If none of the examples match your configuration, create a new internal NIC port connectivity map in CSV format using the following template:

    <source-device-module-id>,<source-port-number>,<destination-device-module-id>,<destination-port-number>
    

    Note

    In the CSV file, every module ID present on the server must appear in the source-device-module-id column. For example, if module OAM_i is connected to module OAM_j, the CSV should include the following:

    i,0,j,1
    j,1,i,0
    
  3. Add the --internal switch after perftest in the tool command line as shown in Test-specific Options.
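
A hand-written connectivity map is easy to get wrong, so the rule in the note above can be checked with a short script. The sketch below assumes integer module IDs and port numbers, no header row, and that every link is listed in both directions as in the i/j example; the function name check_connectivity is hypothetical and the reverse-entry check is an inference from that example rather than a documented requirement.

```python
import csv
import io


def check_connectivity(csv_text, module_ids):
    """Return a list of problems found in an internal connectivity CSV."""
    rows = [
        tuple(int(field) for field in row)
        for row in csv.reader(io.StringIO(csv_text))
        if row  # skip empty lines
    ]
    links = set(rows)
    problems = []
    # Each link is expected to appear again with source and destination swapped.
    for src_mod, src_port, dst_mod, dst_port in rows:
        if (dst_mod, dst_port, src_mod, src_port) not in links:
            problems.append(
                f"missing reverse entry for {src_mod},{src_port},{dst_mod},{dst_port}"
            )
    # Every module ID on the server must appear in the source column.
    sources = {row[0] for row in rows}
    for module in sorted(set(module_ids) - sources):
        problems.append(f"module {module} never appears as a source")
    return problems


good = "0,0,1,1\n1,1,0,0\n"
print(check_connectivity(good, module_ids=[0, 1]))

incomplete = "0,0,1,1\n"
print(check_connectivity(incomplete, module_ids=[0, 1]))
```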

Python Wrapper Options

Use the -h argument to view all options. The table below describes all the cloud_run.py options available.

| Option | Description |
|---|---|
| -h, --help | Show the help message and exit |
| -hf, --host_file | Path to a host file that lists the host IPs |
| -skt, --ssh_key_type | SSH key type |
| -skf, --ssh_key_file | SSH private key file path (default: /home/username/.ssh/id_rsa) |
| -o, --output | Save all the log files in a specific path (the flag must be set) |

PerfTest Options

Use the -h argument to view all options. The table below describes all the perftest options available.

| Option | Description |
|---|---|
| -h, --help | Show the help message and exit |
| -tp, --tcp_port | Specify the TCP port range the script will use (default: 1100) |
| -int, --internal | Enable internal NIC ports testing. Only supported with ping-pong and write_bw tests. |
| -dis_ext, --disable_external | Disable external NIC ports testing. Only supported with ping-pong and write_bw tests and when the --internal flag is set. |

Test-specific Options

Use the -h argument to view all options. The tables below describe all the testing options available.

ping_pong test options:

| Option | Description |
|---|---|
| -h, --help | Show the help message and exit |
| -s, --size | Size of message to exchange (default: 4096) |
| -r, --rx_depth | Number of receives to post at a time (default: 128) |
| -n, --iters | Number of exchanges (default: 10) |
| -c, --chk | Validate received buffer |

Example:

python3 ./cloud_run.py --host_file ./hostfile --output /tmp/output perftest --internal --tcp_port 1100 ping_pong --size 4096 --rx_depth 128 --iters 10 --chk

write_bw test options:

| Option | Description |
|---|---|
| -h, --help | Show the help message and exit |
| -s, --size | Size of message to exchange (default: 4096) |
| -r, --rx_depth | Number of receives to post at a time (default: 128) |
| -n, --iters | Number of exchanges (default: 10) |
| -c, --criteria | Pass/fail criteria value for the test threshold in Gbps (default: not used) |

Example:

python3 ./cloud_run.py --host_file ./hostfile --output /tmp/output perftest --internal --tcp_port 1100 write_bw --size 1048575 --rx_depth 128 --iters 100000

write_lat test options:

| Option | Description |
|---|---|
| -h, --help | Show the help message and exit |
| -s, --size | Size of message to exchange (default: 1024) |
| -r, --rx_depth | Number of receives to post at a time (default: 128) |
| -n, --iters | Number of exchanges (default: 500000) |
| -c, --criteria | Pass/fail criteria value for the test threshold in ms (default: not used) |

Example:

python3 ./cloud_run.py --host_file ./hostfile --output /tmp/output perftest --tcp_port 1100 write_lat --size 1024 --rx_depth 128 --iters 100000
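
The examples above all follow the same argument order: wrapper options first, then the perftest keyword with its options, then the test name with its test-specific options. When runs are scripted, that order can be kept in one place. The helper below is a hypothetical sketch (build_command is not part of the tool) that only assembles the argument list using flags documented in the tables above.

```python
import shlex


def build_command(host_file, output_dir, test, test_args, internal=False, tcp_port=1100):
    """Assemble a cloud_run.py argument list in wrapper / perftest / test order."""
    # 1. Wrapper options.
    cmd = ["python3", "./cloud_run.py", "--host_file", host_file, "--output", output_dir]
    # 2. The perftest keyword followed by its options.
    cmd.append("perftest")
    if internal:
        cmd.append("--internal")
    cmd += ["--tcp_port", str(tcp_port)]
    # 3. The test name followed by its test-specific options.
    cmd.append(test)
    for name, value in test_args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_command(
    "./hostfile",
    "/tmp/output",
    "write_lat",
    {"size": 1024, "rx_depth": 128, "iters": 100000},
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as-is; printing it with shlex.join first makes it easy to compare against the documented examples.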

Expected output:

* CloudReport_<timestamp>.txt - Summary of all tested nodes.
* <server_host_name>_<client_host_name>
  ├── scaleUpRepor_<timestamp>.txt - Summary for the specific server and client pair.
  └── perftest
    └── <network_ip>
      └── <server - device (ib_dev)>
        └── <device_port (ib_port)>
          └── <client - device (ib_dev)>
            └── <device_port (ib_port)>.txt - Application output from both devices.

The output is saved in a timestamp-named folder inside the output directory.
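
For large runs, it can be convenient to collect the per-port log files programmatically. The sketch below walks a run folder laid out as shown above and groups the .txt files under each pair's perftest directory. It is an illustration only: list_pair_reports is a hypothetical helper, and the demo tree uses made-up host, IP, and device names.

```python
import tempfile
from pathlib import Path


def list_pair_reports(run_dir):
    """Map each <server>_<client> folder name to the .txt files under its perftest/ directory."""
    reports = {}
    for pair_dir in sorted(p for p in Path(run_dir).iterdir() if p.is_dir()):
        perftest_dir = pair_dir / "perftest"
        if perftest_dir.is_dir():
            reports[pair_dir.name] = sorted(perftest_dir.rglob("*.txt"))
    return reports


# Demo on a throwaway tree with hypothetical host, IP, and device names.
with tempfile.TemporaryDirectory() as tmp:
    leaf = Path(tmp) / "hostA_hostB" / "perftest" / "10.0.0.1" / "dev0" / "1" / "dev1"
    leaf.mkdir(parents=True)
    (leaf / "1.txt").write_text("sample output\n")
    found = list_pair_reports(tmp)
    summary = {pair: [f.name for f in files] for pair, files in found.items()}

print(summary)
```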