Scale-Out via Host-NIC

Multi-node scale-out for Gaudi accelerator devices is supported over host NIC interfaces. This functionality is provided by the HCCL library. Data transfer between nodes is supported in two modes:

Scale-Out via Host-NIC over TCP

In this mode, HCCL creates TCP socket connections between host NIC interfaces. HCCL can use multiple TCP connections between communicating nodes to maximize bandwidth utilization.

Configuration Knobs

HCCL supports several environment variables that control the behavior of the TCP connections used for scale-out communication over host NICs. The table below lists these environment variables.

| Environment Variable | Description |
|---|---|
| HCCL_OVER_TCP | Enables scale-out communication over TCP. Possible values are 0 (disable) or 1 (enable). Default value is 0. |
| HCCL_SOCKET_IFNAME | Identifies the network interface(s) to use for scale-out communication. |
| HCCL_DEFAULT_NIC_COUNT | Limits the number of network interfaces used by HCCL. HCCL selects the first N of the network interfaces detected based on the HCCL_SOCKET_IFNAME setting. Default value is 4. |
| HCCL_COMM_ID | Identifies the root process (rank 0) of the global communicator group. Typically set to <IPaddress:port>, where the IP address is that of the network interface used by the root. This must be set for all HCCL processes when there is no alternate network over which to broadcast it. |
| HCCL_SOCKET_NTHREADS | Specifies the total number of CPU threads (per HCCL process) created to handle data communication over TCP sockets. Default value is 2. |
| HCCL_NSOCKS_PERTHREAD | Specifies the number of TCP socket connections served by a single CPU thread. Default value is 3. |
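As a rough illustration of how the thread and socket knobs combine, the sketch below reads HCCL_SOCKET_NTHREADS and HCCL_NSOCKS_PERTHREAD from an environment mapping, falls back to the documented defaults, and multiplies the two. It assumes that the total number of TCP sockets is simply threads times sockets per thread. The descriptions above suggest this but do not state it, and they do not say whether the total applies per process or per peer connection. The helper name `tcp_socket_budget` is hypothetical and not part of HCCL.

```python
import os

# Defaults as documented for the two variables.
DEFAULT_NTHREADS = 2
DEFAULT_NSOCKS_PERTHREAD = 3


def tcp_socket_budget(env=None):
    """Return (threads, sockets_per_thread, total_sockets).

    Assumption (not confirmed by the HCCL documentation):
    total sockets = threads * sockets per thread.
    """
    env = os.environ if env is None else env
    nthreads = int(env.get("HCCL_SOCKET_NTHREADS", DEFAULT_NTHREADS))
    nsocks = int(env.get("HCCL_NSOCKS_PERTHREAD", DEFAULT_NSOCKS_PERTHREAD))
    if nthreads < 1 or nsocks < 1:
        raise ValueError("both values must be positive integers")
    return nthreads, nsocks, nthreads * nsocks


# With neither variable set, the documented defaults apply.
print(tcp_socket_budget({}))  # (2, 3, 6)
```

Passing an explicit mapping instead of reading `os.environ` directly makes it easy to check a planned configuration before launching anything.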

Usage

To use scale-out communication over host NICs, at least 2 nodes are required in the global communicator. One HCCL process runs for every Gaudi accelerator device, so a setup with 2 nodes and 8 Gaudi devices on each node has a total of 16 HCCL processes (ranks), 8 on each node. The recommended method is to launch each process with the appropriate environment settings, because some variables, such as HCCL_SOCKET_IFNAME, can take different values for different processes.
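The rank arithmetic in this paragraph can be sketched as follows. The function name `rank_layout` is hypothetical, and the node-by-node numbering of global ranks is an assumption made for illustration; the text above only fixes the totals (one process per device, at least 2 nodes).

```python
def rank_layout(num_nodes, devices_per_node):
    """Map each global rank to (node_id, local_device_index).

    Assumption: ranks are numbered contiguously, node by node.
    """
    if num_nodes < 2:
        raise ValueError("scale-out over host NICs needs at least 2 nodes")
    if devices_per_node < 1:
        raise ValueError("each node needs at least one device")
    return {
        node * devices_per_node + dev: (node, dev)
        for node in range(num_nodes)
        for dev in range(devices_per_node)
    }


layout = rank_layout(num_nodes=2, devices_per_node=8)
print(len(layout))  # 16 HCCL processes in total
print(layout[8])    # (1, 0): first device on the second node
```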

Example

Run all-reduce on 2 nodes with 8 ranks each in HCCL_OVER_TCP mode (the command shown is for the node with ID 1):

$ HCCL_SOCKET_IFNAME=eth1,eth2,eth3,eth4 HCCL_DEFAULT_NIC_COUNT=4 \
  HCCL_COMM_ID=10.111.14.155:9696 HCCL_SOCKET_NTHREADS=2 HCCL_NSOCKS_PERTHREAD=3 \
  HCCL_OVER_TCP=1 python3.6 run_hccl_demo.py -nranks 16 -node_id 1 -test all_reduce
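When the same command has to be issued on several nodes, it can help to generate the environment and argument list programmatically. The sketch below rebuilds the example command for a given node. The helper `build_launch` is hypothetical, the command-line flags are copied verbatim from the example above, and setting HCCL_DEFAULT_NIC_COUNT to the number of listed interfaces is a choice made for this sketch rather than an HCCL requirement.

```python
def build_launch(node_id, nranks, comm_id, ifnames, test="all_reduce"):
    """Return (env, cmd) reproducing the example TCP-mode launch."""
    env = {
        "HCCL_SOCKET_IFNAME": ",".join(ifnames),
        # Sketch choice: use every interface that was listed.
        "HCCL_DEFAULT_NIC_COUNT": str(len(ifnames)),
        "HCCL_COMM_ID": comm_id,
        "HCCL_SOCKET_NTHREADS": "2",
        "HCCL_NSOCKS_PERTHREAD": "3",
        "HCCL_OVER_TCP": "1",
    }
    cmd = [
        "python3.6", "run_hccl_demo.py",
        "-nranks", str(nranks),
        "-node_id", str(node_id),
        "-test", test,
    ]
    return env, cmd


env, cmd = build_launch(
    node_id=1,
    nranks=16,
    comm_id="10.111.14.155:9696",
    ifnames=["eth1", "eth2", "eth3", "eth4"],
)
# Print the equivalent shell command line for inspection.
print(" ".join(f"{k}={v}" for k, v in env.items()), " ".join(cmd))
```

To actually launch the process, the pair could be passed to `subprocess.run(cmd, env={**os.environ, **env}, check=True)` on the target node.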

Scale-Out via Host-NIC over OFI

In this mode, HCCL interfaces with libFabric (OFI), which lets it use whatever underlying hardware and networking mode is available.

Configuration Knobs

HCCL supports several environment variables that control the behavior of scale-out communication over libFabric. The table below lists these environment variables.

| Environment Variable | Description |
|---|---|
| HCCL_OVER_OFI | Enables scale-out communication over OFI libFabric. Possible values are 0 (disable) or 1 (enable). Default value is 0. |
| HCCL_SOCKET_IFNAME | Identifies the network interface(s) to use for scale-out communication. |
| HCCL_COMM_ID | Identifies the root process (rank 0) of the global communicator group. Typically set to <IPaddress:port>, where the IP address is that of the network interface used by the root. This must be set for all HCCL processes when there is no alternate network over which to broadcast it. |
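Because every process must receive the same `<IPaddress:port>` value, a malformed setting is worth catching before launch. The following is a minimal sketch of such a check, assuming an IPv4 address as in the example value 10.111.14.155:9696. The function `parse_comm_id` is hypothetical and does not reflect how HCCL itself parses the variable.

```python
import ipaddress


def parse_comm_id(value):
    """Split an '<IPaddress:port>' string into (ip, port), validating both."""
    host, sep, port = value.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected <IPaddress:port>, got {value!r}")
    ipaddress.ip_address(host)  # raises ValueError if not a valid IP address
    port_num = int(port)        # raises ValueError if not an integer
    if not 0 < port_num < 65536:
        raise ValueError(f"port out of range: {port_num}")
    return host, port_num


print(parse_comm_id("10.111.14.155:9696"))  # ('10.111.14.155', 9696)
```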

Usage

  1. Download HCCL OFI Wrapper.

  2. Build and install libFabric.

  3. Build HCCL OFI Wrapper.

Note

If you are using Habana Containers, step 2 is not required because libFabric is already installed by default.

  4. To verify the HCCL OFI Wrapper build, run your test with the environment variable HCCL_OVER_OFI=1 set.

Note

Make sure to disable the TCP flag by setting the environment variable HCCL_OVER_TCP=0.
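A launcher script can guard against leaving both flags enabled. The sketch below inspects an environment mapping and reports which host-NIC mode is selected. The helper `select_transport` is hypothetical, and rejecting the case where both flags are 1 is a design choice of this sketch: the text above does not describe how HCCL behaves in that situation.

```python
import os


def select_transport(env=None):
    """Return 'ofi', 'tcp', or 'none' based on the HCCL_OVER_* flags."""
    env = os.environ if env is None else env
    # Both flags default to 0 (disabled) when unset.
    tcp = env.get("HCCL_OVER_TCP", "0") == "1"
    ofi = env.get("HCCL_OVER_OFI", "0") == "1"
    if tcp and ofi:
        raise ValueError(
            "HCCL_OVER_TCP and HCCL_OVER_OFI are both 1; "
            "set HCCL_OVER_TCP=0 to use OFI"
        )
    if ofi:
        return "ofi"
    if tcp:
        return "tcp"
    return "none"  # neither host-NIC mode is enabled


print(select_transport({"HCCL_OVER_OFI": "1", "HCCL_OVER_TCP": "0"}))  # ofi
```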