Scale-Out via Host-NIC
Gaudi accelerator devices support multi-node scale-out through host NIC interfaces. This functionality is provided by the HCCL library. Data transfer between nodes is supported in two modes: over TCP and over OFI (libFabric).
Scale-Out via Host-NIC over TCP
In this mode, HCCL creates TCP socket connections between host NIC interfaces. HCCL can use multiple TCP connections between communicating nodes to maximize bandwidth utilization.
Configuration Knobs
HCCL exposes several environment variables that control the TCP connections used for scale-out communication over host NICs. The table below lists them.
Environment Variable | Description |
---|---|
HCCL_OVER_TCP | Enables scale-out communication over TCP. Possible values are 0 (disable) or 1 (enable). Default value is 0. |
HCCL_SOCKET_IFNAME | Identifies the network interface(s) to use for scale-out communication. |
HCCL_DEFAULT_NIC_COUNT | Limits the number of network interfaces used by HCCL. HCCL selects the first N of the interfaces detected from the HCCL_SOCKET_IFNAME setting. Default value is 4. |
HCCL_COMM_ID | Identifies the root process (rank 0) of the global communicator group. Typically set to <IPaddress:port>, where the IP address is that of the network interface used by the root. Must be set for all HCCL processes when no alternate network is available to broadcast it. |
HCCL_SOCKET_NTHREADS | Specifies the total number of CPU threads (per HCCL process) created to handle data communication over TCP sockets. Default value is 2. |
HCCL_NSOCKS_PERTHREAD | Specifies the number of TCP socket connections served by a single CPU thread. Default value is 3. |
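
How these knobs combine can be sketched in a few lines of Python. This is an illustration under stated assumptions, not HCCL's implementation. The helper names `select_nics` and `sockets_per_process` are hypothetical. The sketch treats HCCL_SOCKET_IFNAME as a plain comma-separated list (as in the example in this section) and assumes that the socket count per process is simply the thread count multiplied by the sockets per thread.

```python
from typing import List


def select_nics(socket_ifname: str, nic_count: int = 4) -> List[str]:
    """Hypothetical helper: model the 'first N' rule of HCCL_DEFAULT_NIC_COUNT.

    Assumes HCCL_SOCKET_IFNAME is a plain comma-separated list of names.
    """
    names = [name.strip() for name in socket_ifname.split(",") if name.strip()]
    return names[:nic_count]


def sockets_per_process(nthreads: int = 2, nsocks_per_thread: int = 3) -> int:
    """Assumed number of TCP sockets served by one HCCL process.

    Defaults mirror HCCL_SOCKET_NTHREADS=2 and HCCL_NSOCKS_PERTHREAD=3.
    """
    return nthreads * nsocks_per_thread


if __name__ == "__main__":
    # Five interfaces listed, default NIC count of 4: the fifth is dropped.
    print(select_nics("eth1,eth2,eth3,eth4,eth5"))
    # With the default values, one process would serve 2 * 3 sockets.
    print(sockets_per_process())
```

Under these assumptions, raising either HCCL_SOCKET_NTHREADS or HCCL_NSOCKS_PERTHREAD increases the number of sockets, at the cost of more CPU threads or more sockets per thread.
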
Usage
Scale-out communication over host NICs requires at least 2 nodes in the global communicator.
One HCCL process runs for every Gaudi accelerator device. In a setup with 2 nodes and 8 Gaudi devices on each node,
there are 16 HCCL processes (ranks) in total, 8 on each node. The recommended method is to launch each process
with the appropriate environment settings; some variables, such as HCCL_SOCKET_IFNAME,
can take different values for different processes.
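
The process layout described above can be enumerated with a short sketch. The `Rank` type and `enumerate_ranks` function are hypothetical names introduced for illustration, and the contiguous rank numbering within a node is an assumption, not something this section specifies.

```python
from typing import List, NamedTuple


class Rank(NamedTuple):
    node_id: int       # index of the node hosting the process
    local_device: int  # index of the Gaudi device on that node
    global_rank: int   # rank in the global communicator (assumed numbering)


def enumerate_ranks(num_nodes: int, devices_per_node: int) -> List[Rank]:
    """List one HCCL process per Gaudi device across all nodes."""
    if num_nodes < 2:
        raise ValueError("scale-out over host NICs requires at least 2 nodes")
    ranks = []
    for node_id in range(num_nodes):
        for local_device in range(devices_per_node):
            # Assumption: ranks are numbered contiguously within each node.
            global_rank = node_id * devices_per_node + local_device
            ranks.append(Rank(node_id, local_device, global_rank))
    return ranks


if __name__ == "__main__":
    layout = enumerate_ranks(num_nodes=2, devices_per_node=8)
    print(len(layout))  # 16 processes in total, 8 per node
```

A launcher could iterate over such a list and start each process with its own environment, which is where per-process values such as HCCL_SOCKET_IFNAME would be set.
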
Example
Run all-reduce on 2 nodes with 8 ranks each in HCCL_OVER_TCP mode:
$ HCCL_SOCKET_IFNAME=eth1,eth2,eth3,eth4 HCCL_DEFAULT_NIC_COUNT=4 \
  HCCL_COMM_ID=10.111.14.155:9696 HCCL_SOCKET_NTHREADS=2 HCCL_NSOCKS_PERTHREAD=3 \
  HCCL_OVER_TCP=1 python3.6 run_hccl_demo.py -nranks 16 -node_id 1 -test all_reduce
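
When the same command has to be issued on every node, it can help to generate it rather than type it. The sketch below only builds the environment and argument list from the example above and prints the equivalent shell line; it does not run anything. The function names are hypothetical, and numbering nodes from 0 is an assumption (the example itself only shows `-node_id 1`).

```python
import shlex
from typing import Dict, List, Tuple


def build_demo_invocation(node_id: int,
                          comm_id: str = "10.111.14.155:9696",
                          nranks: int = 16) -> Tuple[Dict[str, str], List[str]]:
    """Return (env, argv) mirroring the all-reduce example for one node."""
    env = {
        "HCCL_SOCKET_IFNAME": "eth1,eth2,eth3,eth4",
        "HCCL_DEFAULT_NIC_COUNT": "4",
        "HCCL_COMM_ID": comm_id,  # same value on every node
        "HCCL_SOCKET_NTHREADS": "2",
        "HCCL_NSOCKS_PERTHREAD": "3",
        "HCCL_OVER_TCP": "1",
    }
    argv = ["python3.6", "run_hccl_demo.py",
            "-nranks", str(nranks),
            "-node_id", str(node_id),
            "-test", "all_reduce"]
    return env, argv


def as_shell_line(env: Dict[str, str], argv: List[str]) -> str:
    """Render env and argv as a single shell command line."""
    assignments = ["{}={}".format(k, shlex.quote(v)) for k, v in env.items()]
    return " ".join(assignments + [shlex.quote(a) for a in argv])


if __name__ == "__main__":
    # Assumption: the two nodes are identified as 0 and 1.
    for node_id in (0, 1):
        env, argv = build_demo_invocation(node_id)
        print(as_shell_line(env, argv))
```

To actually launch the demo from Python, the pair could be passed to `subprocess.run(argv, env={**os.environ, **env})` on each node.
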
Scale-Out via Host-NIC over OFI
In this mode, HCCL uses libFabric, the OpenFabrics Interfaces (OFI) library, to make use of the underlying hardware and networking mode.
Configuration Knobs
HCCL exposes several environment variables that control scale-out communication over libFabric. The table below lists them.
Environment Variable | Description |
---|---|
HCCL_OVER_OFI | Enables scale-out communication over OFI libFabric. Possible values are 0 (disable) or 1 (enable). Default value is 0. |
HCCL_SOCKET_IFNAME | Identifies the network interface(s) to use for scale-out communication. |
HCCL_COMM_ID | Identifies the root process (rank 0) of the global communicator group. Typically set to <IPaddress:port>, where the IP address is that of the network interface used by the root. Must be set for all HCCL processes when no alternate network is available to broadcast it. |
Usage
1. Download HCCL OFI Wrapper.
2. Build and install libFabric.
3. Build HCCL OFI Wrapper.
Note
If you are using Habana containers, step 2 is not required because libFabric is installed by default.
To check the HCCL OFI Wrapper build, run your test with the environment variable HCCL_OVER_OFI=1 set.
Note
Make sure to disable the TCP flag by setting the environment variable HCCL_OVER_TCP=0.
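
A launcher script can guard against enabling both modes at once. The following is a minimal sketch with hypothetical helper names; rejecting the combination with an error is a choice made here for illustration, since this section does not state how HCCL itself behaves when both flags are set to 1.

```python
from typing import Dict, Mapping


def scaleout_mode(env: Mapping[str, str]) -> str:
    """Classify the host-NIC mode requested by an environment mapping."""
    tcp = env.get("HCCL_OVER_TCP", "0") == "1"  # documented default is 0
    ofi = env.get("HCCL_OVER_OFI", "0") == "1"  # documented default is 0
    if tcp and ofi:
        # Choice of this sketch: treat the combination as a setup error.
        raise ValueError("set HCCL_OVER_TCP=0 when HCCL_OVER_OFI=1")
    if ofi:
        return "ofi"
    if tcp:
        return "tcp"
    return "none"  # neither host-NIC mode requested


def ofi_env(base: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of `base` with OFI enabled and the TCP flag disabled."""
    env = dict(base)
    env["HCCL_OVER_OFI"] = "1"
    env["HCCL_OVER_TCP"] = "0"
    return env


if __name__ == "__main__":
    # Start from a TCP configuration and switch it to OFI.
    print(scaleout_mode(ofi_env({"HCCL_OVER_TCP": "1"})))
```

Calling `ofi_env(os.environ)` before starting a test process gives an environment that satisfies both notes above.
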