Scale-out via Host NIC

Intel® Gaudi® AI accelerators support multi-server scale-out through Host NIC interfaces. The HCCL library provides this functionality, which is activated either by an internal auto-detection mechanism or by explicit user selection. Data transfer between nodes uses the Scale-out via Host NIC over OFI mode.

Scale-out Auto-detection

By default, HCCL runs internal auto-detection logic to select the most efficient available method for scaling between nodes. Modes are selected in the following priority order:

| Priority | Scale-Out Mode | Conditions |
|---|---|---|
| P0 | Gaudi NICs | Gaudi scale-out NICs are connected. |
| P1 | Host NIC Gaudi Direct (GDR) | verbs or AWS EFA OFI provider is available (EFA: libfabric version 1.16.1 or later; verbs: libfabric version 1.20.0); Linux kernel version 5.12 or later. |
| P2 | Host NICs using host memory | libfabric is available with one of the following providers: “efa”, “verbs” or “tcp”. To use verbs provider, refer to Enabling InfiniBand NICs (Verbs) for Host NIC Scaling. |
| P3 | No scale-out | |

Note

  • Setting the HCCL_OVER_OFI environment variable disables the HCCL auto-detection mechanism and forces P2.

  • Host NIC GDR with verbs provider is supported on Gaudi 3 and Gaudi 2.
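The priority order above can be read as a simple decision chain. The following is a minimal sketch of that reading, not HCCL's actual implementation: `NodeInfo`, `version_tuple`, `gdr_capable`, and `select_mode` are hypothetical names, the two P1 conditions are assumed to be required together, and the verbs requirement is taken literally as libfabric 1.20.0.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


def version_tuple(text: str) -> Tuple[int, ...]:
    """Turn the leading dotted number of a version string into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


@dataclass
class NodeInfo:
    """Hypothetical snapshot of the facts the priority list depends on."""
    gaudi_nics_connected: bool = False
    provider: Optional[str] = None           # "efa", "verbs", "tcp", or None
    libfabric_version: Optional[str] = None  # None when libfabric is absent
    kernel_version: str = "0"
    hccl_over_ofi_set: bool = False          # models the HCCL_OVER_OFI variable


def gdr_capable(info: NodeInfo) -> bool:
    """Check the P1 conditions (assumed here to be required together)."""
    if info.libfabric_version is None:
        return False
    libfabric = version_tuple(info.libfabric_version)
    if info.provider == "efa":
        provider_ok = libfabric >= (1, 16, 1)
    elif info.provider == "verbs":
        provider_ok = libfabric == (1, 20, 0)  # listed as exactly 1.20.0
    else:
        provider_ok = False
    return provider_ok and version_tuple(info.kernel_version) >= (5, 12)


def select_mode(info: NodeInfo) -> str:
    """Return the priority label that the list above would pick."""
    if info.hccl_over_ofi_set:
        return "P2"  # auto-detection disabled, host-memory mode forced
    if info.gaudi_nics_connected:
        return "P0"
    if gdr_capable(info):
        return "P1"
    if info.libfabric_version is not None and info.provider in ("efa", "verbs", "tcp"):
        return "P2"
    return "P3"
```

For instance, a node with no connected Gaudi scale-out NICs, the tcp provider, and libfabric installed falls through to P2 in this sketch, because tcp is not listed as a GDR provider.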

Host NIC Gaudi Direct Setup

  • To enable Host NIC Gaudi Direct on AWS EFA, set the environment variables RDMAV_FORK_SAFE=1 and FI_EFA_USE_DEVICE_RDMA=1.

  • To enable Host NIC Gaudi Direct with the verbs provider:

    • Set the environment variables RDMAV_FORK_SAFE=1 and MLX5_SCATTER_TO_CQE=0.

    • Disable PCIe Access Control Services (ACS).

    • Use libfabric version 1.20.0.

    • Build libfabric using the --with-synapseai configuration option.
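When a training job is started from a Python launcher, the environment variables listed above can be collected in one place and passed to the child process. The sketch below assumes such a launcher exists; `GDR_ENV` and `gdr_environment` are hypothetical names, and only the variable names and values come from the list above. The other verbs requirements (ACS, libfabric version and build option) are not covered by it.

```python
import os
from typing import Dict, Mapping, Optional

# Variables named in the setup list above, grouped by OFI provider.
GDR_ENV = {
    "efa": {"RDMAV_FORK_SAFE": "1", "FI_EFA_USE_DEVICE_RDMA": "1"},
    "verbs": {"RDMAV_FORK_SAFE": "1", "MLX5_SCATTER_TO_CQE": "0"},
}


def gdr_environment(provider: str,
                    base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of `base` (default: os.environ) with the GDR variables set."""
    if provider not in GDR_ENV:
        raise ValueError(f"no Gaudi Direct settings listed for provider {provider!r}")
    env = dict(os.environ if base is None else base)
    env.update(GDR_ENV[provider])
    return env
```

The returned dictionary can be handed to `subprocess.run(command, env=...)`, so the variables are in place before the launched process initializes its communication libraries.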

Scale-out via Host NIC over OFI

HCCL uses libfabric to access the underlying hardware and networking mode.

Configuration Knobs

HCCL provides several environment variables that control the behavior of scale-out communication over libfabric. The table below lists the available environment variables.

| Environment Variable | Description |
|---|---|
| HCCL_SOCKET_IFNAME | Identifies the network interface(s) to use for scale-out communication. |
| HCCL_COMM_ID | Identifies the root process (rank 0) of the global communicator group. Typically set to <IPaddress:port>, where the IP address is that of the network interface used by the root. This must be set for all HCCL processes when no alternate network is available to broadcast this value. |
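The <IPaddress:port> value described for the root process can be assembled and checked before it is exported. This is a small sketch under stated assumptions: `format_root_endpoint` is a hypothetical helper, and it handles IPv4 addresses only, since the description above does not say how other address families would be written.

```python
import ipaddress


def format_root_endpoint(ip: str, port: int) -> str:
    """Build an 'IPaddress:port' string for the root process (rank 0)."""
    address = ipaddress.ip_address(ip)  # raises ValueError for malformed input
    if address.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"{address}:{port}"
```

The returned string is the value to set, identically on every HCCL process, for the root-process variable described above.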

Using Host NIC over OFI

  1. Download the HCCL OFI Wrapper.

  2. Build and install libfabric.

  3. Build the HCCL OFI Wrapper.

Note

  • See the HCCL OFI Wrapper page for additional instructions.

  • The above steps are not required when running Intel Gaudi containers, because the OFI Wrapper and libfabric are already installed by default.
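Because several modes on this page depend on the libfabric version, it can help to confirm which version is installed after building it. The sketch below relies on the `fi_info` utility that ships with libfabric and assumes that `fi_info --version` prints a line of the form `libfabric: <version>`; both function names are hypothetical, and the output format should be checked against the installed release.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_libfabric_version(output: str) -> Optional[str]:
    """Extract the version from a 'libfabric: X.Y.Z' line, if one is present."""
    match = re.search(r"^libfabric:\s*(\d+(?:\.\d+)*)", output, flags=re.MULTILINE)
    return match.group(1) if match else None


def installed_libfabric_version() -> Optional[str]:
    """Return the libfabric version reported by fi_info, or None if unavailable."""
    fi_info = shutil.which("fi_info")
    if fi_info is None:
        return None  # fi_info is not on PATH
    result = subprocess.run([fi_info, "--version"],
                            capture_output=True, text=True, check=False)
    return parse_libfabric_version(result.stdout)
```

A result of `None` means either that `fi_info` was not found or that its output did not match the assumed format, so it should not be read as proof that libfabric is missing.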