Configure E2E Test in L3 Switching Environment¶
Creating E2E connectivity via L3 switches requires additional network and device configurations beyond those needed for L2 switching. This section describes the additional configuration requirements, how to obtain them, and how to configure the E2E test to utilize them.
A Layer 3 switch combines the functions of a switch and a router. It acts as a switch to connect devices within the same subnet or virtual LAN, and it also has IP routing capabilities. This allows it to support routing protocols, inspect incoming packets, and make routing decisions based on source and destination addresses. Layer 3 switches are commonly used for routing packets between different VLANs.
For example, Cluster07 in Intel Developer Cloud (IDC) has Arista switches configured as Layer 3 switches. This configuration requires assigning an IP address to each Gaudi port and using these addresses for communication between ports. Each Arista port is configured as its own subnet with a netmask of /30 (255.255.255.252). This setup allows for four addresses in total, following the standard layout: a network address, a broadcast address, and two node addresses. The Arista port itself is assigned the highest node address within the subnet. The device connected to the port is expected to use the other available node address. For example, the network 10.210.8.120/30 includes the following:
Network address: 10.210.8.120
Broadcast address: 10.210.8.123
Host IP range: 10.210.8.121 - 10.210.8.122
Arista port: 10.210.8.122
You can use https://www.calculator.net/ip-subnet-calculator.html to perform the calculation.
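The /30 arithmetic can also be reproduced with a few lines of shell. The following is a minimal sketch; the function name subnet30_info is hypothetical (it is not part of the Intel Gaudi software stack), and it assumes an IPv4 address in a /30 subnet only.

```shell
# Hypothetical helper: print the four addresses of the /30 subnet that
# contains the given IPv4 address (an optional "/30" suffix is ignored).
# The body runs in a subshell so its variables do not leak to the caller.
subnet30_info() (
    ip=${1%/*}                      # drop the prefix length, if present
    prefix=${ip%.*}                 # first three octets, e.g. 10.210.8
    last=${ip##*.}                  # last octet, e.g. 122
    base=$(( last - last % 4 ))     # /30 blocks start at multiples of 4
    echo "network=$prefix.$base"
    echo "host_low=$prefix.$(( base + 1 ))"
    echo "host_high=$prefix.$(( base + 2 ))"
    echo "broadcast=$prefix.$(( base + 3 ))"
)

subnet30_info 10.210.8.122/30
```

For 10.210.8.122/30 this reports 10.210.8.120 as the network address, 10.210.8.121 and 10.210.8.122 as the two node addresses, and 10.210.8.123 as the broadcast address.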
Prerequisites¶
Make sure the latest Intel Gaudi software stack is installed, as detailed in the Installation Guide.
Note
If you are not using the latest Intel Gaudi software stack, make sure to install the correct version.
Configuration¶
Obtaining Arista Port Information¶
On the Arista host, load the habanalabs driver and bring up its interfaces:
Load habanalabs drivers:
Unload the drivers in this order - habanalabs, habanalabs_cn, habanalabs_en and habanalabs_ib:
sudo modprobe -r <driver name>
Load the drivers in this order - habanalabs_en and habanalabs_ib, habanalabs_cn, habanalabs:
sudo modprobe <driver name>
Bring up the interfaces:
/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --up
Assuming eth5 is the interface you want to connect through, run:
sudo lldpcli
[lldpcli] # show neighbors ports eth5
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: eth5, via: LLDP, RID: 15, Time: 0 day, 00:00:06
  Chassis:
    ChassisID: mac 94:8e:d3:c8:52:69
    SysName: 2b29u25n.idc9.habana-labs.com
    SysDescr: Arista Networks EOS version 4.26.4M running on an Arista Networks DCS-7060DX4-32
    MgmtIP: 10.210.255.115
    Capability: Bridge, on
    Capability: Router, on
  Port:
    PortID: ifname Ethernet22/7
    PortDescr: no-alert 10.210.8.122/30
    TTL: 120
-------------------------------------------------------------------------------
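The two values needed later (the ChassisID MAC and the address in PortDescr) can be pulled out of this output with a short filter. This is a sketch only: the function name lldp_mac_and_ip is hypothetical, and it assumes the switch advertises its port address as the last word of PortDescr, as in the output above.

```shell
# Hypothetical helper: read `lldpcli show neighbors ports <iface>` output on
# stdin and print "<switch MAC> <switch port address>".
# Fails (non-zero exit status) if either value is missing from the input.
lldp_mac_and_ip() {
    awk '
        $1 == "ChassisID:" && $2 == "mac" { mac = $3 }
        $1 == "PortDescr:"                { ip = $NF }
        END { if (mac != "" && ip != "") print mac, ip; else exit 1 }
    '
}

# Usage on a host where lldpd is running:
#   sudo lldpcli show neighbors ports eth5 | lldp_mac_and_ip
```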
Obtaining Gaudi Port IP Information¶
Note
The numbers used in this section are examples only.
The Arista port is configured to show, in addition to other details, the following information:
MAC address: 94:8e:d3:c8:52:69
Port IP and netmask: 10.210.8.122/30
This information is used to determine the eth5 IP address (10.210.8.121/30, the other node address in the Arista port's subnet) and the destination MAC address (94:8e:d3:c8:52:69). For example, to connect a device with a port named eth5 to the above port/net, use address 10.210.8.121 as follows:
sudo ip addr add 10.210.8.121/30 dev eth5
sudo ifconfig eth5 up
Connectivity with the Arista switch can be verified using ping 10.210.8.122. The ping also populates the ARP table, which provides another way to determine the Arista MAC address to use as the destination MAC address in your configuration.
The address can be viewed using ARP:
arp
Address HWtype HWaddress Flags Mask Iface
10.210.8.122 ether 94:8e:d3:c8:52:69 C eth5
To connect to a peer Gaudi port or an entire subnet, add the appropriate entry to the routing table:
sudo ip route add 10.210.0.0/16 via 10.210.8.122 dev eth5
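Because each Arista port sits in its own /30 subnet, the Gaudi port address follows directly from the address in PortDescr. A minimal sketch of that derivation is shown here; peer_host_30 is a hypothetical name, and the function handles /30 subnets only.

```shell
# Hypothetical helper: given one node address of a /30 subnet (for example
# the Arista port address from PortDescr), print the other node address.
peer_host_30() (
    ip=${1%/*}                      # drop an optional "/30" suffix
    prefix=${ip%.*}                 # first three octets
    last=${ip##*.}                  # last octet
    case $(( last % 4 )) in
        1) echo "$prefix.$(( last + 1 ))" ;;   # low node  -> high node
        2) echo "$prefix.$(( last - 1 ))" ;;   # high node -> low node
        *) echo "not a /30 node address: $1" >&2; exit 1 ;;
    esac
)

peer_host_30 10.210.8.122/30    # prints 10.210.8.121
```

Network and broadcast addresses are rejected with a non-zero exit status, which helps catch a mistyped PortDescr value before it is assigned to an interface.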
Creating IP Connectivity Between the Peer Ports¶
The steps below assume the following setup: eth5 on host 0 and eth6 on host 1 are each connected to an Arista Layer 3 port.
On host 0, perform the following steps:
Load habanalabs drivers:
Unload the drivers in this order - habanalabs, habanalabs_cn, habanalabs_en and habanalabs_ib:
sudo modprobe -r <driver name>
Load the drivers in this order - habanalabs_en and habanalabs_ib, habanalabs_cn, habanalabs:
sudo modprobe <driver name>
Bring up the interfaces:
/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --up
Run
lldpcli show neighbors ports eth5
to obtain the ChassisID: mac 94:8e:d3:c9:88:2d and PortDescr: no-alert 10.210.8.122/30.
Assign the IP address (10.210.8.121/30) to the eth5 interface:
sudo ip addr add 10.210.8.121/30 dev eth5
To check if the IP is assigned successfully, run the following command:
ping 10.210.8.122
Add a route to all Arista subnets. This can be done per subnet in case there is more than one port:
sudo ip route add 10.210.0.0/16 via 10.210.8.122 dev eth5
On host 1, perform the following steps:
Load habanalabs drivers:
Unload the drivers in this order - habanalabs, habanalabs_cn, habanalabs_en and habanalabs_ib:
sudo modprobe -r <driver name>
Load the drivers in this order - habanalabs_en and habanalabs_ib, habanalabs_cn, habanalabs:
sudo modprobe <driver name>
Bring up the interfaces:
/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --up
Run
lldpcli show neighbors ports eth6
to obtain the ChassisID: mac 94:8e:d3:c8:52:69 and PortDescr: no-alert 10.210.15.174/30.
Assign the IP address (10.210.15.173/30) to the eth6 interface:
sudo ip addr add 10.210.15.173/30 dev eth6
To check if the IP is assigned successfully, run the following command:
ping 10.210.15.174
Add a route to all Arista subnets. This can be done per subnet in case there is more than one port:
sudo ip route add 10.210.0.0/16 via 10.210.15.174 dev eth6
Ping should now work between Host0:Port5 and Host1:Port6. To verify that traffic to the assigned IP addresses is routed correctly to the other host, run the following commands:
On host 0, run:
ping 10.210.15.173
On host 1, run:
ping 10.210.8.121
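The per-host sequence above is identical apart from the interface name and the switch port address, so it can be generated rather than typed. The sketch below only prints the commands (a dry run) so they can be reviewed first. The function name print_host_setup is hypothetical, the -c 3 option on ping is an addition so that the check terminates, and the 10.210.0.0/16 route is taken from the example values used in this section.

```shell
# Hypothetical dry-run helper: print the per-host commands for interface $1
# attached to the switch port address $2 (as reported in PortDescr).
# Nothing is executed here; review the output before running it.
print_host_setup() (
    iface=$1
    gw=${2%/*}                       # switch port address without "/30"
    last=${gw##*.}
    # The switch holds the highest node address of the /30 subnet,
    # so the host takes the address directly below it.
    host_ip=${gw%.*}.$(( last - 1 ))
    for m in habanalabs habanalabs_cn habanalabs_en habanalabs_ib; do
        echo "sudo modprobe -r $m"   # unload order from the steps above
    done
    for m in habanalabs_en habanalabs_ib habanalabs_cn habanalabs; do
        echo "sudo modprobe $m"      # load order from the steps above
    done
    echo "/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --up"
    echo "sudo ip addr add $host_ip/30 dev $iface"
    echo "ping -c 3 $gw"
    echo "sudo ip route add 10.210.0.0/16 via $gw dev $iface"
)

print_host_setup eth5 10.210.8.122/30     # host 0
print_host_setup eth6 10.210.15.174/30    # host 1
```

Printing instead of executing keeps the sketch safe to try on any machine; once the output looks right for your network, the lines can be run by hand or piped to sh -e.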
Generating a Gaudinet.json Example¶
This example assumes a reference network design with a three-tier leaf-spine topology. Each Gaudi 3 server is connected to all three tiers via its 6 QSFP-DD ports: ports 1&4 to ply0, ports 2&5 to ply1 and ports 3&6 to ply2.
On the Gaudi 3 server side, the /etc/gaudinet.json file is required. This file should include the Gaudi NIC MAC address, IP address, subnet mask, and gateway MAC address for each of the 24 NICs in the following format:
{
    "NIC_NET_CONFIG": [
        {
            "NIC_MAC": "b0:fd:0b:d9:22:4d",
            "NIC_IP": "10.208.0.1",
            "SUBNET_MASK": "255.255.255.252",
            "GATEWAY_MAC": "e8:b2:65:79:b8:38"
        },
        {
            "NIC_MAC": "b0:fd:0b:d9:22:5b",
            "NIC_IP": "10.209.0.1",
            "SUBNET_MASK": "255.255.255.252",
            "GATEWAY_MAC": "ec:8a:48:43:c9:81"
        },
        {
            "NIC_MAC": "b0:fd:0b:d9:22:5c",
            "NIC_IP": "10.210.0.1",
            "SUBNET_MASK": "255.255.255.252",
            "GATEWAY_MAC": "ec:8a:48:44:3b:41"
        },
        …
    ]
}
To generate the gaudinet.json file, perform the following steps:
From hl-smi, retrieve the mapping of Gaudi module ID to bus ID by running the following command:
hl-smi -Q module_id,bus_id -f csv,noheader
The first column in the output is the Gaudi module ID, while the second column is the bus ID as shown in the following example:
6, 0000:9a:00.0
2, 0000:33:00.0
3, 0000:34:00.0
7, 0000:9b:00.0
4, 0000:b3:00.0
0, 0000:4d:00.0
1, 0000:4e:00.0
5, 0000:b4:00.0
Obtain three MAC addresses (one address for each ply) for each Gaudi module:
Replace the bus_id in the following command with the bus_id found in Step 1:
cat /sys/bus/pci/drivers/habanalabs/{bus_id}/net/*/address | sort
To get the three MAC addresses for Gaudi module 0 in the above example, run the following:
cat /sys/bus/pci/drivers/habanalabs/0000:4d:00.0/net/*/address | sort
b0:fd:0b:d9:22:4d #MAC for ply0
b0:fd:0b:d9:22:5b #MAC for ply1
b0:fd:0b:d9:22:5c #MAC for ply2
Repeat the steps for Gaudi modules 1 through 7 to generate a list of 24 lines in total.
Assign the NIC IP addresses. Use the following to determine the IP address format:
10.(starting_second_octet+ply_id).(leaf_switch_id).(1+port_seq_id×4)/30
The following table describes each parameter included in an IP address:
Parameter: Description
starting_second_octet: User’s choice
ply_id: 0, 1, 2
leaf_switch_id: ID of the connected leaf switch
port_seq_id: The sequence number of the 100Gb/s interfaces across all servers connected to the same leaf switch. Each server has 8 interfaces connected to each of the 3 leaf switches, and the current server may not be the first server connected to a leaf switch.
/30 netmask: Subnet mask is 255.255.255.252 for a point-to-point /30 network.
Example 1: For the first server (connected to the lowest-numbered switch port facing Gaudi servers) connected to the first leaf switch, the IP address assigned to the NIC in Gaudi module 0 for ply0 is
10.(208+0).(0).(1+0×4)/30 = 10.208.0.1/30
Example 2: For the second server connected to the second leaf switch, the IP address assigned to the NIC in Gaudi module 2 for ply1 is
10.(208+1).(1).(1+10×4)/30 = 10.209.1.41/30
Here, port_seq_id is 10 because this is the second server connected to the switch, whose first 8 Gaudi-facing interfaces are connected to the first server. This NIC is in Gaudi module 2, so 8+2 = 10.
Obtain the gateway MAC address, which is the MAC address of the connected switch. It can be obtained either from the switch or with the lldpcli show neighbors command on the server.
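The address formula and the file format can be combined into a small generator. The following is a sketch under stated assumptions, not an official tool: the function names nic_ip and make_gaudinet_json are hypothetical, the input is assumed to be one "NIC_MAC NIC_IP GATEWAY_MAC" triple per line, and no range checking is done on the octets.

```shell
# Hypothetical helper: NIC address from the formula
#   10.(starting_second_octet+ply_id).(leaf_switch_id).(1+port_seq_id*4)
# Arguments: starting_second_octet ply_id leaf_switch_id port_seq_id
nic_ip() {
    echo "10.$(( $1 + $2 )).$3.$(( 1 + $4 * 4 ))"
}

# Hypothetical helper: read "NIC_MAC NIC_IP GATEWAY_MAC" triples from stdin
# (one NIC per line) and print a gaudinet.json document on stdout.
make_gaudinet_json() {
    printf '{\n    "NIC_NET_CONFIG": [\n'
    first=1
    while read -r mac ip gw; do
        [ -n "$gw" ] || continue            # skip blank or incomplete lines
        # Entries are separated by a comma; none follows the last entry.
        if [ "$first" -eq 1 ]; then first=0; else printf ',\n'; fi
        printf '        {\n'
        printf '            "NIC_MAC": "%s",\n' "$mac"
        printf '            "NIC_IP": "%s",\n' "$ip"
        printf '            "SUBNET_MASK": "255.255.255.252",\n'
        printf '            "GATEWAY_MAC": "%s"\n' "$gw"
        printf '        }'
    done
    printf '\n    ]\n}\n'
}

# Example: the first two entries of the sample file, with computed addresses.
make_gaudinet_json <<EOF
b0:fd:0b:d9:22:4d $(nic_ip 208 0 0 0) e8:b2:65:79:b8:38
b0:fd:0b:d9:22:5b $(nic_ip 208 1 0 0) ec:8a:48:43:c9:81
EOF
```

In practice the 24 input lines would come from the MAC list collected per module, the computed NIC addresses, and the gateway MACs; the output can then be redirected to /etc/gaudinet.json after review.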
Priority Flow Control (PFC)¶
If degraded performance is observed, check the switch counters for dropped packets. If packet loss is detected, enabling PFC (Priority Flow Control) may be beneficial. The example commands and configurations below are for Arista EOS.
To check packet loss on the switch, use the following command:
show interface counter queue
PFC/Buffer Configuration in Switch¶
To configure PFC/buffer in the switch, perform the following steps:
Add the following lines to the switch global configuration to adjust the buffer settings. Note that the threshold and headroom values provided are specific to the Arista 7060-DX4-32 and may vary for other switch models.
platform trident mmu queue profile PFC_Profile
ingress threshold 1
ingress headroom 165100
platform trident mmu queue profile PFC_Profile apply
For each interface, add the following lines to enable PFC in its configuration:
qos trust dscp
priority-flow-control on
priority-flow-control priority 0 no-drop
priority-flow-control priority 1 no-drop
priority-flow-control priority 2 no-drop
priority-flow-control priority 3 no-drop
uc-tx-queue 2
no priority
uc-tx-queue 3
no priority
Examples of a full interface configuration:
Example 1:
interface Ethernet1/1
mtu 9198
speed 400g-8
error-correction encoding reed-solomon
no switchport
ip address 10.208.128.1/30
qos trust dscp
priority-flow-control on
priority-flow-control priority 0 no-drop
priority-flow-control priority 1 no-drop
priority-flow-control priority 2 no-drop
priority-flow-control priority 3 no-drop
uc-tx-queue 2
no priority
uc-tx-queue 3
no priority
Example 2:
interface Ethernet2/1
mtu 9198
speed 100g-2
error-correction encoding reed-solomon
no switchport
ip address 10.208.0.2/30
qos trust dscp
priority-flow-control on
priority-flow-control priority 0 no-drop
priority-flow-control priority 1 no-drop
priority-flow-control priority 2 no-drop
priority-flow-control priority 3 no-drop
uc-tx-queue 2
no priority
uc-tx-queue 3
no priority
Enable PFC in Gaudi Server¶
PFC should also be enabled in servers for the flow control mechanism to function effectively.
To enable PFC, run the following command:
/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --set-pfc
To verify that the PFC is enabled, run the following command:
/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --check-pfc
You should receive output similar to the following, indicating enabled=15.
check_pfc 'enp0n0'
enabled=15
check_pfc 'enp0n1'
enabled=15
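With 24 interfaces per server, scanning this output by eye is error-prone. A minimal sketch of an automated check is shown here; pfc_all_enabled is a hypothetical name, and it assumes the output format shown above, with one enabled=<value> line per interface.

```shell
# Hypothetical helper: read the PFC check output on stdin and succeed only
# if at least one interface is reported and every one shows enabled=15.
pfc_all_enabled() {
    awk -F= '
        /enabled=/ { total++; if ($2 + 0 != 15) bad++ }
        END        { if (total == 0 || bad > 0) exit 1 }
    '
}

# Usage: pipe the output of the check command above into the helper, e.g.
#   <check command> | pfc_all_enabled && echo "PFC enabled on all ports"
```

Treating empty input as a failure is a deliberate choice in this sketch, so that a check command that printed nothing is not mistaken for success.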
Disable PFC in Gaudi Server¶
To disable PFC, run the following command:
/opt/habanalabs/qual/gaudi3/bin/manage_network_ifs.sh --unset-pfc
You should receive output similar to the following, indicating enabled=0.
check_pfc 'enp0n0'
enabled=0
check_pfc 'enp0n1'
enabled=0