# Lustre LNet Networking Details
LNet (Lustre Networking) is the kernel-level networking infrastructure for Lustre, providing abstracted message passing over various networks like TCP/IP, InfiniBand, and Omni-Path. It handles routing, failover, load balancing, and high availability for clients, servers, and routers. This guide is based on Lustre 2.17.0 (January 2026), incorporating features like dynamic NID configuration and enhanced multi-rail (LMR). For recent JIRA updates, see LU-19763 (TCP zerocopy) and ongoing upstreaming efforts. Always refer to the Lustre Manual for full details.
## Core Concepts
| Concept | Description |
|---|---|
| NID (Network Identifier) | Format: <IP|hostname>@<LNet-label> (e.g., 192.168.1.10@tcp0). Identifies nodes uniquely. |
| LNet Label | Format: <LND><number> (e.g., tcp0, o2ib0). Defines network types. |
| Router | Intermediate node for cross-network forwarding. |
| Peer | Remote node with NID; includes credits, health, and reference counts. |
| Credits | Flow control: send (tx), routing (rtr), buffer (peer_buffer_credits). |
| Portal | Numbered reception point on which incoming messages are matched and dispatched to upper layers (e.g., ptlrpc). |
| CPT (CPU Partition) | Partitions messages across CPU cores for affinity and balancing. |
| Multi-Rail (LMR) | Bonds multiple interfaces/NIDs for bandwidth and redundancy (enhanced in 2.17). |
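The NID format in the table above is easy to pick apart with standard shell tools. A minimal sketch (no Lustre utilities required; the address is a placeholder):

```shell
# Split a NID string ("<address>@<LND><index>") into its parts
# using plain parameter expansion and sed.
nid="192.168.1.10@tcp0"
addr="${nid%@*}"                                  # address part: 192.168.1.10
label="${nid#*@}"                                 # LNet label:   tcp0
lnd=$(printf '%s' "$label" | sed 's/[0-9]*$//')   # LND name:     tcp
idx=$(printf '%s' "$label" | sed 's/.*[^0-9]//')  # instance:     0
echo "$addr $label $lnd $idx"
```

The same split works for labels like `o2ib0`, since the instance number is always the trailing digits.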
## Features
- Multi-Network Support: Simultaneous TCP, InfiniBand, etc., with RDMA where available.
- Routing: Direct or routed paths; supports hops, priorities, asymmetrical routes (2.13+).
- Failover & HA: Automatic path switching on failures; health-based selection.
- Load Balancing: Round-robin, credit-based, portal rotor.
- Dynamic Discovery: Auto-detect peers (2.11+); PUSH messages for state changes.
- Health Monitoring: Tracks peer/router status (0-1000 health value); recovery pings.
- Dynamic NID Config (2.17+): Simplifies runtime network changes via C-API and lnetctl.
- YAML Config: Import/export for nets, peers, routes (2.11-2.13+).
- Self-Test: Tools for bandwidth/latency testing.
## Modules and Supported Drivers
| Module/Driver | Role | Supported Networks |
|---|---|---|
| lnet | Core messaging, routing, credits. | All |
| libcfs | Kernel services, CPT, memory. | All |
| ksocklnd | TCP/IP driver. | Ethernet, IPoIB |
| o2iblnd | RDMA driver. | InfiniBand, Omni-Path |
| gni | Gemini/Aries driver. | HPC fabrics |
| ra | RapidArray driver. | Specialized |
| elan | Quadrics driver (legacy). | Legacy HPC |
| lnet_selftest | Testing framework. | All |
Load modules: `modprobe libcfs; modprobe lnet; modprobe <LND>`. Clients and servers use the same modules, but servers often require high-bandwidth LNDs like o2iblnd.
## Configuration

### Module Parameters (/etc/modprobe.d/lustre.conf)

```
options lnet networks="tcp0(eth0),o2ib0(ib0)"
options lnet ip2nets="tcp0(eth0) 192.168.0.[2,4]"
options lnet routes="tcp0 132.6.1.[1-8]@o2ib0; o2ib0 192.168.0.[1-8]@tcp0"
options ksocklnd credits=256 peer_credits=8
options ko2iblnd conns_per_peer=4
```

Apply: `modprobe -r lnet; modprobe lnet`.
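The bracket ranges in the `routes=` syntax are compact NID lists: `132.6.1.[1-8]@o2ib0` names eight gateway NIDs, one per address in the range. A minimal shell illustration of what that pattern denotes (the addresses come from the example above, not from live gateways):

```shell
# Expand the route pattern "132.6.1.[1-8]@o2ib0" into the eight
# gateway NIDs it denotes, using only seq and printf.
gateways=$(for i in $(seq 1 8); do printf '132.6.1.%s@o2ib0\n' "$i"; done)
echo "$gateways"
```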
### Runtime with lnetctl (2.7+)

```shell
# Initialize
lnetctl lnet configure
# Add network
lnetctl net add --net tcp2 --if eth0,eth1 --peer_timeout 180
# Add peer (multi-rail)
lnetctl peer add --prim_nid 10.10.10.2@tcp --nid 10.10.3.3@tcp1,10.4.4.5@tcp2
# Add route
lnetctl route add --net tcp2 --gateway 192.168.205.130@tcp1 --hop 2 --prio 1
# Enable routing
lnetctl set routing 1
# YAML import
lnetctl import config.yaml
```
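A `config.yaml` for the import step might look like the sketch below. This is illustrative only: the field names follow the shape of `lnetctl export` output on recent releases, and the interface and addresses are placeholders matching the examples above.

```yaml
# Illustrative lnetctl YAML config (placeholders, not a live config)
net:
    - net type: tcp2
      local NI(s):
        - interfaces:
              0: eth0
route:
    - net: tcp2
      gateway: 192.168.205.130@tcp1
      hop: 2
      priority: 1
```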
For dynamic NID configuration (2.17+): use the C API for programmatic changes, e.g., `lustre_lnet_config_net("tcp2", "eth0", 0, seq, &err)`.
## Routing and Multi-Rail
- Routing: Use gateways for cross-LNet; select by hop, priority, health (2.13+). Asymmetrical allowed (drop_asym_route=0).
- Multi-Rail: Bond interfaces (e.g., multiple eth/ib); load balance with round-robin or credits. Failover on health drop.
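A multi-rail peer can also be expressed declaratively for YAML import. A hedged sketch, reusing the NIDs from the earlier `lnetctl peer add` example and following the shape of `lnetctl export` peer output:

```yaml
# Illustrative multi-rail peer definition (NIDs are placeholders)
peer:
    - primary nid: 10.10.10.2@tcp
      Multi-Rail: True
      peer ni:
        - nid: 10.10.3.3@tcp1
        - nid: 10.4.4.5@tcp2
```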
## Health Monitoring

Health value: 0-1000; decrements on failures, recovers via pings. Set sensitivity: `lnetctl set health_sensitivity 100`. View: `lnetctl net show -v 3`.
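When scripting health checks, the per-NI health values can be grepped out of `lnetctl net show -v 3` output. The here-doc sample below is an abridged, hand-written sketch of that output's shape (not a verbatim capture), used only to show the extraction:

```shell
# Extract per-NI health values from (sample) lnetctl net show -v 3 output.
sample='      health stats:
          health value: 1000
      health stats:
          health value: 875'
healths=$(printf '%s\n' "$sample" | awk '/health value:/ {print $3}')
echo "$healths"
```

On a live node you would pipe `lnetctl net show -v 3` into the same `awk` filter instead of the sample text.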
## Discovery and Self-Test

- Discovery: enable with `lnetctl set discovery 1`; probe a peer manually with `lnetctl discover <nid>`.
- Self-Test: `lnetctl self_test create` and `lnetctl self_test run` for bandwidth/latency. Use `lst` for advanced testing.
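The `lst` utility drives the lnet_selftest module loaded on every participating node; since it needs live LNet hardware, the sketch below is illustrative rather than runnable here, and the NIDs are placeholders:

```shell
# Sketch of a typical lst bulk-write session (requires lnet_selftest
# loaded on all nodes; NIDs are placeholders).
export LST_SESSION=$$          # lst keys its session to this variable
lst new_session rw_test
lst add_group servers 192.168.1.10@tcp
lst add_group clients 192.168.1.2@tcp
lst add_batch bulk
lst add_test --batch bulk --from clients --to servers brw write size=1M
lst run bulk
lst stat clients servers       # watch bandwidth/latency counters
lst end_session
```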
Client vs. server emphasis: clients lean on multi-rail for bandwidth, while servers lean on high-availability routing. Recent: TCP zerocopy on by default (LU-19763, 2026) improves performance.