Simple 3-Node Lustre Cluster Tutorial
This tutorial sets up a basic 3-node Lustre cluster for testing: Node 1 (MGS/MDT), Node 2 (OSS with 2 OSTs), Node 3 (client). It assumes RHEL 9.x (adjust for other distros). Use dedicated block devices (e.g., /dev/sdb for the MDT; /dev/sdc and /dev/sdd for the OSTs). For production, add HA (e.g., Pacemaker) and RAID. Based on Lustre 2.17.0 (January 2026). Not for production; use with caution.
Prerequisites
- 3 nodes with RHEL 9.x (or derivative); synchronized clocks (NTP).
- Install Lustre packages on all nodes (see installation guides).
- Block devices: Node 1 (/dev/sdb for MDT), Node 2 (/dev/sdc, /dev/sdd for OSTs).
- Network: TCP/IP (eth0); for IB/RoCE, install OFED and IB drivers.
- Disable firewalls/SELinux for testing:
systemctl stop firewalld
setenforce 0
- Same UID/GID across nodes; SSH access between nodes.
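The prerequisites above can be spot-checked before starting. A minimal sketch that prints the per-node checks to run (the hostnames node1/node2/node3 are assumptions; run the printed lines by hand or pipe them to sh):

```shell
# Hypothetical hostnames; adjust to your environment.
preflight_cmds() {
  for n in node1 node2 node3; do
    echo "ssh $n 'chronyc tracking'"   # clock sync (NTP/chrony)
    echo "ssh $n 'rpm -q lustre'"      # Lustre packages installed
    echo "ssh $n 'getenforce'"         # SELinux should be Permissive/Disabled for testing
  done
}
preflight_cmds
```

This only generates commands rather than executing them over SSH, so it is safe to run anywhere.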
TCP/IP Configuration (All Nodes)
# Load LNet
modprobe lnet
lnetctl lnet configure
# Add network (adjust IP/subnet)
lnetctl net add --net tcp0 --if eth0
# Verify NIDs
lctl list_nids # e.g., 192.168.1.1@tcp0
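Once each node reports a NID, it is worth confirming that every node can reach every other node over LNet before formatting anything. A sketch, assuming the three nodes sit at 192.168.1.1–.3 on tcp0 (adjust the NID list); with DRY_RUN=1 (the default here) it only prints the commands, so it can be previewed on machines without Lustre loaded:

```shell
# Assumed peer NIDs for the example subnet; edit to match your nodes.
check_peers() {
  for nid in 192.168.1.1@tcp0 192.168.1.2@tcp0 192.168.1.3@tcp0; do
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "lnetctl ping $nid"           # preview only
    else
      lnetctl ping "$nid" || echo "FAILED: $nid"
    fi
  done
}
check_peers
```

Run with DRY_RUN=0 on each node once LNet is configured; any FAILED line points at a network or firewall problem to fix before continuing.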
Node 1: MGS/MDT Setup
# Format combined MGS/MDT
mkfs.lustre --fsname=testfs --mgs --mdt --index=0 /dev/sdb
# Mount
mkdir /mnt/testfs-mdt0
mount.lustre /dev/sdb /mnt/testfs-mdt0
# Verify
lfs df -h
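The mount above does not persist across reboots. One way to make it persistent is an /etc/fstab entry on Node 1; a sketch (the _netdev option delays the mount until networking is up):

```
# /etc/fstab on Node 1 (sketch)
/dev/sdb  /mnt/testfs-mdt0  lustre  defaults,_netdev  0 0
```

For testing, mounting by hand after each boot is equally fine.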
Node 2: OSS with 2 OSTs Setup
# Format OST1 (use Node1 NID)
mkfs.lustre --fsname=testfs --ost --index=0 --mgsnode=192.168.1.1@tcp0 /dev/sdc
# Format OST2
mkfs.lustre --fsname=testfs --ost --index=1 --mgsnode=192.168.1.1@tcp0 /dev/sdd
# Mount OSTs
mkdir /mnt/testfs-ost0 /mnt/testfs-ost1
mount.lustre /dev/sdc /mnt/testfs-ost0
mount.lustre /dev/sdd /mnt/testfs-ost1
# Verify
lfs df -h # From Node1 or Node3 after client mount
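The per-OST format/mount steps above repeat the same pattern, so they can be looped when adding more OSTs. A sketch, assuming the same device list and MGS NID as above; DRY_RUN=1 (the default here) prints the commands instead of running them, so you can review before executing with DRY_RUN=0:

```shell
# run: echo the command in dry-run mode, execute it otherwise.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

provision_osts() {
  idx=0
  for dev in /dev/sdc /dev/sdd; do   # assumed devices; extend for more OSTs
    run mkfs.lustre --fsname=testfs --ost --index=$idx --mgsnode=192.168.1.1@tcp0 "$dev"
    run mkdir -p "/mnt/testfs-ost$idx"
    run mount.lustre "$dev" "/mnt/testfs-ost$idx"
    idx=$((idx + 1))
  done
}
provision_osts
```

Each OST must get a unique --index; the loop increments it automatically.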
Node 3: Client Setup
# Mount (use Node1 NID)
mkdir /mnt/testfs
mount -t lustre 192.168.1.1@tcp0:/testfs /mnt/testfs
# Verify
lfs df -h
df -h /mnt/testfs
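As on the servers, the client mount does not survive a reboot. A sketch of the corresponding /etc/fstab entry on Node 3:

```
# /etc/fstab on Node 3 (sketch)
192.168.1.1@tcp0:/testfs  /mnt/testfs  lustre  defaults,_netdev  0 0
```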
Usage and Testing
# On client: Create file
dd if=/dev/zero of=/mnt/testfs/testfile bs=1M count=100
# Check striping
lfs getstripe /mnt/testfs/testfile
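With the default layout a small file may land on a single OST. To spread files in a directory across both OSTs, set a striped layout on the directory before creating files; a sketch that prints the commands to run on the client (the directory name is an assumption):

```shell
# Print the striping commands for a directory (hypothetical path argument).
stripe_dir() {
  dir=$1
  echo "lfs setstripe -c 2 -S 1M $dir"   # -c 2: stripe over two OSTs; -S 1M: 1 MiB stripes
  echo "lfs getstripe $dir"              # verify the layout was applied
}
stripe_dir /mnt/testfs/striped
```

Files created under the directory afterwards inherit the layout; existing files keep their old one.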
# Cleanup (unmount client first, then OSTs, then MDT)
umount /mnt/testfs # Node3
umount /mnt/testfs-ost0 /mnt/testfs-ost1 # Node2
umount /mnt/testfs-mdt0 # Node1
InfiniBand/RoCE Instructions
- Install OFED tooling:
dnf install infiniband-diags perftest qperf rdma-core
- Load the o2iblnd module:
modprobe o2iblnd
- Configure LNet: replace tcp0 with o2ib0 and the interface with ib0:
lnetctl net add --net o2ib0 --if ib0
- NIDs: use IB IPs (e.g., 192.168.2.1@o2ib0).
- Mount with o2ib NIDs:
mount -t lustre 192.168.2.1@o2ib0:/testfs /mnt/testfs
- For multi-rail: add multiple nets (o2ib0, o2ib1); use a colon to separate failover NIDs in the mount command.
- Test:
ibstatus
lnetctl ping <nid>
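The colon-separated failover-NID syntax mentioned above is easy to get wrong by hand, so a small helper that assembles the mount source string can help; a sketch (the MGS IPs are assumptions for an HA MGS pair):

```shell
# Build "nid1:nid2:...:/fsname" for the mount command.
build_mount_src() {
  fsname=$1; shift
  src=$1; shift
  for nid in "$@"; do
    src="$src:$nid"        # ':' separates alternative MGS nodes
  done
  echo "$src:/$fsname"
}
# e.g. mount -t lustre "$(build_mount_src testfs 192.168.2.1@o2ib0 192.168.2.2@o2ib0)" /mnt/testfs
```

If the primary MGS NID is unreachable, the client tries the next NID in the list.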
Warnings and Best Practices
- Not production-ready: Add HA (Pacemaker) for automatic failover.
- Use RAID for storage device redundancy; Lustre doesn't provide it.
- Monitor with lfs df and lctl get_param.
- For IB/RoCE: ensure RDMA works; use separate subnets if needed.
- Scale: Add more OSTs/OSSes and MDTs/MDSes for performance and capacity.
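The monitoring advice above can be wrapped into a quick health snapshot run from the client. A minimal sketch using lfs df and the lctl health_check parameter; with DRY_RUN=1 (the default here) it only prints the commands, so it can be previewed off-cluster:

```shell
# One-shot health snapshot (preview mode unless DRY_RUN=0).
monitor_once() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "lfs df -h"
    echo "lctl get_param health_check"
  else
    lfs df -h
    lctl get_param health_check
  fi
}
monitor_once
```

Wrap it in a cron job or a `watch` loop for ongoing visibility; for anything serious, use a proper monitoring stack instead.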