
NFSoRDMA

We previously used IPoIB; consult that page for the older setup.

With newer hardware (storage and compute nodes) and an EDR Infiniband switch (expensive!) we will try NFSoRDMA.

https://enterprise-support.nvidia.com/s/article/howto-configure-nfs-over-rdma--roce-x

Remote Direct Memory Access (RDMA) should perform better than IPoIB because clients (compute nodes) fetch data directly from the storage server's memory, bypassing much of the kernel network stack, so the remote storage “appears as if it is local”.

Our EDR InfiniBand switch (100 Gb/s) is more expensive than a 100 GbE Ethernet switch but obtains better performance.

Packages involved

and

[root@astrostore ~]# ibstat
CA 'mlx5_0'
...

# net.ifnames=0 keeps the classic eth0 & eth1 names; warewulf provisions over eth0
[root@astrostore ~]# cat /etc/default/grub 
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="resume=UUID=05ade8ca-39db-42a8-b280-c5c835d7633f net.ifnames=0"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true


[root@astrostore ~]# lsmod | grep rdma
rpcrdma               307200  2
rdma_ucm               32768  0
rdma_cm               118784  5 rpcrdma,ib_srpt,ib_iser,ib_isert,rdma_ucm
iw_cm                  53248  1 rdma_cm
ib_cm                 114688  3 rdma_cm,ib_ipoib,ib_srpt
ib_uverbs             163840  2 rdma_ucm,mlx5_ib
ib_core               397312  12 rdma_cm,ib_ipoib,rpcrdma,ib_srpt,iw_cm,ib_iser,ib_umad,ib_isert,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
sunrpc                577536  20 nfsd,rpcrdma,auth_rpcgss,lockd,nfs_acl


[root@astrostore ~]# modinfo rpcrdma
filename:       /lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/net/sunrpc/xprtrdma/rpcrdma.ko.xz
alias:          rpcrdma6
alias:          xprtrdma
alias:          svcrdma
license:        Dual BSD/GPL
description:    RPC/RDMA Transport
author:         Open Grid Computing and Network Appliance, Inc.
rhelversion:    8.7
srcversion:     2BB7046D96C4B57D1B36DEE
depends:        ib_core,sunrpc,rdma_cm
intree:         Y
name:           rpcrdma
vermagic:       4.18.0-425.3.1.el8.x86_64 SMP mod_unload modversions 
sig_id:         PKCS#7
signer:         Rocky kernel signing key
sig_key:        54:B0:44:38:9B:6E:F7:43:B2:FD:8F:3B:8B:D3:69:22:26:2A:BE:87
sig_hashalgo:   sha256
signature:      A8:51:36:....

[root@astrostore ~]# cat /etc/exports
/lvm_data *(fsid=0,rw,async,insecure,no_root_squash)

[root@astrostore ~]# exportfs
/lvm_data     	<world>

# weird! yet nfsd is running, I observe 8 nfsd processes
[root@astrostore ~]# systemctl status nfs.service
Unit nfs.service could not be found.

# ohhh
[root@astrostore ~]# systemctl status nfs-server.service
● nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; vendor preset: disabled)
  Drop-In: /run/systemd/generator/nfs-server.service.d
           └─order-with-mounts.conf
   Active: active (exited) since Fri 2023-02-03 10:42:04 EST; 3 days ago
 ...  
   
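# add an RDMA listener on port 20049
# (portlist lives under /proc/fs/nfsd, which sysctl cannot reach; this is not persistent across reboots)
 echo "rdma 20049" > /proc/fs/nfsd/portlist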

[root@astrostore ~]#  cat /proc/fs/nfsd/portlist
rdma 20049
rdma 20049
tcp 2049
tcp 2049
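
The portlist check above can be scripted, for example to run after a reboot. This is a minimal sketch: the function name `has_rdma_listener` is hypothetical, and the defaults (path and port 20049) are simply the values used on this page.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the portlist file shows an
# RDMA listener on the given port, fail otherwise.
# Defaults follow this page: /proc/fs/nfsd/portlist and port 20049.
has_rdma_listener() {
    portlist="${1:-/proc/fs/nfsd/portlist}"
    port="${2:-20049}"
    # Match a whole line of the form "rdma <port>"
    grep -q "^rdma ${port}\$" "$portlist"
}
```

On the storage server, `has_rdma_listener || echo "no RDMA listener"` would flag a missing listener.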

# assign the address to the InfiniBand interface (if the link is down: ip link set ib0 up)
 ip addr add 10.11.103.243/16 dev ib0

### then on compute node, in /etc/fstab

10.11.103.243:/lvm_data /astrostore nfs  defaults,proto=rdma,port=20049	  0 0 
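
Before relying on the fstab entry, a one-off `mount -t nfs -o proto=rdma,port=20049 10.11.103.243:/lvm_data /astrostore` on the compute node exercises the same options. Whether the resulting mount really negotiated RDMA can then be checked from the mount table. The sketch below assumes the usual `/proc/mounts` layout (device, mount point, type, options); the function name `mount_uses_rdma` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given mount point is an NFS mount
# whose option list contains proto=rdma.
# Second argument is the mount table to read (defaults to /proc/mounts).
mount_uses_rdma() {
    mountpoint="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        # field 2 = mount point, field 3 = fs type (nfs or nfs4), field 4 = options
        $2 == mp && $3 ~ /^nfs/ && $4 ~ /(^|,)proto=rdma(,|$)/ { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}
```

For example, `mount_uses_rdma /astrostore && echo "RDMA in use"` on a compute node.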

# fini


If firewalld is running on the storage server, open this port:

https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/configuring_infiniband_and_rdma_networks/configuring-the-core-rdma-subsystem_configuring-infiniband-and-rdma-networks#enabling-nfs-over-rdma-on-an-nfs-server_configuring-the-core-rdma-subsystem

 firewall-cmd --permanent --zone=public --add-port={20049/tcp,20049/udp}
 firewall-cmd --permanent --zone=private --add-port={20049/tcp,20049/udp}

 firewall-cmd --reload

systemctl restart nfs-server

Then also add nfs, mountd and rpc-bind rules for ib0.
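
The service rules for ib0 might look like the following. This is a sketch under assumptions: it takes ib0 to be assigned to the `public` zone (adjust to the zone actually in use), and the function name `nfs_firewall_cmds` is hypothetical. It only prints the `firewall-cmd` lines so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper: print the firewall-cmd lines that bind an interface
# to a zone and allow the nfs, mountd and rpc-bind services in that zone.
# Nothing is applied; the output is meant to be reviewed first.
nfs_firewall_cmds() {
    zone="${1:-public}"   # assumption: ib0 belongs to the public zone
    iface="${2:-ib0}"
    echo "firewall-cmd --permanent --zone=${zone} --change-interface=${iface}"
    for svc in nfs mountd rpc-bind; do
        echo "firewall-cmd --permanent --zone=${zone} --add-service=${svc}"
    done
    echo "firewall-cmd --reload"
}
```

Running `nfs_firewall_cmds public ib0` shows the commands; piping the output to `sh` as root applies them.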

