
IB BIOS settings

   Here are the BIOS settings we set for those systems prior to shipping:

     Start by entering the BIOS and loading the "Optimized Defaults" (F3).

      Then go down through the menus on the "Advanced" tab in the BIOS:

      Boot Feature:
          Quiet Boot = disabled
          Wait for "F1" if Error = disabled

      CPU Configuration:
          Extended APIC = enabled
             "Advanced Power Management Configuration"
                 Power technology = Custom
                 Power Performance Tuning = BIOS controls EPB
                 ENERGY_PERF_BIAS_CFG Mode = Max Performance

      PCIe/PCI/PnP Configuration:
          Above 4G Decoding = enabled

      Then arrow right over to the "Boot" tab at the top

      Boot Mode Select = UEFI

      Then set the "Fixed Boot Order Priorities".
      We typically set them like this, but the choices may vary depending
on the boot mode and which components you have installed:

      USB Floppy
      USB CD/DVD
      USB Hard Disk
      USB LAN
      USB Key
      UEFI CD/DVD
      UEFI Hard Disk
      UEFI Network

     Some of the boot order options listed above, like "UEFI Network",
might have a slightly different name. The idea is to try external
devices like USB first, then local devices like a CD/DVD drive, then a
hard drive, and the network last.

     Once you have made the changes, hit "F4" to save them and the
system will reboot.

     Then check that your IB card is showing.

I think enabling the "Extended APIC" option is what allows your IB card to show.

The CPUs control the PCIe lanes, so that option allows more "space" for the PCIe lanes.
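
From the OS side, one way to do the "is the IB card showing" check after the reboot is to look for the adapter in the lspci listing. This is only a minimal sketch: it assumes the card reports the vendor string "Mellanox", and the helper name has_mellanox is made up for these notes.

```shell
# Hypothetical helper: succeeds if the lspci-style text on stdin
# mentions a Mellanox device. Matching on the vendor string "Mellanox"
# is an assumption; the BIOS notes above do not say how to check.
has_mellanox() {
    grep -qi 'mellanox'
}

# Typical use on the node (needs pciutils installed):
#   lspci | has_mellanox && echo "IB card visible" || echo "IB card missing"
```

If the card is missing here, go back to the BIOS settings before looking at drivers.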

MFT fix

We got a new RAID card. The ib0 device was not available.

After stepping through all memory configs from 2 DIMMs to 16 DIMMs, we added the card.

The card shows up but is still down/disabled.

# modules loaded

[root@n103 ~]# lsmod | grep mlx5
mlx5_ib               397312  0   <---
ib_uverbs             163840  2 rdma_ucm,mlx5_ib
ib_core               397312  12 rdma_cm,ib_ipoib,rpcrdma,ib_srpt,iw_cm,ib_iser,ib_umad,ib_isert,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
mlx5_core            1703936  1 mlx5_ib  <----
mlxfw                  32768  1 mlx5_core
pci_hyperv_intf        16384  1 mlx5_core
tls                   110592  1 mlx5_core
psample                20480  1 mlx5_core

# cable and power

[root@n103 ~]# dmesg | grep -i mlx5
[    3.097033] mlx5_core 0000:31:00.0: firmware version: 16.34.1002
[    3.097071] mlx5_core 0000:31:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)  <----
[    3.505586] mlx5_core 0000:31:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[    3.505893] mlx5_core 0000:31:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[    3.510968] mlx5_core 0000:31:00.0: Port module event: module 0, Cable plugged  <----
[    3.511247] mlx5_core 0000:31:00.0: mlx5_pcie_event:300:(pid 818): PCIe slot advertised sufficient power (75W).   <----
[    3.550721] mlx5_core 0000:31:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[    3.788710] mlx5_core 0000:31:00.0: Supported tc offload range - chains: 4294967294, prios: 4294967295
[   16.530223] mlx5_core 0000:31:00.0 eth2: Link down
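
The lines flagged with arrows above can be pulled out of the dmesg text automatically. A rough sketch, assuming the message wording stays the same as in the n103 output; the "Link up" pattern is an assumption (only "Link down" appears above), and the helper name mlx5_dmesg_summary is made up.

```shell
# Hypothetical helper: read `dmesg | grep -i mlx5` text on stdin and
# print one status word per check. "Cable plugged", "sufficient power"
# and "Link down" are taken from the output above; "Link up" is assumed.
mlx5_dmesg_summary() {
    awk '
        /Cable plugged/                         { cable = "yes" }
        /PCIe slot advertised sufficient power/ { power = "yes" }
        /Link down/                             { link = "down" }
        /Link up/                               { link = "up" }
        END {
            # Anything not seen in the input is reported as "unknown".
            printf "cable=%s power=%s link=%s\n",
                   (cable ? cable : "unknown"),
                   (power ? power : "unknown"),
                   (link  ? link  : "unknown")
        }'
}

# Typical use:
#   dmesg | grep -i mlx5 | mlx5_dmesg_summary
```

When several link messages are present, the last one in the log wins.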

# to enable InfiniBand, read on for the solution...

# download mft, stored on astrostore:/usr/local/src
https://network.nvidia.com/products/adapter-software/firmware-tools/

# copy from n102:/usr/local/src the mft-4.23-rpms
# collected from a previous installation

cd mft-4.23-rpms/
[root@n103 mft-4.23-rpms]# ls
annobin-10.67-3.el8.x86_64.rpm                make-4.2.1-11.el8.x86_64.rpm
dwz-0.12-10.el8.x86_64.rpm                    ocaml-srpm-macros-5-4.el8.noarch.rpm
efi-srpm-macros-3-3.el8.noarch.rpm            openblas-srpm-macros-2-2.el8.noarch.rpm
elfutils-0.187-4.el8.x86_64.rpm               patch-2.7.6-11.el8.x86_64.rpm
elfutils-libelf-devel-0.187-4.el8.x86_64.rpm  perl-srpm-macros-1-25.el8.noarch.rpm
gc-7.6.4-3.el8.x86_64.rpm                     python3-rpm-generators-5-7.el8.noarch.rpm
gcc-plugin-annobin-8.5.0-16.el8_7.x86_64.rpm  python3-rpm-macros-3-43.el8.noarch.rpm
gdb-headless-8.2-19.el8.x86_64.rpm            python-rpm-macros-3-43.el8.noarch.rpm
ghc-srpm-macros-1.4.2-7.el8.noarch.rpm        python-srpm-macros-3-43.el8.noarch.rpm
go-srpm-macros-2-17.el8.noarch.rpm            qt5-srpm-macros-5.15.3-1.el8.noarch.rpm
guile-2.0.14-7.el8.x86_64.rpm                 redhat-rpm-config-130-1.el8.noarch.rpm
kernel-devel-4.18.0-425.3.1.el8.x86_64.rpm    rpm-build-4.14.3-24.el8_7.x86_64.rpm
libatomic_ops-7.6.2-3.el8.x86_64.rpm          rust-srpm-macros-5-2.el8.noarch.rpm
libbabeltrace-1.5.4-4.el8.x86_64.rpm          zlib-devel-1.2.11-21.el8_7.x86_64.rpm
libipt-1.6.1-8.el8.x86_64.rpm                 zstd-1.4.4-1.el8.x86_64.rpm

# you need these rpms
rpm -iv rpm-build-4.14.3-24.el8_7.x86_64.rpm elfutils-0.187-4.el8.x86_64.rpm \
     zstd-1.4.4-1.el8.x86_64.rpm python3-rpm-generators-5-7.el8.noarch.rpm
     
# and kernel-devel, which must match the running kernel version (uname -r)
rpm -ivh kernel-devel-4.18.0-425.3.1.el8.x86_64.rpm
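
Since the kernel-devel rpm has to be for the running kernel, it may be worth checking the rpm directory before installing. A small sketch, assuming the rpm file is named kernel-devel-<release>.rpm as in the listing above; the helper name has_kernel_devel_for is made up.

```shell
# Hypothetical helper: given a kernel release string (as printed by
# `uname -r`) and a list of rpm file names on stdin, succeed only if
# the kernel-devel rpm for exactly that release is in the list.
has_kernel_devel_for() {
    # -x: match the whole line, -F: treat the name as a fixed string
    grep -qxF "kernel-devel-$1.rpm"
}

# Typical use inside mft-4.23-rpms/:
#   ls | has_kernel_devel_for "$(uname -r)" || echo "kernel-devel does not match running kernel"
```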

# gunzip, untar mft tarball
cd /usr/local/src/tmp
cd mft-4.23.0-104-x86_64-rpm/
./install.sh

-I- Removing any old MFT file if exists...
-I- Building the MFT kernel binary RPM...
-I- Installing the MFT RPMs...
Verifying...                          ################################# [100%]
Preparing...                          ################################# [100%]
Updating / installing...
   1:kernel-mft-4.23.0-4.18.0_425.3.1.################################# [100%]
Verifying...                          ################################# [100%]
Preparing...                          ################################# [100%]
Updating / installing...
   1:mft-4.23.0-104                   ################################# [100%]
-I- In order to start mst, please run "mst start".

mst start

Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
Unloading MST PCI module (unused) - Success 


# you will also need ibswinfo
git clone https://github.com/stanford-rc/ibswinfo.git
cd ibswinfo/
cp -p ibswinfo.sh /usr/bin/
cd /usr/bin
ln -s ibswinfo.sh ibswinfo

[root@n103 src]# ibstat
CA 'mlx5_0'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.34.1002
        Hardware version: 0
        Node GUID: 0x98039b03007045f2
        System image GUID: 0x98039b03007045f2
        Port 1:
                State: Down
                Physical state: Disabled
                Rate: 40
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x00010000
                Port GUID: 0x9a039bfffe7045f2
                Link layer: Ethernet            <--- this needs to change
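
The one field that matters in this output is "Link layer". A minimal sketch for pulling just that value out of ibstat output, so it can be compared before and after the change; the helper name ib_link_layer is made up.

```shell
# Hypothetical helper: print the "Link layer" value found in ibstat
# output on stdin, one line per port, dropping anything after the
# first word (such as hand-written annotations).
ib_link_layer() {
    awk -F': *' '/Link layer:/ { split($2, f, " "); print f[1] }'
}

# Typical use:
#   ibstat | ib_link_layer
```

On n103 this should print Ethernet at this point in the notes.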

# read this post, all the way to the bottom
https://forums.developer.nvidia.com/t/mlx5-0-mlx5-1-down/316392

# useful articles
https://docs.nvidia.com/networking/display/mftv422/using+mlxconfig
https://docs.nvidia.com/networking/display/mftv4180/compilation+and+installation


rpm -qf /usr/bin/mlxconfig
mft-4.23.0-104.x86_64

# now configure port on new Mellanox card, first a query

mlxconfig -d /dev/mst/mt4119_pciconf0 q

Device #1:
----------

Device type:    ConnectX5      
Name:           MCX555A-ECA_Ax_Bx
Description:    ConnectX-5 VPI adapter card; EDR IB (100Gb/s) and 100GbE; single-port QSFP28; PCIe3.0 x16; tall bracket; ROHS R6
Device:         /dev/mst/mt4119_pciconf0

Configurations:                                      Next Boot
         MEMIC_BAR_SIZE                              0              
         MEMIC_SIZE_LIMIT                            _256KB(1)      
         HOST_CHAINING_MODE                          DISABLED(0)    
         HOST_CHAINING_CACHE_DISABLE                 False(0)       
         HOST_CHAINING_DESCRIPTORS                   Array[0..7]    
         HOST_CHAINING_TOTAL_BUFFER_SIZE             Array[0..7]    
         FLEX_PARSER_PROFILE_ENABLE                  0              
         FLEX_IPV4_OVER_VXLAN_PORT                   0              
         ROCE_NEXT_PROTOCOL                          254            
         ESWITCH_HAIRPIN_DESCRIPTORS                 Array[0..7]    
         ESWITCH_HAIRPIN_TOT_BUFFER_SIZE             Array[0..7]    
         PF_BAR2_SIZE                                0              
         PF_NUM_OF_VF_VALID                          False(0)       
         NON_PREFETCHABLE_PF_BAR                     False(0)       
         VF_VPD_ENABLE                               False(0)       
         PF_NUM_PF_MSIX_VALID                        False(0)       
         PER_PF_NUM_SF                               False(0)       
         STRICT_VF_MSIX_NUM                          False(0)       
         VF_NODNIC_ENABLE                            False(0)       
         NUM_PF_MSIX_VALID                           True(1)        
         NUM_OF_VFS                                  0              
         NUM_OF_PF                                   1              
         PF_BAR2_ENABLE                              False(0)       
         SRIOV_EN                                    False(0)       
         PF_LOG_BAR_SIZE                             5              
         VF_LOG_BAR_SIZE                             1              
         NUM_PF_MSIX                                 63             
         NUM_VF_MSIX                                 19             
         INT_LOG_MAX_PAYLOAD_SIZE                    AUTOMATIC(0)   
         PCIE_CREDIT_TOKEN_TIMEOUT                   0              
         ACCURATE_TX_SCHEDULER                       False(0)       
         PARTIAL_RESET_EN                            False(0)       
         SW_RECOVERY_ON_ERRORS                       False(0)       
         RESET_WITH_HOST_ON_ERRORS                   False(0)       
         ADVANCED_POWER_SETTINGS                     False(0)       
         CQE_COMPRESSION                             BALANCED(0)    
         IP_OVER_VXLAN_EN                            False(0)       
         MKEY_BY_NAME                                False(0)       
         ESWITCH_IPV4_TTL_MODIFY_ENABLE              False(0)       
         PRIO_TAG_REQUIRED_EN                        False(0)       
         UCTX_EN                                     True(1)        
         PCI_ATOMIC_MODE                             PCI_ATOMIC_DISABLED_EXT_ATOMIC_ENABLED(0)
         TUNNEL_ECN_COPY_DISABLE                     False(0)       
         LRO_LOG_TIMEOUT0                            6              
         LRO_LOG_TIMEOUT1                            7              
         LRO_LOG_TIMEOUT2                            8              
         LRO_LOG_TIMEOUT3                            13             
         LOG_TX_PSN_WINDOW                           7              
         LOG_MAX_OUTSTANDING_WQE                     7              
         ROCE_ADAPTIVE_ROUTING_EN                    False(0)       
         TUNNEL_IP_PROTO_ENTROPY_DISABLE             False(0)       
         ICM_CACHE_MODE                              DEVICE_DEFAULT(0)
         TX_SCHEDULER_BURST                          0              
         ZERO_TOUCH_TUNING_ENABLE                    False(0)       
         LOG_MAX_QUEUE                               17             
         LOG_DCR_HASH_TABLE_SIZE                     11             
         MAX_PACKET_LIFETIME                         0              
         DCR_LIFO_SIZE                               16384          
         LINK_TYPE_P1                                ETH(2)         
         ROCE_CC_PRIO_MASK_P1                        255            
         CLAMP_TGT_RATE_AFTER_TIME_INC_P1            True(1)        
         CLAMP_TGT_RATE_P1                           False(0)       
         RPG_TIME_RESET_P1                           300            
         RPG_BYTE_RESET_P1                           32767          
         RPG_THRESHOLD_P1                            1              
         RPG_MAX_RATE_P1                             0              
         RPG_AI_RATE_P1                              5              
         RPG_HAI_RATE_P1                             50             
         RPG_GD_P1                                   11             
         RPG_MIN_DEC_FAC_P1                          50             
         RPG_MIN_RATE_P1                             1              
         RATE_TO_SET_ON_FIRST_CNP_P1                 0              
         DCE_TCP_G_P1                                1019           
         DCE_TCP_RTT_P1                              1              
         RATE_REDUCE_MONITOR_PERIOD_P1               4              
         INITIAL_ALPHA_VALUE_P1                      1023           
         MIN_TIME_BETWEEN_CNPS_P1                    4              
         CNP_802P_PRIO_P1                            6              
         CNP_DSCP_P1                                 48             
         LLDP_NB_DCBX_P1                             False(0)       
         LLDP_NB_RX_MODE_P1                          OFF(0)         
         LLDP_NB_TX_MODE_P1                          OFF(0)         
         DCBX_IEEE_P1                                True(1)        
         DCBX_CEE_P1                                 True(1)        
         DCBX_WILLING_P1                             True(1)        
         KEEP_ETH_LINK_UP_P1                         True(1)        
         KEEP_IB_LINK_UP_P1                          False(0)       
         KEEP_LINK_UP_ON_BOOT_P1                     False(0)       
         KEEP_LINK_UP_ON_STANDBY_P1                  False(0)       
         DO_NOT_CLEAR_PORT_STATS_P1                  False(0)       
         AUTO_POWER_SAVE_LINK_DOWN_P1                False(0)       
         NUM_OF_VL_P1                                _4_VLs(3)      
         NUM_OF_TC_P1                                _8_TCs(0)      
         NUM_OF_PFC_P1                               8              
         VL15_BUFFER_SIZE_P1                         0              
         DUP_MAC_ACTION_P1                           LAST_CFG(0)    
         MPFS_MC_LOOPBACK_DISABLE_P1                 False(0)       
         MPFS_UC_LOOPBACK_DISABLE_P1                 False(0)       
         UNKNOWN_UPLINK_MAC_FLOOD_P1                 False(0)       
         SRIOV_IB_ROUTING_MODE_P1                    LID(1)         
         IB_ROUTING_MODE_P1                          LID(1)         
         PHY_AUTO_NEG_P1                             DEVICE_DEFAULT(0)
         PHY_RATE_MASK_OVERRIDE_P1                   False(0)       
         PHY_FEC_OVERRIDE_P1                         DEVICE_DEFAULT(0)
         PF_TOTAL_SF                                 0              
         PF_SF_BAR_SIZE                              0              
         PF_NUM_PF_MSIX                              63             
         ROCE_CONTROL                                ROCE_ENABLE(2) 
         PCI_WR_ORDERING                             per_mkey(0)    
         MULTI_PORT_VHCA_EN                          False(0)       
         PORT_OWNER                                  True(1)        
         ALLOW_RD_COUNTERS                           True(1)        
         RENEG_ON_CHANGE                             True(1)        
         TRACER_ENABLE                               True(1)        
         IP_VER                                      IPv4(0)        
         BOOT_UNDI_NETWORK_WAIT                      0              
         UEFI_HII_EN                                 True(1)        
         BOOT_DBG_LOG                                False(0)       
         UEFI_LOGS                                   DISABLED(0)    
         BOOT_VLAN                                   1              
         LEGACY_BOOT_PROTOCOL                        PXE(1)         
         BOOT_RETRY_CNT                              NONE(0)        
         BOOT_INTERRUPT_DIS                          False(0)       
         BOOT_LACP_DIS                               True(1)        
         BOOT_VLAN_EN                                False(0)       
         BOOT_PKEY                                   0              
         P2P_ORDERING_MODE                           DEVICE_DEFAULT(0)
         ATS_ENABLED                                 False(0)       
         DYNAMIC_VF_MSIX_TABLE                       False(0)       
         EXP_ROM_UEFI_x86_ENABLE                     False(0)       
         EXP_ROM_PXE_ENABLE                          True(1)        
         IBM_TUNNELED_ATOMIC_EN                      False(0)       
         IBM_AS_NOTIFY_EN                            False(0)       
         ADVANCED_PCI_SETTINGS                       False(0)       
         SAFE_MODE_THRESHOLD                         10             
         SAFE_MODE_ENABLE                            True(1)        

# switch the port from Ethernet to InfiniBand
mlxconfig -d /dev/mst/mt4119_pciconf0 set LINK_TYPE_P1=1

Device #1:
----------

Device type:    ConnectX5      
Name:           MCX555A-ECA_Ax_Bx
Description:    ConnectX-5 VPI adapter card; EDR IB (100Gb/s) and 100GbE; single-port QSFP28; PCIe3.0 x16; tall bracket; ROHS R6
Device:         /dev/mst/mt4119_pciconf0

Configurations:                                      Next Boot       New
         LINK_TYPE_P1                                ETH(2)          IB(1)          

 Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.

reboot

[root@n103 ~]# ibstat
CA 'mlx5_0'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.34.1002
        Hardware version: 0
        Node GUID: 0x98039b03007045f2
        System image GUID: 0x98039b03007045f2
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 100
                Base lid: 9
                LMC: 0
                SM lid: 1
                Capability mask: 0xa659e848
                Port GUID: 0x98039b03007045f2
                Link layer: InfiniBand             <----- yea

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2044
        inet 10.11.103.103  netmask 255.255.0.0  broadcast 0.0.0.0
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
        infiniband 00:00:00:43:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  txqueuelen 256  (InfiniBand)
        RX packets 210  bytes 27332 (26.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 228  bytes 17944 (17.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# mount the NFSoRDMA file system (entry in /etc/fstab)

10.11.103.243:/lvm_data /astrostore             nfs     defaults,proto=rdma,port=20049    0 0
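
To double-check which entries actually request RDMA, the options column can be scanned for proto=rdma. A rough sketch, assuming the usual fstab column order (device, mount point, type, options); the helper name rdma_mounts is made up.

```shell
# Hypothetical helper: read fstab-style lines on stdin and print the
# mount point of every nfs entry whose options include proto=rdma.
rdma_mounts() {
    # Wrap the option list in commas so proto=rdma only matches as a
    # whole option, not as part of a longer string.
    awk '$1 !~ /^#/ && $3 ~ /^nfs/ && (","$4",") ~ /,proto=rdma,/ { print $2 }'
}

# Typical use:
#   rdma_mounts < /etc/fstab
```

For the entry above this prints /astrostore.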

# in /etc/rc.local, reload this module before assigning the IP address to ib0

/usr/sbin/modprobe -r mlx5_ib
/usr/sbin/modprobe  mlx5_ib
sleep 10
/usr/sbin/ip addr add 10.11.103.103/16 dev ib0
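
The fixed "sleep 10" works here, but a polling wait is a possible alternative if the interface sometimes takes longer to come back. This is a sketch, not what is in rc.local on n103: it assumes ib0 shows up under /sys/class/net once the module is reloaded, and the helper name wait_for_path is made up.

```shell
# Hypothetical helper: wait until a path exists, checking once per
# second, for at most $2 seconds. Returns 0 if it appeared, 1 if not.
# Note: plain sh has no local variables, so path and tries are global.
wait_for_path() {
    path=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        [ -e "$path" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    [ -e "$path" ]
}

# Possible replacement for "sleep 10" (assumption, untested on n103):
#   wait_for_path /sys/class/net/ib0 30 && /usr/sbin/ip addr add 10.11.103.103/16 dev ib0
```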

