\\ **[[cluster:0|Home]]**

//ok, so this story begins with ... i thought i had reached the limit of my ability to comprehend new technology when i was shown that disks can run multiple raid levels simultaneously. but this multipathing eclipses that. just weird, therefore worth describing.//

 --- //[[hmeij@wesleyan.edu|Henk Meij]] 2007/05/05 14:14//

===== The Problem =====

Our new NetApp FAS 3050c device [[http://www.pcquest.com/content/web/2006/106022211.asp|Read About It]] has some new setups. Although we encountered these problems before, we were able to fix them by applying a label to the partition filesystem. Well, this is totally different.

Again, i got started on this since gracious Dell installed not one but two HBA cards into the ionode. They must be giving us hidden signals. So the HBA cards were connected via fiber cable to network switches #3 and #4, and you guessed it, filer3 and filer4 are connected to those switches respectively.

What's new is that both filers share the same [[http://en.wikipedia.org/wiki/WWPN|WWPN]] (World Wide Port Name, so wikipedia isn't always useful ...). That means that when a linux client sends a query to the filer, both answer. For example, i created 3 LUNs on a 26*500 GB disk raid group attached to filer3 only! sda is the local disk. sdb, sdc and sdd are my LUNs, and so are sde, sdf and sdg.

  * That's good because we can use this multipath to route traffic via two fiber channels to the filers.
  * That's bad because we may end up with corruption and confusion.

  [root@ionode-1 ~]# fdisk -l
  Disk /dev/sda: 79.4 GB, 79456894976 bytes
  255 heads, 63 sectors/track, 9660 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes
     Device Boot      Start         End      Blocks   Id  System
  /dev/sda1               1           9       72261   de  Dell Utility
  /dev/sda2   *          10        1284    10241437+  83  Linux
  /dev/sda3            1285        1794     4096575   82  Linux swap
  /dev/sda4            1795        9660    63183645    5  Extended
  /dev/sda5            1795        9660    63183613+  83  Linux
  Disk /dev/sdb: 1099.5 GB, 1099529453568 bytes
  255 heads, 63 sectors/track, 133676 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes
  Disk /dev/sdb doesn't contain a valid partition table
  Disk /dev/sdc: 104 MB, 104857600 bytes
  4 heads, 50 sectors/track, 1024 cylinders
  Units = cylinders of 200 * 512 = 102400 bytes
  Disk /dev/sdc doesn't contain a valid partition table
  Disk /dev/sdd: 104 MB, 104857600 bytes
  4 heads, 50 sectors/track, 1024 cylinders
  Units = cylinders of 200 * 512 = 102400 bytes
  Disk /dev/sdd doesn't contain a valid partition table
  Disk /dev/sde: 1099.5 GB, 1099529453568 bytes
  255 heads, 63 sectors/track, 133676 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes
  Disk /dev/sde doesn't contain a valid partition table
  Disk /dev/sdf: 104 MB, 104857600 bytes
  4 heads, 50 sectors/track, 1024 cylinders
  Units = cylinders of 200 * 512 = 102400 bytes
  Disk /dev/sdf doesn't contain a valid partition table
  Disk /dev/sdg: 104 MB, 104857600 bytes
  4 heads, 50 sectors/track, 1024 cylinders
  Units = cylinders of 200 * 512 = 102400 bytes
  Disk /dev/sdg doesn't contain a valid partition table

===== Multipath LUNs =====

We install on the ionode the appropriate package, so ...

  [root@ionode-1 qlafc-linux-8.01.04-3-install]# rpm -q device-mapper
  device-mapper-1.02.02-3.0.RHEL4
  device-mapper-1.02.02-3.0.RHEL4
  1041 up2date --get device-mapper-multipath
  1043 ls /var/spool/up2date/
  => heck where did it go?
  1044 find / -name 'device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm'
  /state/partition1/home/install/ftp.rocksclusters.org/pub/rocks/rocks-4.1.1/rocks-dist\
  /rolls/updates/4.1.1/x86_64/RedHat/RPMS/device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm
  => oh.

Guess OCS patched up2date, so i moved the RPM to /share/apps/filer, and ...

  [root@ionode-1 filer]# rpm -ivh device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm
  Preparing...                ########################################### [100%]
     1:device-mapper-multipath########################################### [100%]

NetApp wants to use the HBA vendor drivers, not the Redhat version, after downloading ...

  [root@ionode-1 filer]# cd qlogic-stuff/
  [root@ionode-1 qlogic-stuff]# cd qlafc-linux-8.01.04-3-install/
  [root@ionode-1 qlafc-linux-8.01.04-3-install]# ./qlinstall
  #*********************************************************#
  #         SANsurfer Driver Installer for Linux            #
  #         Installer Version: 1.01.00pre6                  #
  #*********************************************************#
  Kernel version: 2.6.9-34.ELsmp
  Distribution: Red Hat Enterprise Linux WS release 4 (Nahant Update 3)
  Found QLogic Fibre Channel Adapter in the system
  1. QLE2460
  Installation will begin for following driver
  1. qla2xxx version: v8.01.04
  Preparing...     ##################################################
  qla2xxx          ##################################################
  QLA2XXX -- Building the qla2xxx driver...
  QLA2XXX -- Installing the qla2xxx modules to
  /lib/modules/2.6.9-34.ELsmp/kernel/drivers/scsi/qla2xxx/...
  Setting up QLogic HBA API library...
  Please make sure the /usr/lib/libqlsdm.so file is not in use.
  Installing 32bit api binary for x86_64.
  Installing 64bit api binary for x86_64.
  Done.
  Unloading any loaded drivers
  Unloaded module qla2400
  Loading module qla2xxx_conf version: v8.01.04....
  Loaded module qla2xxx_conf
  Loading module qla2xxx version: v8.01.04....
  Loaded module qla2xxx
  Loading module qla2400 version: v8.01.04....
  Loaded module qla2400
  Installing scli....
  Preparing...     ##################################################
  scli             ##################################################
  Installation completed successfully.
  Building default persistent binding using SCLI
  Info: No devices found on HBA port 0. Skipping target persistent binding configuration.
  Info: No devices found on HBA port 1. Skipping target persistent binding configuration.
  Saved copy of /etc/modprobe.conf as
  /usr/src/qlogic/v8.01.04-3/backup/modprobe.conf-2.6.9-34.ELsmp-050407-142853.bak
  Saved copy of /boot/initrd-2.6.9-34.ELsmp.img as
  /usr/src/qlogic/v8.01.04-3/backup/initrd-2.6.9-34.ELsmp.img-050407-142853.bak
  QLA2XXX -- Rebuilding ramdisk image...
  Ramdisk created.
  Reloading the QLogic FC HBA drivers....
  Unloaded module qla2400
  Loading module qla2xxx_conf version: v8.01.04....
  Loaded module qla2xxx_conf
  Loading module qla2xxx version: v8.01.04....
  Loaded module qla2xxx
  Loading module qla2400 version: v8.01.04....
  Loaded module qla2400
  tee: ql_device_info: Permission denied
  Target Information on all HBAs:
  ==============================
  -----------------------------------------------------------------------------
  HBA Port 0 - QLE2460 Port Name: 21-00-00-E0-8B-93-AA-3A Port ID: 00-00-00
  -----------------------------------------------------------------------------
  Info: The selected adapter has no attached devices (HBA port 0)!
  -----------------------------------------------------------------------------
  HBA Port 1 - QLE2460 Port Name: 21-00-00-E0-8B-93-AC-57 Port ID: 00-00-00
  -----------------------------------------------------------------------------
  Info: The selected adapter has no attached devices (HBA port 1)!
  #*********************************************************#
  #               INSTALLATION SUCCESSFUL!!                 #
  #    SANsurfer Driver installation for Linux completed    #
  #*********************************************************#

Then load ''scli'' and reconfigure HBA cards, then rebuild ... follow the netapp documentation and don't forget to commit changes to the cards.

  [root@ionode-1 qlafc-linux-8.01.04-3-install]# ./qlinstall -l qla2400
  Loading module qla2400 version: v8.01.04....
  Loaded module qla2400
  [root@ionode-1 qlafc-linux-8.01.04-3-install]# ./qlinstall -br -in qla2400
  Saved copy of /etc/modprobe.conf as
  /usr/src/qlogic/v8.01.04-3/backup/modprobe.conf-2.6.9-34.ELsmp-050407-144528.bak
  Saved copy of /boot/initrd-2.6.9-34.ELsmp.img as
  /usr/src/qlogic/v8.01.04-3/backup/initrd-2.6.9-34.ELsmp.img-050407-144528.bak
  QLA2XXX -- Rebuilding ramdisk image...
  Ramdisk created.
  [root@ionode-1 qlafc-linux-8.01.04-3-install]# chkconfig --add multipathd
  [root@ionode-1 qlafc-linux-8.01.04-3-install]# chkconfig multipathd on
  [root@ionode-1 qlafc-linux-8.01.04-3-install]# reboot

After the reboot we have not solved our problem, but we do have 2 functioning HBA cards. Some configuration changes have been applied to the cards. One in particular sets after how long a timeout a request should be routed to the other card (not shown, but detailed in the NetApp documentation, see /share/apps/filer).

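A quick way to confirm that both ports came back online after the reboot is to boil the ''scli -i'' report down to one line per port. This is only a minimal sketch: it assumes the field labels look like the ones in the report on this page (''Port'', ''Port Name'', ''Device Target Count'', ''HBA Status''), and the function name ''summarize_hba_ports'' is made up for illustration.

```shell
#!/bin/bash
# Condense an `scli -i` style report (read from stdin) to one line per port.
# Assumption: each field is printed as "Label : value", as on this page.
summarize_hba_ports() {
    awk -F' *: *' '
        { sub(/[ \t\r]+$/, "") }                 # drop trailing whitespace
        $1 ~ /^ *Port$/                { port = $2 }
        $1 ~ /^ *Port Name$/           { name = $2 }
        $1 ~ /^ *Device Target Count$/ { targets = $2 }
        # HBA Status is the last field of each port section, so print here
        $1 ~ /^ *HBA Status$/ {
            printf "port=%s wwpn=%s targets=%s status=%s\n", port, name, targets, $2
        }
    '
}
```

Typical use would be ''/usr/local/bin/scli -i | summarize_hba_ports''. A port that reports 0 targets is worth a second look at the switch zoning or the filer igroup.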
  [root@ionode-1 ~]# /usr/local/bin/scli -i
  -----------------------------------------------------------------------------
  Host Name : ionode-1.local
  HBA Model : QLE2460
  Port : 0
  Node Name : 20-00-00-E0-8B-93-AA-3A
  Port Name : 21-00-00-E0-8B-93-AA-3A
  Port ID : 61-16-13
  Serial Number : RFC0644M60202
  Driver Version : 8.01.04
  FCode Version : 1.13
  Firmware Version : 4.00.18
  OptionROM BIOS Version : 1.08
  OptionROM FCode Version : 1.13
  OptionROM EFI Version : 1.02
  OptionROM Firmware Version : 4.00.12
  Actual Connection Mode : Point to Point
  Actual Data Rate : 4 Gbps
  PortType (Topology) : FPort
  Device Target Count : 0
  HBA Status : Online
  -----------------------------------------------------------------------------
  Host Name : ionode-1.local
  HBA Model : QLE2460
  Port : 1
  Node Name : 20-00-00-E0-8B-93-AC-57
  Port Name : 21-00-00-E0-8B-93-AC-57
  Port ID : 61-0E-13
  Serial Number : RFC0645M67628
  Driver Version : 8.01.04
  FCode Version : 1.13
  Firmware Version : 4.00.18
  OptionROM BIOS Version : 1.08
  OptionROM FCode Version : 1.13
  OptionROM EFI Version : 1.02
  OptionROM Firmware Version : 4.00.12
  Actual Connection Mode : Point to Point
  Actual Data Rate : 4 Gbps
  PortType (Topology) : FPort
  Device Target Count : 4
  HBA Status : Online
  --------------------------------------------------------------------------

The problem.

  [root@ionode-1 ~]# sanlun lun show all
  filer:   lun-pathname                device filename  adapter  protocol  lun size              lun state
  filer3:  /vol/cluster_home/users     /dev/sdi         host2    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_home/users     /dev/sdg         host2    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_home/users     /dev/sde         host1    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_home/fstarr    /dev/sdh         host2    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_home/fstarr    /dev/sdj         host2    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_home/fstarr    /dev/sdf         host1    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_scratch/lun0   /dev/sdd         host2    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_scratch/lun0   /dev/sdc         host2    FCP       1.0t (1099529453568)  GOOD
  filer3:  /vol/cluster_scratch/lun0   /dev/sdb         host1    FCP       1.0t (1099529453568)  GOOD
  [root@ionode-1 ~]# multipath -v3 -d -ll
  #
  # all paths :
  #
  2:0:2:0 sdb 8:16 [ready] NETAPP /LUN /0.2
  2:0:2:1 sdc 8:32 [ready] NETAPP /LUN /0.2
  2:0:2:2 sdd 8:48 [ready] NETAPP /LUN /0.2
  2:0:3:0 sde 8:64 [ready] NETAPP /LUN /0.2
  2:0:3:1 sdf 8:80 [ready] NETAPP /LUN /0.2
  2:0:3:2 sdg 8:96 [ready] NETAPP /LUN /0.2

Now for some magic. We edit /etc/multipath.conf and ...

  * add a definition for the NetApp device
  * blacklist our local disk sda
  * add a parameter specifying when to switch paths (based on the size of a single I/O operation that blocks waiting for it to complete)
  * and reboot

Now we see:

  params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:112 500 round-robin 0 2 1 8:80 500 8:144 500
  status = 1 0 0 2 1 A 0 1 0 8:112 A 0 E 0 2 0 8:80 A 0 8:144 A 0
  mpath2 (360a9800043346d375a6f41794a597852)
  [size=1024 GB][features="1 queue_if_no_path"][hwhandler="0"]
  \_ round-robin 0 [active]
   \_ 2:0:2:2 sdh 8:112 [active]
  \_ round-robin 0 [enabled]
   \_ 1:0:2:2 sdf 8:80 [active]
   \_ 2:0:3:2 sdj 8:144 [active]
  params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:96 500 round-robin 0 2 1 8:64 500 8:128 500
  status = 1 0 0 2 1 A 0 1 0 8:96 A 0 E 0 2 0 8:64 A 0 8:128 A 0
  mpath1 (360a9800043346d375a6f41794a576176)
  [size=1024 GB][features="1 queue_if_no_path"][hwhandler="0"]
  \_ round-robin 0 [active]
   \_ 2:0:2:1 sdg 8:96 [active]
  \_ round-robin 0 [enabled]
   \_ 1:0:2:1 sde 8:64 [active]
   \_ 2:0:3:1 sdi 8:128 [active]
  params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:32 500 round-robin 0 2 1 8:16 500 8:48 500
  status = 1 133034 0 2 1 E 0 1 0 8:32 F 1194 E 0 2 0 8:16 F 1195 8:48 F 1194
  mpath0 (360a9800043346d375a6f41744c635563)
  [size=1024 GB][features="1 queue_if_no_path"][hwhandler="0"]
  \_ round-robin 0 [enabled]
   \_ 2:0:2:0 sdc 8:32 [failed]
  \_ round-robin 0 [enabled]
   \_ 1:0:2:0 sdb 8:16 [failed]
   \_ 2:0:3:0 sdd 8:48 [failed]

This magic hoopla is achieved because something is creating these devices for us, and the manual states

|''The /dev/mapper devices are persistent across reboots, but the /dev/sdx devices are not. After a reboot, or a restart of the HBA driver, you might find that different /dev/sdx devices make up a given /dev/mapper device. The /dev/mapper device, however, will always correspond to the same LUN.''|

  [root@ionode-1 ~]# ls -l /dev/mapper/
  total 0
  crw-------  1 root root  10, 63 May  5 12:29 control
  brw-rw----  1 root disk 253,  0 May  5 12:29 mpath0
  brw-rw----  1 root disk 253,  1 May  5 12:29 mpath1
  brw-rw----  1 root disk 253,  2 May  5 12:29 mpath2

Eureka. Throw a filesystem on it (no need for a partition table apparently).

  [root@ionode-1 ~]# mkfs -t ext3 /dev/mapper/mpath0
  mke2fs 1.35 (28-Feb-2004)
  Filesystem label=
  OS type: Linux
  Block size=4096 (log=2)
  Fragment size=4096 (log=2)
  134234112 inodes, 268439808 blocks
  13421990 blocks (5.00%) reserved for the super user
  First data block=0
  Maximum filesystem blocks=4294967296
  8193 block groups
  32768 blocks per group, 32768 fragments per group
  16384 inodes per group
  Superblock backups stored on blocks:
          32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
          4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
          102400000, 214990848
  Writing inode tables: done
  Creating journal (8192 blocks): done
  Writing superblocks and filesystem accounting information: done
  This filesystem will be automatically checked every 20 mounts or
  180 days, whichever comes first. Use tune2fs -c or -i to override.

And now we can mount and export to other nodes via /etc/fstab and /etc/exports.

  /dev/mapper/mpath0 /sanscratch ext3 defaults 0 0
  /sanscratch 10.3.1.0/255.255.255.0(rw,sync,no_root_squash)

There are some linux tuneable parameters, look at the header in ionode-1:/etc/multipath.conf.

Yea. Done. And now we have multiple paths from the ionode to the filers (how they work that out is beyond me) and two HBA cards that will also take over from each other.

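Since mpath0 shows all three of its paths as [failed] in the multipath listing earlier on this page, it can help to condense that kind of output to one line per map before trusting a device. This is a minimal sketch, assuming the ''multipath -ll'' layout shown on this page (a ''mpathN (wwid)'' header line followed by ''\_ h:c:t:l sdX major:minor [state]'' path lines); the function name ''summarize_multipath'' is made up for illustration.

```shell
#!/bin/bash
# Summarise `multipath -ll` style output read from stdin: one line per map
# with its number of paths and how many of them are marked [failed].
summarize_multipath() {
    awk '
        # A map header looks like: mpath0 (360a98...)
        /^mpath[0-9]+ / { map = $1; order[++n] = map; next }
        # A path line looks like: \_ 2:0:2:0 sdc 8:32 [failed]
        # (path group lines such as "\_ round-robin 0 [active]" do not match)
        map != "" && /\\_ [0-9]+:[0-9]+:[0-9]+:[0-9]+ / {
            total[map]++
            if ($0 ~ /\[failed\]/) failed[map]++
        }
        END {
            for (i = 1; i <= n; i++) {
                m = order[i]
                printf "%s paths=%d failed=%d\n", m, total[m], failed[m]
            }
        }
    '
}
```

Typical use would be ''multipath -ll | summarize_multipath''. A map where ''failed'' equals ''paths'' has no working route to its LUN and, with ''queue_if_no_path'' set, I/O to it will queue rather than return an error.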
To rediscover or discover new LUNs:

  /opt/netapp/santools/qla2xxx_lun_rescan all

Make sure the HBA cards have logged into the filer; ssh into filer3:

  filer3> igroup show
      swallowtail (FCP) (ostype: linux):
          21:00:00:e0:8b:93:ac:57 (logged in on: vtic, 0c)
          21:00:00:e0:8b:93:aa:3a (logged in on: vtic, 0d)

 --- //[[hmeij@wesleyan.edu|Meij, Henk]] 2008/03/17 14:42//

The bindings (mappings) are stored here: ''/var/lib/multipath/bindings''

  # Multipath bindings, Version : 1.0
  # NOTE: this file is automatically maintained by the multipath program.
  # You should not need to edit this file in normal circumstances.
  #
  # Format:
  # alias wwid
  #
  mpath0 360a9800043346d375a6f41794a597852
  mpath1 360a9800043346d375a6f4237427a316d
  mpath2 360a9800043346d375a6f423743307771
  mpath3 360a9800043346d375a6f423743335673
  mpath4 360a9800043346d375a6f423743375578
  mpath5 360a9800043346d375a6f423743394379
  mpath6 360a9800043346d375a6f423837674970
  mpath7 360a9800043346d375a6f4238394f516c
  mpath8 360a9800043346d375a6f423839536b52
  mpath9 360a9800043346d375a6f423838394b71
  mpath10 360a9800043346d375a6f424566792f32
  mpath11 360a9800043346d375a6f41794a576176

===== Oh baby =====

What next? Oh yea ...

  * perform extensive read/writes and fsck
  * disable an HBA card and observe while reading/writing
  * disable a filer (filer3!) and observe if filer4 takes over (without having the raid group local!)

Monday is "play at work" day.

\\ **[[cluster:0|Home]]**