cluster:36

cluster:36 [2008/03/17 14:44] (current)
 +\\
 +**[[cluster:​0|Home]]**
  
 +//ok, so this story begins with ... i thought i had met my inability to comprehend new technology when i was shown that disks can run multiple raid levels simultaneously. but this multipathing eclipses that. just weird, therefore worth describing.//​
 + --- //​[[hmeij@wesleyan.edu|Henk Meij]] 2007/05/05 14:14//
 +
 +===== The Problem =====
 +
 + Our new NetApp FAS 3050c device [[http://www.pcquest.com/content/web/2006/106022211.asp|Read About It]] comes with some new setups.  Although we had encountered similar problems before, back then we were able to fix them by applying a label to the partition's filesystem.
 +
 +Well, this is totally different.  Again, i got started on this since gracious Dell installed not one but two HBA cards into the ionode. They must be giving us hidden signals.  So the HBA cards were connected via fiber cable to network switches #3 and #4, and you guessed it, filer3 and filer4 are connected to those switches respectively.
 +
 +What's new is that both filers share the same [[http://en.wikipedia.org/wiki/WWPN|WWPN]] (World Wide Port Name, so wikipedia isn't always useful ...).  That means when a Linux client sends a query to the filer, both filers answer.
 +
 +For example, i created 3 LUNs on a 26*500 GB disk raid group attached to filer3 only! sda is the local disk.  sdb, sdc and sdd are my LUNs, and so are sde, sdf and sdg: the same three LUNs showing up a second time.
 +
 +  * That's good because we can use this multipath to route traffic via two fiber channels to the filers.
 +  * That's bad because we may end up with corruption and confusion. ​
 +
 +<​code>​
 +[root@ionode-1 ~]# fdisk -l
 +
 +Disk /dev/sda: 79.4 GB, 79456894976 bytes
 +255 heads, 63 sectors/​track,​ 9660 cylinders
 +Units = cylinders of 16065 * 512 = 8225280 bytes
 +
 +   ​Device Boot      Start         ​End ​     Blocks ​  ​Id ​ System
 +/​dev/​sda1 ​              ​1 ​          ​9 ​      ​72261 ​  ​de ​ Dell Utility
 +/​dev/​sda2 ​  ​* ​         10        1284    10241437+ ​ 83  Linux
 +/​dev/​sda3 ​           1285        1794     ​4096575 ​  ​82 ​ Linux swap
 +/​dev/​sda4 ​           1795        9660    63183645 ​   5  Extended
 +/​dev/​sda5 ​           1795        9660    63183613+ ​ 83  Linux
 +
 +Disk /dev/sdb: 1099.5 GB, 1099529453568 bytes
 +255 heads, 63 sectors/​track,​ 133676 cylinders
 +Units = cylinders of 16065 * 512 = 8225280 bytes
 +
 +Disk /dev/sdb doesn'​t contain a valid partition table
 +
 +Disk /dev/sdc: 104 MB, 104857600 bytes
 +4 heads, 50 sectors/​track,​ 1024 cylinders
 +Units = cylinders of 200 * 512 = 102400 bytes
 +
 +Disk /dev/sdc doesn'​t contain a valid partition table
 +
 +Disk /dev/sdd: 104 MB, 104857600 bytes
 +4 heads, 50 sectors/​track,​ 1024 cylinders
 +Units = cylinders of 200 * 512 = 102400 bytes
 +
 +Disk /dev/sdd doesn'​t contain a valid partition table
 +
 +Disk /dev/sde: 1099.5 GB, 1099529453568 bytes
 +255 heads, 63 sectors/​track,​ 133676 cylinders
 +Units = cylinders of 16065 * 512 = 8225280 bytes
 +
 +Disk /dev/sde doesn'​t contain a valid partition table
 +
 +Disk /dev/sdf: 104 MB, 104857600 bytes
 +4 heads, 50 sectors/​track,​ 1024 cylinders
 +Units = cylinders of 200 * 512 = 102400 bytes
 +
 +Disk /dev/sdf doesn'​t contain a valid partition table
 +
 +Disk /dev/sdg: 104 MB, 104857600 bytes
 +4 heads, 50 sectors/​track,​ 1024 cylinders
 +Units = cylinders of 200 * 512 = 102400 bytes
 +
 +Disk /dev/sdg doesn'​t contain a valid partition table
 +</​code>​
 +
 +
 +
 +
 +
 +
 +===== Multipath LUNs =====
 +
 +We install on the ionode the appropriate package, so ...
 +
 +<​code>​
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ rpm -q device-mapper
 +device-mapper-1.02.02-3.0.RHEL4
 +device-mapper-1.02.02-3.0.RHEL4
 +</​code>​
 +
 +<​code>​
 +up2date --get device-mapper-multipath
 +ls /var/spool/up2date/
 +# => heck, where did it go?
 +find / -name 'device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm'
 +/state/partition1/home/install/ftp.rocksclusters.org/pub/rocks/rocks-4.1.1/rocks-dist\
 +/rolls/updates/4.1.1/x86_64/RedHat/RPMS/device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm
 +# => oh.
 +</​code>​
 +
 +Guess OCS patched up2date, so i moved the RPM to /​share/​apps/​filer,​ and ...
 +
 +<​code>​
 +[root@ionode-1 filer]# rpm -ivh device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm
 +Preparing... ​               ###########################################​ [100%]
 +   ​1:​device-mapper-multipath###########################################​ [100%]
 +</​code>​
 +
 +NetApp wants us to use the HBA vendor's drivers, not the Red Hat version.  After downloading them ...
 +
 +<​code>​
 +[root@ionode-1 filer]# cd qlogic-stuff/​
 +[root@ionode-1 qlogic-stuff]#​ cd qlafc-linux-8.01.04-3-install/​
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ ./qlinstall
 +
 +#​*********************************************************#​
 +#           ​SANsurfer Driver Installer for Linux          #
 +#               ​Installer Version: ​ 1.01.00pre6 ​          #
 +#​*********************************************************#​
 +
 +Kernel version: 2.6.9-34.ELsmp
 +Distribution:​ Red Hat Enterprise Linux WS release 4 (Nahant Update 3)
 +
 +Found QLogic Fibre Channel Adapter in the system
 +    1. QLE2460
 +Installation will begin for following driver
 +    1. qla2xxx version: v8.01.04
 +
 +Preparing... ​               ##################################################​
 +qla2xxx ​                    ##################################################​
 +
 +QLA2XXX -- Building the qla2xxx driver...
 +
 +QLA2XXX -- Installing the qla2xxx modules to
 +/​lib/​modules/​2.6.9-34.ELsmp/​kernel/​drivers/​scsi/​qla2xxx/​...
 +
 +Setting up QLogic HBA API library...
 +Please make sure the /​usr/​lib/​libqlsdm.so file is not in use.
 +Installing 32bit api binary for x86_64.
 +Installing 64bit api binary for x86_64.
 +Done.
 +
 +Unloading any loaded drivers
 +Unloaded module qla2400
 +Loading module qla2xxx_conf version: v8.01.04....
 +Loaded module qla2xxx_conf
 +Loading module qla2xxx version: v8.01.04....
 +Loaded module qla2xxx
 +Loading module qla2400 version: v8.01.04....
 +Loaded module qla2400
 +Installing scli....
 +Preparing... ​               ##################################################​
 +scli                        ##################################################​
 +Installation completed successfully.
 +
 +Building default persistent binding using SCLI
 +Info: No devices found on HBA port 0. Skipping target persistent
 +binding configuration.
 +Info: No devices found on HBA port 1. Skipping target persistent
 +binding configuration.
 +
 +Saved copy of /​etc/​modprobe.conf as
 +/​usr/​src/​qlogic/​v8.01.04-3/​backup/​modprobe.conf-2.6.9-34.ELsmp-050407-142853.bak
 +
 +Saved copy of /​boot/​initrd-2.6.9-34.ELsmp.img as
 +/​usr/​src/​qlogic/​v8.01.04-3/​backup/​initrd-2.6.9-34.ELsmp.img-050407-142853.bak
 +
 +QLA2XXX -- Rebuilding ramdisk image...
 +Ramdisk created.
 +
 +Reloading the QLogic FC HBA drivers....
 +Unloaded module qla2400
 +Loading module qla2xxx_conf version: v8.01.04....
 +Loaded module qla2xxx_conf
 +Loading module qla2xxx version: v8.01.04....
 +Loaded module qla2xxx
 +Loading module qla2400 version: v8.01.04....
 +Loaded module qla2400
 +tee: ql_device_info:​ Permission denied
 +
 +Target Information on all HBAs:
 +==============================
 +-----------------------------------------------------------------------------
 +HBA Port 0 - QLE2460 ​ Port Name: 21-00-00-E0-8B-93-AA-3A Port ID: 00-00-00
 +-----------------------------------------------------------------------------
 +Info: The selected adapter has no attached devices (HBA port 0)!
 +-----------------------------------------------------------------------------
 +HBA Port 1 - QLE2460 ​ Port Name: 21-00-00-E0-8B-93-AC-57 Port ID: 00-00-00
 +-----------------------------------------------------------------------------
 +Info: The selected adapter has no attached devices (HBA port 1)!
 +
 +#​*********************************************************#​
 +#               ​INSTALLATION SUCCESSFUL!! ​                #
 +#    SANsurfer Driver installation for Linux completed ​   #
 +#​*********************************************************#​
 +
 +</​code>​
 +
 +Then load ''scli'' and reconfigure the HBA cards, then rebuild ... <hi #ffff00>follow the NetApp documentation and don't forget to commit the changes to the cards.</hi>
 +
 +<​code>​
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ ./qlinstall -l qla2400
 +Loading module qla2400 version: v8.01.04....
 +Loaded module qla2400
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ ./qlinstall -br -in qla2400
 +
 +Saved copy of /​etc/​modprobe.conf as
 +/​usr/​src/​qlogic/​v8.01.04-3/​backup/​modprobe.conf-2.6.9-34.ELsmp-050407-144528.bak
 +
 +Saved copy of /​boot/​initrd-2.6.9-34.ELsmp.img as
 +/​usr/​src/​qlogic/​v8.01.04-3/​backup/​initrd-2.6.9-34.ELsmp.img-050407-144528.bak
 +
 +QLA2XXX -- Rebuilding ramdisk image...
 +Ramdisk created.
 +
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ chkconfig --add multipathd
 +
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ chkconfig multipathd on
 +
 +[root@ionode-1 qlafc-linux-8.01.04-3-install]#​ reboot
 +</​code>​
 +
 +After the reboot we have not solved our problem, but we do have 2 functioning HBA cards.  Some configuration changes have been applied to the cards.  One in particular sets how long a request may time out before it is routed to the other card (not shown, but detailed in the NetApp documentation, see /share/apps/filer).
 +
 +<​code>​
 +[root@ionode-1 ~]# /​usr/​local/​bin/​scli -i
 +-----------------------------------------------------------------------------
 +Host Name                  : ionode-1.local
 +HBA Model                  : QLE2460
 +Port                       : 0
 +Node Name                  : 20-00-00-E0-8B-93-AA-3A
 +Port Name                  : 21-00-00-E0-8B-93-AA-3A
 +Port ID                    : 61-16-13
 +Serial Number ​             : RFC0644M60202
 +Driver Version ​            : 8.01.04
 +FCode Version ​             : 1.13
 +Firmware Version ​          : 4.00.18
 +OptionROM BIOS Version ​    : 1.08
 +OptionROM FCode Version ​   : 1.13
 +OptionROM EFI Version ​     : 1.02
 +OptionROM Firmware Version : 4.00.12
 +Actual Connection Mode     : Point to Point
 +Actual Data Rate           : 4 Gbps
 +PortType (Topology) ​       : FPort
 +Device Target Count        : 0
 +HBA Status ​                : Online
 +-----------------------------------------------------------------------------
 +Host Name                  : ionode-1.local
 +HBA Model                  : QLE2460
 +Port                       : 1
 +Node Name                  : 20-00-00-E0-8B-93-AC-57
 +Port Name                  : 21-00-00-E0-8B-93-AC-57
 +Port ID                    : 61-0E-13
 +Serial Number ​             : RFC0645M67628
 +Driver Version ​            : 8.01.04
 +FCode Version ​             : 1.13
 +Firmware Version ​          : 4.00.18
 +OptionROM BIOS Version ​    : 1.08
 +OptionROM FCode Version ​   : 1.13
 +OptionROM EFI Version ​     : 1.02
 +OptionROM Firmware Version : 4.00.12
 +Actual Connection Mode     : Point to Point
 +Actual Data Rate           : 4 Gbps
 +PortType (Topology) ​       : FPort
 +Device Target Count        : 4
 +HBA Status ​                : Online
 +--------------------------------------------------------------------------
 +</​code>​
 +
 +The problem.
 +
 +<​code>​
 +[root@ionode-1 ~]# sanlun lun show all
 +  filer: ​         lun-pathname ​       device filename ​ adapter ​ protocol ​         lun size         lun state
 +   ​filer3: ​ /​vol/​cluster_home/​users ​      /​dev/​sdi ​        ​host2 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_home/​users ​      /​dev/​sdg ​        ​host2 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_home/​users ​      /​dev/​sde ​        ​host1 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_home/​fstarr ​     /​dev/​sdh ​        ​host2 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_home/​fstarr ​     /​dev/​sdj ​        ​host2 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_home/​fstarr ​     /​dev/​sdf ​        ​host1 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_scratch/​lun0 ​    /​dev/​sdd ​        ​host2 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_scratch/​lun0 ​    /​dev/​sdc ​        ​host2 ​   FCP          1.0t (1099529453568) ​ GOOD
 +   ​filer3: ​ /​vol/​cluster_scratch/​lun0 ​    /​dev/​sdb ​        ​host1 ​   FCP          1.0t (1099529453568) ​ GOOD
 +
 +
 +
 +[root@ionode-1 ~]# multipath -v3 -d -ll
 +#
 +# all paths :
 +#
 +  2:0:2:0 sdb 8:16  [ready] NETAPP ​ /LUN             /0.2
 +  2:0:2:1 sdc 8:32  [ready] NETAPP ​ /LUN             /0.2
 +  2:0:2:2 sdd 8:48  [ready] NETAPP ​ /LUN             /0.2
 +  2:0:3:0 sde 8:64  [ready] NETAPP ​ /LUN             /0.2
 +  2:0:3:1 sdf 8:80  [ready] NETAPP ​ /LUN             /0.2
 +  2:0:3:2 sdg 8:96  [ready] NETAPP ​ /LUN             /0.2
 +</​code>​
 +
 +Now for some magic. ​ We edit /​etc/​multipath.conf and ... 
 +
 +  * add a definition for the NetApp device
 +  * blacklist our local disk sda
 +  * add a parameter specifying when to switch paths (based on the size of a single I/O operation that blocks while waiting for it to complete)
 +  * and reboot
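The real ionode-1:/etc/multipath.conf is not reproduced on this page, so the following is only a guess at what the first three steps might look like. Everything in it is an assumption: the vendor and product strings are taken from the ''multipath -v3'' listing, the ''features'' and ''hardware_handler'' values from the multipath output further down, and ''rr_min_io 500'' is inferred from the 500 that follows each path in the ''params'' lines (whether that is the parameter meant in the list above is not certain). The ''devnode_blacklist'' section name is the one used by multipath-tools of that era; later releases call it ''blacklist''. The sketch writes to a scratch file, not to /etc.

```shell
# Write an illustrative multipath.conf to a scratch file (NOT to /etc).
conf=${TMPDIR:-/tmp}/multipath.conf.sketch
cat > "$conf" <<'EOF'
# Illustrative only; not the configuration actually used on ionode-1.

# Keep the local disk out of multipath.
devnode_blacklist {
    devnode "^sda$"
}

# Device definition for the NetApp LUNs.
devices {
    device {
        vendor           "NETAPP"
        product          "LUN"
        features         "1 queue_if_no_path"
        hardware_handler "0"
        # Number of I/Os sent down one path before switching to the next.
        rr_min_io        500
    }
}
EOF
echo "wrote $conf"
```

The NetApp host utilities documentation lists further per-device settings (path grouping, priority callouts, failback) that are deliberately left out here because their exact values are not recoverable from this page.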
 +
 +Now we see:
 +<​code>​
 +params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:112 500 round-robin 0 2 1 8:80 500 8:144 500
 +status = 1 0 0 2 1 A 0 1 0 8:112 A 0 E 0 2 0 8:80 A 0 8:144 A 0
 +mpath2 (360a9800043346d375a6f41794a597852)
 +[size=1024 GB][features="​1 queue_if_no_path"​][hwhandler="​0"​]
 +\_ round-robin 0 [active]
 + \_ 2:0:2:2 sdh 8:112  [active]
 +\_ round-robin 0 [enabled]
 + \_ 1:0:2:2 sdf 8:80   ​[active]
 + \_ 2:0:3:2 sdj 8:144  [active]
 +
 +params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:96 500 round-robin 0 2 1 8:64 500 8:128 500
 +status = 1 0 0 2 1 A 0 1 0 8:96 A 0 E 0 2 0 8:64 A 0 8:128 A 0
 +mpath1 (360a9800043346d375a6f41794a576176)
 +[size=1024 GB][features="​1 queue_if_no_path"​][hwhandler="​0"​]
 +\_ round-robin 0 [active]
 + \_ 2:0:2:1 sdg 8:96   ​[active]
 +\_ round-robin 0 [enabled]
 + \_ 1:0:2:1 sde 8:64   ​[active]
 + \_ 2:0:3:1 sdi 8:128  [active]
 +
 +params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:32 500 round-robin 0 2 1 8:16 500 8:48 500
 +status = 1 133034 0 2 1 E 0 1 0 8:32 F 1194 E 0 2 0 8:16 F 1195 8:48 F 1194
 +mpath0 (360a9800043346d375a6f41744c635563)
 +[size=1024 GB][features="​1 queue_if_no_path"​][hwhandler="​0"​]
 +\_ round-robin 0 [enabled]
 + \_ 2:0:2:0 sdc 8:32   ​[failed]
 +\_ round-robin 0 [enabled]
 + \_ 1:0:2:0 sdb 8:16   ​[failed]
 + \_ 2:0:3:0 sdd 8:48   ​[failed]
 +</​code>​
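Note that in the listing above all three paths of mpath0 are marked ''[failed]'' while mpath1 and mpath2 are healthy. A quick way to spot this without reading the whole listing is to count failed paths per map. The helper below is a sketch: ''failed_paths'' is a made-up name, and it assumes the output layout shown above (a map line starting with ''mpathN'' followed by indented path lines).

```shell
# failed_paths (hypothetical helper): read `multipath -ll` style output on
# stdin and print "<map> <count>" for every map with failed paths.
failed_paths() {
    awk '
        /^mpath[0-9]+ /           { map = $1 }      # remember current map
        /\[failed\]/ && map != "" { failed[map]++ } # count its failed paths
        END { for (m in failed) print m, failed[m] }
    ' | sort
}
```

On the ionode it would be used as ''multipath -ll | failed_paths''; no output means no failed paths were reported.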
 +
 +
 +This magic hoopla is achieved because something is creating these devices for us, and the manual states ​
 +
 +|''​The /dev/mapper devices are persistent across reboots, but the /dev/sdx
 +devices are not. After a reboot, or a restart of the HBA driver, you might find that
 +different /dev/sdx devices make up a given /dev/mapper device. The /dev/mapper
 +device, however, will always correspond to the same LUN.''​|
 +
 +<​code>​
 +[root@ionode-1 ~]# ls -l /​dev/​mapper/​
 +total 0
 +crw------- ​ 1 root root  10, 63 May  5 12:29 control
 +brw-rw---- ​ 1 root disk 253,  0 May  5 12:29 mpath0
 +brw-rw---- ​ 1 root disk 253,  1 May  5 12:29 mpath1
 +brw-rw---- ​ 1 root disk 253,  2 May  5 12:29 mpath2
 +</​code>​
 +
 +Eureka.
 +
 +Throw a filesystem on it (no need for a partition table apparently).
 +
 +<​code>​
 +[root@ionode-1 ~]# mkfs -t ext3 /​dev/​mapper/​mpath0
 +mke2fs 1.35 (28-Feb-2004)
 +Filesystem label=
 +OS type: Linux
 +Block size=4096 (log=2)
 +Fragment size=4096 (log=2)
 +134234112 inodes, 268439808 blocks
 +13421990 blocks (5.00%) reserved for the super user
 +First data block=0
 +Maximum filesystem blocks=4294967296
 +8193 block groups
 +32768 blocks per group, 32768 fragments per group
 +16384 inodes per group
 +Superblock backups stored on blocks:
 +        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
 +        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
 +        102400000, 214990848
 +
 +Writing inode tables: done
 +Creating journal (8192 blocks): done
 +Writing superblocks and filesystem accounting information:​ done
 +
 +This filesystem will be automatically checked every 20 mounts or
 +180 days, whichever comes first. ​ Use tune2fs -c or -i to override.
 +</​code>​
 +
 +And now we can mount it and export it to other nodes via /etc/fstab and /etc/exports.
 +
 +<​code>​
 +/​dev/​mapper/​mpath0 ​    /​sanscratch ​   ext3   ​defaults ​  ​0 ​  0
 +</​code>​
 +
 +<​code>​
 +/​sanscratch ​    ​10.3.1.0/​255.255.255.0(rw,​sync,​no_root_squash)
 +</​code>​
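Because the manual warns that /dev/sdx names can change across reboots, it may be worth checking that nothing in /etc/fstab mounts a LUN through a raw /dev/sdX name. A minimal sketch, with ''fstab_sd_entries'' as a made-up helper name:

```shell
# fstab_sd_entries (hypothetical helper): read fstab-formatted lines on
# stdin and print "device mountpoint" for every entry that uses a raw
# /dev/sdX name. Commented-out lines are skipped.
fstab_sd_entries() {
    awk '$1 !~ /^#/ && $1 ~ /^\/dev\/sd[a-z]+[0-9]*$/ { print $1, $2 }'
}
```

Run as ''fstab_sd_entries < /etc/fstab''. Entries for the local disk sda will show up too and are fine; anything pointing at a LUN should use its /dev/mapper name instead.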
 +
 +There are some tunable Linux parameters; look at the header of ionode-1:/etc/multipath.conf.
 +
 +Yea. Done.
 +
 +And now we have multiple paths from the ionode to the filers (how they work that out is beyond me) and two HBA cards that will also take over for each other.
 +
 +To rediscover LUNs or discover new ones:
 +
 +<​code>​
 +/​opt/​netapp/​santools/​qla2xxx_lun_rescan all
 +</​code>​
 +
 +To make sure the HBA cards have logged into the filer, ssh into filer3:
 +
 +<​code>​
 +filer3> igroup show
 +    swallowtail (FCP) (ostype: linux):
 +        21:​00:​00:​e0:​8b:​93:​ac:​57 (logged in on: vtic, 0c)
 +        21:​00:​00:​e0:​8b:​93:​aa:​3a (logged in on: vtic, 0d)
 +</​code>​
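The port names reported by ''scli -i'' use dashes and upper case (21-00-00-E0-8B-93-AA-3A) while ''igroup show'' uses colons and lower case, so comparing them by eye is error-prone. The sketch below converts one form to the other and checks a saved copy of the igroup output; both function names are made up for illustration.

```shell
# normalize_wwpn (hypothetical helper): convert an scli-style port name
# such as 21-00-00-E0-8B-93-AA-3A to the igroup style 21:00:00:e0:8b:93:aa:3a.
normalize_wwpn() {
    printf '%s\n' "$1" | tr 'A-F-' 'a-f:'
}

# wwpn_logged_in (hypothetical helper): succeed if the port name given as $1
# (scli style) appears as "logged in" in `igroup show` output read on stdin.
wwpn_logged_in() {
    want=$(normalize_wwpn "$1")
    grep -iF -- "$want (logged in" >/dev/null
}
```

For example, with the filer output saved to a file: ''wwpn_logged_in 21-00-00-E0-8B-93-AC-57 < igroup.txt && echo ok''.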
 +
 +
 + --- //​[[hmeij@wesleyan.edu|Meij,​ Henk]] 2008/03/17 14:42//
 +
 +The bindings (mappings) are stored in ''/var/lib/multipath/bindings''
 +
 +<​code>​
 +
 +# Multipath bindings, Version : 1.0
 +# NOTE: this file is automatically maintained by the multipath program.
 +# You should not need to edit this file in normal circumstances.
 +#
 +# Format:
 +# alias wwid
 +#
 +mpath0 360a9800043346d375a6f41794a597852
 +mpath1 360a9800043346d375a6f4237427a316d
 +mpath2 360a9800043346d375a6f423743307771
 +mpath3 360a9800043346d375a6f423743335673
 +mpath4 360a9800043346d375a6f423743375578
 +mpath5 360a9800043346d375a6f423743394379
 +mpath6 360a9800043346d375a6f423837674970
 +mpath7 360a9800043346d375a6f4238394f516c
 +mpath8 360a9800043346d375a6f423839536b52
 +mpath9 360a9800043346d375a6f423838394b71
 +mpath10 360a9800043346d375a6f424566792f32
 +mpath11 360a9800043346d375a6f41794a576176
 +
 +</​code>​
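The alias numbers in this 2008 listing no longer match the 2007 multipath output earlier on the page (WWID 360a9800043346d375a6f41794a576176 is mpath1 there and mpath11 here), so it is safer to look an alias up by WWID than to remember it. A small sketch, with ''alias_for_wwid'' as a made-up helper name:

```shell
# alias_for_wwid (hypothetical helper): print the mpath alias bound to a WWID.
#   $1 = WWID
#   $2 = bindings file (defaults to the path named above)
alias_for_wwid() {
    awk -v w="$1" '$1 !~ /^#/ && $2 == w { print $1 }' \
        "${2:-/var/lib/multipath/bindings}"
}
```

The device node for the returned alias is then ''/dev/mapper/<alias>''.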
 +
 +===== Oh baby =====
 +
 +What next?  Oh yea ...
 +
 +  * perform extensive read/writes and fsck
 +  * disable an HBA card and observe while reading/​writing
 +  * disable a filer (filer3!) and observe if filer4 takes over (without having the raid group local!)
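For the read/write part of this list, a very small starting point could be to copy random data onto the mounted LUN and compare checksums, repeating it in a loop while an HBA card or a filer is disabled. This is a sketch under assumptions: ''write_and_verify'' is a made-up helper, and /sanscratch is the mount point from the fstab line above.

```shell
# write_and_verify (hypothetical helper): write random data into a directory
# and check that it reads back with the same checksum.
#   $1 = target directory (e.g. the mounted LUN)
#   $2 = size in MB (default 8)
write_and_verify() {
    target=$1
    mb=${2:-8}
    src=$(mktemp) || return 1
    dst="$target/verify.$$"
    dd if=/dev/urandom of="$src" bs=1048576 count="$mb" 2>/dev/null || return 1
    cp "$src" "$dst" && sync
    a=$(cksum < "$src")
    # Caveat: this read may be served from the page cache rather than the
    # LUN; use files larger than RAM or remount for a stricter test.
    b=$(cksum < "$dst")
    rm -f "$src" "$dst"
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage could be as simple as ''while write_and_verify /sanscratch 64; do echo ok; done'', watching for the loop to stall or stop when a path is pulled.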
 +
 +Monday is "play at work" day.
 +
 +
 +
 +
 +
 +
 +
cluster/36.txt · Last modified: 2008/03/17 14:44 (external edit)