\\
**[[cluster:0|Home]]**
//Ok, so this story begins with ... I thought I had hit the limit of my ability to comprehend new technology when I was shown that disks can run multiple RAID levels simultaneously. But this multipathing eclipses that. Just weird, and therefore worth describing.//
--- //[[hmeij@wesleyan.edu|Henk Meij]] 2007/05/05 14:14//
===== The Problem =====
Our new NetApp FAS 3050c device [[http://www.pcquest.com/content/web/2006/106022211.asp|Read About It]] introduces some new setups. We have encountered device-naming problems before, but back then we were able to fix them by applying a label to the partition's filesystem.
Well, this is totally different. Again, I got started on this because gracious Dell installed not one but two HBA cards in the ionode. They must be giving us hidden signals. So the HBA cards were connected via fiber cable to switches #3 and #4, and, you guessed it, filer3 and filer4 are connected to those switches respectively.
What's new is that both filers share the same [[http://en.wikipedia.org/wiki/WWPN|WWPN]] (World Wide Port Name, so wikipedia isn't always useful ...). That means that when a Linux client sends a query to the filer, both answer.
For example, I created 3 LUNs on a 26*500 GB disk RAID group attached to filer3 only! ''sda'' is the local disk. ''sdb'', ''sdc'' and ''sdd'' are my LUNs ... and so are ''sde'', ''sdf'' and ''sdg''.
* That's good because we can use this multipath to route traffic via two fiber channels to the filers.
* That's bad because we may end up with corruption and confusion.
<code>
[root@ionode-1 ~]# fdisk -l

Disk /dev/sda: 79.4 GB, 79456894976 bytes
255 heads, 63 sectors/track, 9660 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1         9     72261   de  Dell Utility
/dev/sda2   *        10      1284  10241437+  83  Linux
/dev/sda3          1285      1794   4096575   82  Linux swap
/dev/sda4          1795      9660  63183645    5  Extended
/dev/sda5          1795      9660  63183613+  83  Linux

Disk /dev/sdb: 1099.5 GB, 1099529453568 bytes
255 heads, 63 sectors/track, 133676 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 1099.5 GB, 1099529453568 bytes
255 heads, 63 sectors/track, 133676 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes

Disk /dev/sdg doesn't contain a valid partition table
</code>
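The duplication above can be made explicit by grouping device names by their SCSI WWID (on a host like this, a tool such as ''scsi_id'' reports the WWID per device). A minimal illustrative sketch in Python — the device-to-WWID pairs below are made-up placeholders, not values from the actual host:

```python
# Group SCSI device names by WWID to expose duplicate paths to the same LUN.
# The WWIDs here are hypothetical placeholders; on the real host they would
# come from querying each /dev/sdX device.
from collections import defaultdict

def group_by_wwid(devices):
    """Map each WWID to the list of device names that share it."""
    groups = defaultdict(list)
    for name, wwid in devices:
        groups[wwid].append(name)
    return dict(groups)

devices = [
    ("sdb", "wwid-lun0"), ("sdc", "wwid-lun1"), ("sdd", "wwid-lun2"),
    ("sde", "wwid-lun0"), ("sdf", "wwid-lun1"), ("sdg", "wwid-lun2"),
]

for wwid, names in group_by_wwid(devices).items():
    print(wwid, "->", names)  # each LUN shows up once per path
```

Any WWID that maps to more than one device name is one LUN reached over multiple paths — exactly the situation device-mapper-multipath exists to manage.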
===== Multipath LUNs =====
We install the appropriate packages on the ionode, so ...
<code>
[root@ionode-1 qlafc-linux-8.01.04-3-install]# rpm -q device-mapper
device-mapper-1.02.02-3.0.RHEL4
device-mapper-1.02.02-3.0.RHEL4

 1041  up2date --get device-mapper-multipath
 1043  ls /var/spool/up2date/

=> heck, where did it go?

 1044  find / -name 'device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm'
/state/partition1/home/install/ftp.rocksclusters.org/pub/rocks/rocks-4.1.1/rocks-dist\
/rolls/updates/4.1.1/x86_64/RedHat/RPMS/device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm

=> oh.
</code>
Guess OCS patched up2date, so I moved the RPM to /share/apps/filer, and ...
<code>
[root@ionode-1 filer]# rpm -ivh device-mapper-multipath-0.4.5-21.RHEL4.x86_64.rpm
Preparing...                ########################################### [100%]
   1:device-mapper-multipath########################################### [100%]
</code>
NetApp wants us to use the HBA vendor's drivers, not the Red Hat version. After downloading ...
<code>
[root@ionode-1 filer]# cd qlogic-stuff/
[root@ionode-1 qlogic-stuff]# cd qlafc-linux-8.01.04-3-install/
[root@ionode-1 qlafc-linux-8.01.04-3-install]# ./qlinstall

#*********************************************************#
#         SANsurfer Driver Installer for Linux            #
#              Installer Version: 1.01.00pre6             #
#*********************************************************#

Kernel version: 2.6.9-34.ELsmp
Distribution: Red Hat Enterprise Linux WS release 4 (Nahant Update 3)

Found QLogic Fibre Channel Adapter in the system
1. QLE2460

Installation will begin for following driver
1. qla2xxx version: v8.01.04

Preparing...               ##################################################
qla2xxx                    ##################################################
QLA2XXX -- Building the qla2xxx driver...
QLA2XXX -- Installing the qla2xxx modules to
/lib/modules/2.6.9-34.ELsmp/kernel/drivers/scsi/qla2xxx/...

Setting up QLogic HBA API library...
Please make sure the /usr/lib/libqlsdm.so file is not in use.
Installing 32bit api binary for x86_64.
Installing 64bit api binary for x86_64.
Done.

Unloading any loaded drivers
Unloaded module qla2400
Loading module qla2xxx_conf version: v8.01.04....
Loaded module qla2xxx_conf
Loading module qla2xxx version: v8.01.04....
Loaded module qla2xxx
Loading module qla2400 version: v8.01.04....
Loaded module qla2400

Installing scli....
Preparing...               ##################################################
scli                       ##################################################
Installation completed successfully.

Building default persistent binding using SCLI
Info: No devices found on HBA port 0. Skipping target persistent
binding configuration.
Info: No devices found on HBA port 1. Skipping target persistent
binding configuration.

Saved copy of /etc/modprobe.conf as
/usr/src/qlogic/v8.01.04-3/backup/modprobe.conf-2.6.9-34.ELsmp-050407-142853.bak
Saved copy of /boot/initrd-2.6.9-34.ELsmp.img as
/usr/src/qlogic/v8.01.04-3/backup/initrd-2.6.9-34.ELsmp.img-050407-142853.bak
QLA2XXX -- Rebuilding ramdisk image...
Ramdisk created.

Reloading the QLogic FC HBA drivers....
Unloaded module qla2400
Loading module qla2xxx_conf version: v8.01.04....
Loaded module qla2xxx_conf
Loading module qla2xxx version: v8.01.04....
Loaded module qla2xxx
Loading module qla2400 version: v8.01.04....
Loaded module qla2400
tee: ql_device_info: Permission denied

Target Information on all HBAs:
==============================
-----------------------------------------------------------------------------
HBA Port 0 - QLE2460 Port Name: 21-00-00-E0-8B-93-AA-3A Port ID: 00-00-00
-----------------------------------------------------------------------------
Info: The selected adapter has no attached devices (HBA port 0)!
-----------------------------------------------------------------------------
HBA Port 1 - QLE2460 Port Name: 21-00-00-E0-8B-93-AC-57 Port ID: 00-00-00
-----------------------------------------------------------------------------
Info: The selected adapter has no attached devices (HBA port 1)!

#*********************************************************#
#                INSTALLATION SUCCESSFUL!!                #
#      SANsurfer Driver installation for Linux completed  #
#*********************************************************#
</code>
Then load ''scli'' and reconfigure the HBA cards, then rebuild ... follow the NetApp documentation and don't forget to commit the changes to the cards.
<code>
[root@ionode-1 qlafc-linux-8.01.04-3-install]# ./qlinstall -l qla2400
Loading module qla2400 version: v8.01.04....
Loaded module qla2400

[root@ionode-1 qlafc-linux-8.01.04-3-install]# ./qlinstall -br -in qla2400
Saved copy of /etc/modprobe.conf as
/usr/src/qlogic/v8.01.04-3/backup/modprobe.conf-2.6.9-34.ELsmp-050407-144528.bak
Saved copy of /boot/initrd-2.6.9-34.ELsmp.img as
/usr/src/qlogic/v8.01.04-3/backup/initrd-2.6.9-34.ELsmp.img-050407-144528.bak
QLA2XXX -- Rebuilding ramdisk image...
Ramdisk created.

[root@ionode-1 qlafc-linux-8.01.04-3-install]# chkconfig --add multipathd
[root@ionode-1 qlafc-linux-8.01.04-3-install]# chkconfig multipathd on
[root@ionode-1 qlafc-linux-8.01.04-3-install]# reboot
</code>
After the reboot we have not solved our problem, but we do have 2 functioning HBA cards. Some configuration changes have been applied to the cards. One in particular specifies how long a timeout lasts before a request is routed to the other card (not shown, but detailed in the NetApp documentation, see /share/apps/filer).
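That timeout behavior is governed by a driver module parameter. On a setup like this it typically ends up in /etc/modprobe.conf; a hedged sketch only — ''qlport_down_retry'' is a real qla2xxx option (how long the driver waits on a downed port before failing I/O over), but the value shown is illustrative, so take the actual value from the NetApp documentation:

<code>
# /etc/modprobe.conf (fragment) -- illustrative value only
options qla2xxx qlport_down_retry=1
</code>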
<code>
[root@ionode-1 ~]# /usr/local/bin/scli -i
-----------------------------------------------------------------------------
Host Name                  : ionode-1.local
HBA Model                  : QLE2460
Port                       : 0
Node Name                  : 20-00-00-E0-8B-93-AA-3A
Port Name                  : 21-00-00-E0-8B-93-AA-3A
Port ID                    : 61-16-13
Serial Number              : RFC0644M60202
Driver Version             : 8.01.04
FCode Version              : 1.13
Firmware Version           : 4.00.18
OptionROM BIOS Version     : 1.08
OptionROM FCode Version    : 1.13
OptionROM EFI Version      : 1.02
OptionROM Firmware Version : 4.00.12
Actual Connection Mode     : Point to Point
Actual Data Rate           : 4 Gbps
PortType (Topology)        : FPort
Device Target Count        : 0
HBA Status                 : Online
-----------------------------------------------------------------------------
Host Name                  : ionode-1.local
HBA Model                  : QLE2460
Port                       : 1
Node Name                  : 20-00-00-E0-8B-93-AC-57
Port Name                  : 21-00-00-E0-8B-93-AC-57
Port ID                    : 61-0E-13
Serial Number              : RFC0645M67628
Driver Version             : 8.01.04
FCode Version              : 1.13
Firmware Version           : 4.00.18
OptionROM BIOS Version     : 1.08
OptionROM FCode Version    : 1.13
OptionROM EFI Version      : 1.02
OptionROM Firmware Version : 4.00.12
Actual Connection Mode     : Point to Point
Actual Data Rate           : 4 Gbps
PortType (Topology)        : FPort
Device Target Count        : 4
HBA Status                 : Online
--------------------------------------------------------------------------
</code>
The problem:
<code>
[root@ionode-1 ~]# sanlun lun show all
filer:   lun-pathname               device filename adapter protocol lun size              lun state
filer3:  /vol/cluster_home/users    /dev/sdi        host2   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_home/users    /dev/sdg        host2   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_home/users    /dev/sde        host1   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_home/fstarr   /dev/sdh        host2   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_home/fstarr   /dev/sdj        host2   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_home/fstarr   /dev/sdf        host1   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_scratch/lun0  /dev/sdd        host2   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_scratch/lun0  /dev/sdc        host2   FCP      1.0t (1099529453568)  GOOD
filer3:  /vol/cluster_scratch/lun0  /dev/sdb        host1   FCP      1.0t (1099529453568)  GOOD

[root@ionode-1 ~]# multipath -v3 -d -ll
#
# all paths :
#
2:0:2:0 sdb 8:16  [ready] NETAPP  /LUN  /0.2
2:0:2:1 sdc 8:32  [ready] NETAPP  /LUN  /0.2
2:0:2:2 sdd 8:48  [ready] NETAPP  /LUN  /0.2
2:0:3:0 sde 8:64  [ready] NETAPP  /LUN  /0.2
2:0:3:1 sdf 8:80  [ready] NETAPP  /LUN  /0.2
2:0:3:2 sdg 8:96  [ready] NETAPP  /LUN  /0.2
</code>
Now for some magic. We edit /etc/multipath.conf and ...
* add a definition for the NetApp device
* blacklist our local disk sda
* add a parameter specifying when to switch paths (based on the size of a single I/O operation that blocks while waiting to complete)
* and reboot
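The edits above might look roughly like the following sketch of /etc/multipath.conf. This is an illustrative reconstruction, not a copy of the actual file — the device-section values follow the general NetApp pattern for this era of device-mapper-multipath, so check ionode-1:/etc/multipath.conf for the real settings:

<code>
# blacklist the local disk so multipath leaves sda alone
devnode_blacklist {
        devnode "^sda"
}

devices {
        device {
                vendor                  "NETAPP"
                product                 "LUN"
                path_grouping_policy    group_by_prio
                features                "1 queue_if_no_path"
                failback                immediate
                # number of I/Os sent down one path before switching
                rr_min_io               500
        }
}
</code>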
Now we see:
<code>
params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:112 500 round-robin 0 2 1 8:80 500 8:144 500
status = 1 0 0 2 1 A 0 1 0 8:112 A 0 E 0 2 0 8:80 A 0 8:144 A 0
mpath2 (360a9800043346d375a6f41794a597852)
[size=1024 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 2:0:2:2 sdh 8:112 [active]
\_ round-robin 0 [enabled]
 \_ 1:0:2:2 sdf 8:80  [active]
 \_ 2:0:3:2 sdj 8:144 [active]

params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:96 500 round-robin 0 2 1 8:64 500 8:128 500
status = 1 0 0 2 1 A 0 1 0 8:96 A 0 E 0 2 0 8:64 A 0 8:128 A 0
mpath1 (360a9800043346d375a6f41794a576176)
[size=1024 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 2:0:2:1 sdg 8:96  [active]
\_ round-robin 0 [enabled]
 \_ 1:0:2:1 sde 8:64  [active]
 \_ 2:0:3:1 sdi 8:128 [active]

params = 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:32 500 round-robin 0 2 1 8:16 500 8:48 500
status = 1 133034 0 2 1 E 0 1 0 8:32 F 1194 E 0 2 0 8:16 F 1195 8:48 F 1194
mpath0 (360a9800043346d375a6f41744c635563)
[size=1024 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 2:0:2:0 sdc 8:32  [failed]
\_ round-robin 0 [enabled]
 \_ 1:0:2:0 sdb 8:16  [failed]
 \_ 2:0:3:0 sdd 8:48  [failed]
</code>
This magic hoopla works because device-mapper is creating these devices for us, and the manual states:
| ''The /dev/mapper devices are persistent across reboots, but the /dev/sdx devices are not. After a reboot, or a restart of the HBA driver, you might find that different /dev/sdx devices make up a given /dev/mapper device. The /dev/mapper device, however, will always correspond to the same LUN.'' |
<code>
[root@ionode-1 ~]# ls -l /dev/mapper/
total 0
crw-------  1 root root  10, 63 May  5 12:29 control
brw-rw----  1 root disk 253,  0 May  5 12:29 mpath0
brw-rw----  1 root disk 253,  1 May  5 12:29 mpath1
brw-rw----  1 root disk 253,  2 May  5 12:29 mpath2
</code>
Eureka.
Throw a filesystem on it (no need for a partition table apparently).
<code>
[root@ionode-1 ~]# mkfs -t ext3 /dev/mapper/mpath0
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
134234112 inodes, 268439808 blocks
13421990 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
8193 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
</code>
And now we can mount it and export it to other nodes via /etc/fstab and /etc/exports.
<code>
# /etc/fstab
/dev/mapper/mpath0      /sanscratch     ext3    defaults        0 0

# /etc/exports
/sanscratch 10.3.1.0/255.255.255.0(rw,sync,no_root_squash)
</code>
There are some Linux tunable parameters; look at the header in ionode-1:/etc/multipath.conf.
Yea. Done.
And now we have multiple paths from the ionode to the filers (how they work that out is beyond me) and two HBA cards that will also take over from each other.
To discover new LUNs, or rediscover existing ones:
<code>
/opt/netapp/santools/qla2xxx_lun_rescan all
</code>
Make sure the HBA cards have logged into the filer; ssh into filer3:
<code>
filer3> igroup show
    swallowtail (FCP) (ostype: linux):
        21:00:00:e0:8b:93:ac:57 (logged in on: vtic, 0c)
        21:00:00:e0:8b:93:aa:3a (logged in on: vtic, 0d)
</code>
--- //[[hmeij@wesleyan.edu|Meij, Henk]] 2008/03/17 14:42//
The bindings (mappings) are stored in ''/var/lib/multipath/bindings'':
<code>
# Multipath bindings, Version : 1.0
# NOTE: this file is automatically maintained by the multipath program.
# You should not need to edit this file in normal circumstances.
#
# Format:
# alias wwid
#
mpath0 360a9800043346d375a6f41794a597852
mpath1 360a9800043346d375a6f4237427a316d
mpath2 360a9800043346d375a6f423743307771
mpath3 360a9800043346d375a6f423743335673
mpath4 360a9800043346d375a6f423743375578
mpath5 360a9800043346d375a6f423743394379
mpath6 360a9800043346d375a6f423837674970
mpath7 360a9800043346d375a6f4238394f516c
mpath8 360a9800043346d375a6f423839536b52
mpath9 360a9800043346d375a6f423838394b71
mpath10 360a9800043346d375a6f424566792f32
mpath11 360a9800043346d375a6f41794a576176
</code>
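The file format is simple — ''#'' starts a comment, every other non-empty line is an ''alias wwid'' pair — so it is easy to inspect programmatically. A small illustrative Python sketch (not part of the multipath tools themselves):

```python
# Parse a multipath bindings file into an {alias: wwid} dictionary.
def parse_bindings(text):
    bindings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        alias, wwid = line.split()
        bindings[alias] = wwid
    return bindings

sample = """\
# Multipath bindings, Version : 1.0
mpath0 360a9800043346d375a6f41794a597852
mpath1 360a9800043346d375a6f4237427a316d
"""

print(parse_bindings(sample)["mpath0"])  # -> 360a9800043346d375a6f41794a597852
```

Handy for sanity-checking that an alias like ''mpath0'' still points at the WWID you expect after a reboot.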
===== Oh baby =====
What next? Oh yea ...
* perform extensive read/writes and fsck
* disable an HBA card and observe while reading/writing
* disable a filer (filer3!) and observe if filer4 takes over (without having the raid group local!)
Monday is "play at work" day.
\\
**[[cluster:0|Home]]**