Virtual IP ''

Critical

You can always Disable Failover, to fix the power feed of switches 192.168.0.0/16 or 10.10.0.0/

Check Box to Disable Failover:
Go to WebUI > System > Failover > Click the Box > Then Click Save (leave the default controller setting as is)

This will allow you to make your network change without failing over.
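If you also want to confirm the failover state from a shell, the middleware client can report it. This is only a sketch and assumes this HA build exposes the ''failover'' namespace through ''midclt'' (verify on your release; the WebUI checkbox above is the documented way to disable failover):

<code>
# which personality this controller currently has (e.g. MASTER or BACKUP)
# assumes the failover namespace is available via midclt on this HA build
midclt call failover.status

# current failover settings, including whether failover is disabled
midclt call failover.config
</code>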
==== SSH ====
</code>
==== Update ====

See Update 12 below for the manual update to v12 with Anthony on 03.09.2021.

**Change the Train** to 11.3, then you will apply the update first in the WebUI to the passive controller.
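A quick way to confirm what a controller is running before and after the train change, from an SSH session. These are stock FreeBSD/FreeNAS commands, not something specific to this page:

<code>
# version string of the currently booted release
cat /etc/version

# boot environments; the newly installed release appears here once the update is staged
beadm list
</code>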
You can look that information up yourself by opening an SSH session to the passive controller, navigating to the /
==== Split Brain ====

If you end up in an error failover state, try a console shutdown first. If that does not work, cut power to the controllers.
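The console shutdown is just the normal FreeBSD shutdown issued from the serial/IPMI console of the stuck controller; a minimal sketch:

<code>
# try a clean reboot of the stuck controller first
shutdown -r now

# or power it off cleanly before resorting to cutting power
shutdown -p now
</code>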

==== fndebug ====

  * first log into support, then download teamviewer
  * https://
  * get.teamviewer.com/

**Manual debug file creation**, then ftp to ftp.ixsystems.com
<code>
freenas-debug -A
tar czvf fndebug-wesleyan-20201123.tar.gz /

# next look at bottom of fndebug/
/dev/da10 HGST:
/dev/da9 HGST:
# these drives have not failed yet but have write errors, offline/

# next look at output of zpool status -x in fndebug/
# and the error code
# https://

NAME        STATE     READ WRITE CKSUM
tank        DEGRADED
...
  raidz2-1
    gptid/
# look for checksums that have failed like this disk in vdev raidz2-1

# clean up the spare that resilvered (INUSE status)
# then run a clear on the pool. Then we'll try to get another debug.

zpool detach tank gptid/
zpool clear tank

# that brought all drives back online and the vdevs show
# then via gui added the available drive back as spare
</code>

  * Monitor the progress of the resilvering operation: ''zpool status -x''

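A couple of read-only commands for watching the resilver from the shell (pool name ''tank'' is the one from the debug output above):

<code>
# only unhealthy pools are listed; prints "all pools are healthy" otherwise
zpool status -x

# the "scan:" line of the full status shows resilver progress and estimated time
zpool status tank
</code>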

**Replace a failed drive**

  * https://
  * drives mentioned above have not failed yet so we must "

<code>
1) Go into the Storage > Pools page. Click the Gear icon next to the pool and press the "
2) Find da4 and press the three-dot options button next to it, then press "
3) Go to the System > View Enclosure page, select da4 and press "
4) Physically swap the drive on the rack with its replacement.
5) Go back to the Storage > Pool > Status page, bring up the options for the removed drive,
5a) Select member disk from dropdown, and press "
    The replacement drive may or may not have been given the name "
6) Wait for the drive to finish resilvering before proceeding to replace da3.
6a) Click spinning icon to view progress. Pool status "
Return the drives in original box, return label provided.
</code>
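For reference, a rough CLI equivalent of the GUI offline/replace flow above. The ''tank'' pool and ''da4'' device names are just the examples from this page; on this appliance the WebUI is the supported path since it also takes care of partitioning and gptid labels:

<code>
# take the flagged disk out of service before pulling it
zpool offline tank da4

# after the physical swap, resilver onto the new disk in the same slot
zpool replace tank da4

# watch the resilver and the overall pool state
zpool status tank
</code>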

**Pool Unhealthy but not Degraded status**

No failed disks, no spare deployed, but the pool is unhealthy.

<code>
Mar 21 04:03:57 hpcstore2 (da11:
Mar 21 04:03:57 hpcstore2 (da11:
Mar 21 04:03:57 hpcstore2 (da11:
Mar 21 04:03:57 hpcstore2 (da11:
Mar 21 04:03:57 hpcstore2 (da11:
Mar 21 04:03:57 hpcstore2 (da11:

1) Storage > Pools. Click gear icon next to the pool and press the "
2) Find da11 and press the three-dot options button next to it, then press "
3) System > View Enclosure, find&
4) Physically swap the drive on the rack with its replacement.
5) Storage > Pool > Status page, bring up three-dot options for the removed drive,
5a) Select member disk from drop down, and press "
6) Wait till resilver finishes.
</code>
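Before swapping da11 it can help to confirm the CAM errors above against the drive's SMART data; ''smartctl'' (smartmontools) is included on the appliance. A minimal sketch, using the da11 device from the log:

<code>
# full SMART report: check the error log and reallocated/pending sector counts
smartctl -a /dev/da11

# optionally run a short self-test, then re-read the report a few minutes later
smartctl -t short /dev/da11
</code>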

==== Update 12 ====

System > Update > Select (new train 12.0-STABLE)

**Open a console on both controllers without double ssh sessions, directly to hpcstore1/**

''
''

Then download updates on passive, check version ''

''

''

...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%

reboot passive

from active, ping the passive heartbeat IP; when it is back up

check version on passive

check boot env ''

on passive ''
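The monospace commands in the steps above were truncated in this revision of the page. As a rough reconstruction of the passive-controller part (assumptions: the ''freenas-update'' utility and ''beadm'' are present on this release and ''/etc/version'' holds the running version string; the exact commands and flags used on the day may differ):

<code>
# running version before the update
cat /etc/version

# manually check for and apply the new train (assumed freenas-update subcommands)
freenas-update check
freenas-update update

# confirm the new boot environment exists and will activate on reboot
beadm list

# reboot the passive controller into the new release
reboot
</code>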

now force a fail over via the GUI (interruptive for 60 seconds)

Anthony did a reboot on the active controller instead; watch the log for the personality swap

then update the new passive

''

''

then check version, reboot the new passive, check version again; it becomes the new standby

Result: personality switch active vs standby, took 35 mins

In two months: ZFS feature updates patch, not interruptive,

Storage > Pool > "
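That pool feature-flag upgrade is done from the GUI as noted above; for reference, the equivalent read-only checks from the shell (pool ''tank'' from earlier) are:

<code>
# lists pools whose on-disk feature flags are older than this release supports
zpool upgrade

# shows which feature flags an upgrade would enable
zpool upgrade -v

# zpool status also prints a note when a pool can be upgraded
zpool status tank
</code>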
\\