We run a lot of XFS storage arrays with hardware RAID controllers (Areca, MegaRAID). Rsync is used to pull content from the active server to the standby server in a continuous loop.
Usually something like this happens: a disk fails, a hot spare deploys, the array rebuilds parity, the failed disk gets replaced, a new hot spare is created. All is well.
Sometimes, and I have noticed a pattern, not all is well: a volume check is in progress, a disk fails, the volume check is aborted, a hot spare deploys, the array starts rebuilding, XFS panics and takes the file system offline.
Not sure why; there are few details on the web. Steps to recover from this situation follow below, along with some guidance on how to deal with detached/corrupted files.
First, stop all replication engines or cron jobs that pull content from the active server and back it up to the standby server. While fixing the active server we want nothing to change on the standby server.
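A minimal sketch of pausing replication, assuming the pull job lives in a crontab on the standby server; the timer name below is hypothetical, so adapt it to however your replication is actually driven.

# list the cron jobs on the standby server and note the rsync pull entry
crontab -l

# edit the crontab and comment out the rsync pull line for now
crontab -e

# if replication runs as a systemd service/timer instead (name is hypothetical)
systemctl stop replication-pull.timer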
Next, start a volume check with the array controller on the active server; this will identify any bad blocks that should not be used from now on. Observed completion times range from 8 to 80 hours, with reported error counts anywhere from 0 to 70 million.
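How you kick off the check depends on the controller. Here is a hedged sketch for a MegaRAID controller using the MegaCli64 utility; the adapter/volume selectors and the binary's install path vary per system, and Areca controllers have an equivalent check in their own CLI or web interface.

# start a consistency check on all logical drives of all adapters
MegaCli64 -LDCC -Start -LALL -aALL

# poll progress while it runs
MegaCli64 -LDCC -ShowProg -LALL -aALL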
Next, run yum update -y on the active server. Comment out the array device mount in /etc/fstab and reboot. A clean reboot tells us the RAID controller finds the storage array and everything is in a normal state hardware-wise.
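For reference, the commented-out fstab entry might look something like this; the device, mount point, and options are assumptions, so keep whatever your existing line says.

# /etc/fstab -- temporarily disable the array mount before rebooting
#/dev/sdb1   /mindstore   xfs   defaults   0 0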
Next, mount the storage manually. This can take different forms; try the following in order. (Repeat the previous steps for the standby server once the recovery process on the active server is done.)

mount /dev/sdb1

If that works, check dmesg for a clean mount message and you are good to go; the journal log of changes has been applied successfully. If the mount fails, run a read-only check:

xfs_repair -n /dev/sdb1

If that finishes, redo it without the -n flag (which stands for "no modifications"): xfs_repair /dev/sdb1. If xfs_repair refuses to run because the log is dirty, the last resort is:

xfs_repair -L /dev/sdb1

The -L flag zeroes the journal log before repairing, so any metadata changes still sitting in the log are lost.
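One quick way to check for the clean-mount message mentioned above; the grep pattern is just a starting point.

# look at recent kernel messages for XFS mount/corruption output
dmesg | grep -i xfs | tail -n 20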
The repair operation will relocate all corrupt and detached files (including directories) to a directory called lost+found at the root of the file system (it will be created if it does not exist). Within this directory, original paths are lost but some metadata is preserved. See the section below on how to deal with that. Major pain.
Next, mount the storage.
Next, on both the active server and the standby server, run du -hs * | tee du.txt and observe the size differences between directories. Keep in mind the differences reflect changes since the last backup cycle, but they also give a sense of which directories were most impacted.
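A hedged sketch for comparing the two listings side by side; the host name and the location of du.txt are assumptions.

# run from the active server: fetch the standby's listing and diff by directory name
scp standby:/mindstore/du.txt /tmp/du.standby.txt
diff <(sort -k2 /mindstore/du.txt) <(sort -k2 /tmp/du.standby.txt)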
Next, run a refresh from the standby server to the active server. We must perform this operation so that when we restart replication we do not clobber anything on the active server with rsync's --delete flag.

rsync -vac --whole-file --stats /path/to/standby/dir/ active:/path/to/active/dir/
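If you want to preview what this refresh would transfer before committing to it, the same command with rsync's -n (--dry-run) flag reports what would be copied without copying anything; the paths are placeholders as above.

rsync -vacn --whole-file --stats /path/to/standby/dir/ active:/path/to/active/dir/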
Next, unmount and remount the storage so the journal log is applied.
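For example, assuming the same device and mount point as above (if you have re-enabled the /etc/fstab entry, mount /mindstore alone is enough):

umount /mindstore
mount /dev/sdb1 /mindstore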
First, create a listing of everything found in lost+found. The -ls option gives ls-style output (owner, group, size, path), which the grep and awk steps below rely on.

find /mindstore/lost+found -type f -ls | tee /mindstore/lost+found.listing
The recovered files are named by inode number and detached from their original paths. Any restoration is now a manual, tedious project. It consists of two steps: identify what each item is (the file utility does most of the work here), then decide where to copy it back. We'll use one username (yezzyat) and show an example.
# get user set of files
grep yezzyat /mindstore/lost+found.listing > /tmp/foo.log

# individual files have a count of 1
awk '{print $11}' /tmp/foo.log | awk -F\/ '{print $4}' | sort -n | uniq -c | sort -n | head

# output
      1 101005276254
      1 101005276255
      1 101005276256
      1 101005276257
      1 101005276258
      1 101005276259
      1 101005276260
      1 101005276261
      1 101005276262
      1 101005276263

# test first file
file /mindstore/lost+found/101005276254

# output
/mindstore/lost+found/101005276254: DICOM medical imaging data

# another test
ls -l /mindstore/lost+found/101005276254

# output
-rwxr-xr-x 1 yezzyat psyc 66752 Sep 10 2016 /mindstore/lost+found/101005276254

# look at it with the utility "less"; even if it is a binary, it may reveal some information,
# like ORIGINAL\PRIMARY\V3_NYU_RAW
At this point you are ready to copy this file into your storage area. DO NOT BLINDLY COPY FILES TO YOUR AREA: there will be corrupted files which SHOULD NOT be copied.
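A hedged sketch of putting a single verified file back; the destination path and file name are hypothetical, so pick them from whatever metadata you recovered above.

# preserve ownership, mode and timestamps while copying the recovered file back
cp -a /mindstore/lost+found/101005276254 /mindstore/users/yezzyat/some/dir/recovered.dcm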
Let's look at directories (count > 1).
awk '{print $11}' /tmp/foo.log | awk -F\/ '{print $4}' | sort -n | uniq -c | sort -n | tail

# output
   5647 141871011888
   5784 23668178350
   8681 30256148259
  10292 6534304103
  10568 118181472697
  15704 58087220750
  16043 163276263379
  17741 116024922424
  18883 19388934039
  20640 210500547885

# let's examine the last directory with 20,640 "chunks"
file /mindstore/lost+found/210500547885

# output
/mindstore/lost+found/210500547885: directory

# more information
ls -ld /mindstore/lost+found/210500547885

# output
drwxr-xr-x 4 yezzyat psyc 63 Sep 10 2016 /mindstore/lost+found/210500547885

# anything in this directory, and anything corrupt?
file /mindstore/lost+found/210500547885/*

# output
/mindstore/lost+found/210500547885/S07: directory
/mindstore/lost+found/210500547885/S07BV: directory

# again, anything corrupt in, say, S07BV? Beware of subdirectories! Test each file!
file /mindstore/lost+found/210500547885/S07BV/*

# output
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected_firstvol_as_anat.amr: data
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected_firstvol.fmr: ASCII text
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected_firstvol.stc: data
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected.fmr: ASCII text
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected.stc: data

# now you have more metadata to decide where you are going to copy this to
# also beware: you might already have this content,
# and the directory was flagged as corrupt/detached for a different reason,
# *or* the restore from standby to active pulled the backup copy into place
# finally, file has an option:
#   -f, --files-from FILE   read the filenames to be examined from FILE
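Building on that -f option, here is a hedged sketch that classifies every recovered item for one user in a single pass and pulls out the ones file can only describe as "data". Treat that as a rough triage heuristic, not proof of corruption; some legitimate formats also report as "data".

# extract just the paths for this user from the ls-style listing
awk '{print $11}' /tmp/foo.log > /tmp/foo.paths

# classify them all at once and keep the suspicious ones for manual review
file -f /tmp/foo.paths | grep ': data$' > /tmp/foo.suspect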
That's it. Some steps can be automated, but deciding where to place the recovered files is up to the user. You may wish to make a $HOME/crash-date directory and just put the files/dirs in there.
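For example, something along these lines; the directory name format is just a suggestion.

# park everything recovered for this user under a dated directory in their home
mkdir -p "$HOME/crash-$(date +%Y-%m-%d)"
cp -a /mindstore/lost+found/210500547885 "$HOME/crash-$(date +%Y-%m-%d)/"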
The lost+found directory is not used by the server and will eventually disappear (for example, when space is needed or the next crash happens).