XFS panic

We run a lot of XFS storage arrays with hardware RAID controllers (Areca, MegaRAID). Rsync is used to pull content from the active server to the standby server in a continuous loop.

Usually something like this happens: a disk fails, a hot spare deploys, the array rebuilds parity, the failed disk gets replaced, a new hot spare is created. All is well.

Sometimes, and I have noticed a pattern, not all is well: a volume check is in progress, a disk fails, the volume check is aborted, a hot spare deploys, the array starts rebuilding, XFS panics and takes the file system offline.

I am not sure why, and there are few details on the web. Steps to recover from this situation follow below, along with some guidance on how to deal with detached/corrupted files.

First stop all replication engines or cron jobs that pull content from the active server and back it up to the standby server. At this time we want nothing to change on the standby server while fixing the active server.
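
How replication is stopped depends on the local setup. A minimal sketch, assuming the pull runs from cron on the standby server (the service name below is hypothetical):

# on the standby server: comment out the rsync pull job, re-enable when done
crontab -l        # find the rsync entry
crontab -e        # comment it out
# or, if replication runs as a daemon/service, something like
systemctl stop replication-pull.service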

Next start a volume check with the array controller on the active server; this will identify any bad blocks that should not be used from now on. Observed completion times range from 8 to 80 hours, with anywhere from 0 to 70 million errors reported.
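
The exact command depends on the controller and CLI in use. With LSI MegaRAID and the MegaCli utility it looks roughly like the sketch below; Areca controllers have their own cli64/web interface, so check your controller's documentation:

# start a consistency check on all logical drives, all adapters
MegaCli64 -LDCC -Start -LAll -aAll
# poll progress (flag spelling may vary by MegaCli version)
MegaCli64 -LDCC -ShowProg -LAll -aAll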

Next run yum update -y on the active server. Comment out the array device mount in /etc/fstab and reboot. The reboot will confirm that the RAID controller finds the storage array and that all is in a normal state hardware-wise.
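
For reference, the line to comment out in /etc/fstab might look like this (device, mount point and options will differ per system):

# /dev/sdb1   /mindstore   xfs   defaults   0 0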

Next mount the storage manually. This may take different forms; try the following in order, and repeat the previous steps for the standby server once the recovery process is done (a combined sketch follows the list).

  • mount /dev/sdb1; if that works, check dmesg for a clean mount message and you are good to go, the journal log of changes has been applied successfully
  • if that fails with a “structure needs cleaning” message, unmount the device and try xfs_repair -n /dev/sdb1 (the -n flag means no modifications); if that finishes cleanly, rerun it without the -n flag
  • if that fails, we are definitely going to lose content; run the repair while zeroing out the journal log: xfs_repair -L /dev/sdb1
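
A minimal sketch of that escalation, assuming /dev/sdb1 is the array device and /mindstore its mount point; stop at the first step that succeeds:

mount /dev/sdb1 /mindstore && dmesg | tail    # look for a clean mount message
umount /mindstore                             # only if "structure needs cleaning"
xfs_repair -n /dev/sdb1                       # dry run, no modifications
xfs_repair /dev/sdb1                          # actual repair
xfs_repair -L /dev/sdb1                       # last resort, zeroes the journal log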

The repair operation will relocate all corrupt and detached files (this includes directories) to a directory called lost+found (it will be created if needed). Within this directory, original paths are lost but some metadata is preserved. See the lost+found section below on how to deal with that. Major pain.

Next mount the storage

Next, on both the active server and the standby server, run du -hs * | tee du.txt and observe the size differences of directories. Keep in mind these differences only reflect changes since the last backup cycle, but they also give a sense of which directories were most impacted.
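
One way to compare the two listings, assuming du.txt was written in the same top-level directory on both hosts and that the standby server is reachable as "standby":

# copy the standby listing over and diff the two, sorted by directory name
scp standby:/mindstore/du.txt /tmp/du-standby.txt
diff <(sort -k2 /mindstore/du.txt) <(sort -k2 /tmp/du-standby.txt)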

Next run a refresh from the standby server to the active server. We must perform this operation so that when we restart replication we do not clobber anything on the active server with rsync's --delete flag.

  • rsync -vac --whole-file --stats /path/to/standby/dir/ active:/path/to/active/dir/
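
Before the real transfer, a dry run with rsync's -n (--dry-run) flag is a cheap sanity check that the paths and file lists are what you expect:

  • rsync -vacn --whole-file --stats /path/to/standby/dir/ active:/path/to/active/dir/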

Next unmount and mount the storage so the journal log is applied
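
For example, with the same device and mount point as above:

umount /mindstore
mount /dev/sdb1 /mindstore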

lost+found

First create a listing of all files found. The -ls flag is included so each line carries ownership and size information; the path ends up in the 11th field, which the awk commands below rely on.

  • find /mindstore/lost+found -type f -ls | tee /mindstore/lost+found.listing

The files are in an “inode number as filename” format and detached from their original path. Any restoration is now a manual, tedious project. It consists of two steps:

  1. find directories by username and identify files
  2. make sure files are not corrupted using the file command

We'll use the username yezzyat and show an example

# get user set of files
grep yezzyat /mindstore/lost+found.listing > /tmp/foo.log

# count occurrences of each top-level lost+found entry;
# individual files have a count of 1, directories appear once per contained file
awk '{print $11}' /tmp/foo.log | awk -F/ '{print $4}' | sort -n | uniq -c | sort -n | head

# output
      1 101005276254
      1 101005276255
      1 101005276256
      1 101005276257
      1 101005276258
      1 101005276259
      1 101005276260
      1 101005276261
      1 101005276262
      1 101005276263

# test first file
file /mindstore/lost+found/101005276254

# output 
/mindstore/lost+found/101005276254: DICOM medical imaging data

# another test
ls -l /mindstore/lost+found/101005276254

# output 
-rwxr-xr-x 1 yezzyat psyc 66752 Sep 10  2016 /mindstore/lost+found/101005276254

# look at it with the utility "less"; even if it is a binary, it may reveal some information, like
ORIGINAL\PRIMARY\V3_NYU_RAW

At this point you are ready to copy this file into your storage area

DO NOT BLINDLY COPY FILES TO YOUR AREA; there will be corrupted files which SHOULD NOT be copied.
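
For example, a sketch of copying the verified file back; the destination directory is hypothetical, pick whatever location makes sense for the owner:

# -a preserves ownership, permissions and timestamps
mkdir -p /home/yezzyat/crash-recovered
cp -a /mindstore/lost+found/101005276254 /home/yezzyat/crash-recovered/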

Let's look at directories (count > 1)

awk '{print $11}' /tmp/foo.log |awk -F\/ '{print $4}' | sort -n | uniq -c | sort -n | tail

# output
   5647 141871011888
   5784 23668178350
   8681 30256148259
  10292 6534304103
  10568 118181472697
  15704 58087220750
  16043 163276263379
  17741 116024922424
  18883 19388934039
  20640 210500547885

# let's examine the last directory with 20,640 "chunks"
file /mindstore/lost+found/210500547885

# output
/mindstore/lost+found/210500547885: directory

# more information
ls -ld /mindstore/lost+found/210500547885

# output
drwxr-xr-x 4 yezzyat psyc 63 Sep 10  2016 /mindstore/lost+found/210500547885

# what is in this directory, and is anything corrupt?
file /mindstore/lost+found/210500547885/*

# output
/mindstore/lost+found/210500547885/S07:   directory
/mindstore/lost+found/210500547885/S07BV: directory

# again, is anything corrupt in, say, S07BV? Beware of subdirectories! Test each file!
file /mindstore/lost+found/210500547885/S07BV/* 

# output
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected_firstvol_as_anat.amr: data
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected_firstvol.fmr:         ASCII text
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected_firstvol.stc:         data
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected.fmr:                  ASCII text
/mindstore/lost+found/210500547885/S07BV/S07_Run1_Enc_reconcorrected.stc:                  data


# now you have more metadata to decide where you are going to copy this to

# also beware you might already have this content
# and the directory was flagged as corrupt for a different reason
# *or* the restore from standby to active pulled the backup copy into place

# finally file has an option
  -f, --files-from FILE      read the filenames to be examined from FILE
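
This makes it easy to type-check a whole set of paths in one pass. A sketch, reusing the per-user listing extracted earlier:

# feed file(1) the path list for this user and page through the results
awk '{print $11}' /tmp/foo.log > /tmp/foo.paths
file -f /tmp/foo.paths | less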

That's it. Some steps can be automated, but where to place the recovered files remains the user's decision. You may wish to make a $HOME/crash-date directory and just put the files/dirs in there.

