Checking XFS file system


RobJ


Sorry for all the words.  The actual feature request is down near the bottom, highlighted.

Just started some Reiser to XFS conversions, and after completing a full disk copy of 2.5TB, decided to test the new XFS file system, using the WebGUI.  I kept the default single option, -n, which should run a read-only check.  It finished in a minute or so (surprisingly fast for the number of files and folders!), and returned this report in the output box -

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

And that's it!  No success or failure note, no obvious indication of errors or not, and some suspicious lines that may or may not indicate a problem.  Take "moving disconnected inodes to lost+found" in particular: does that mean it found some, or only that it's running the step that would move them if there were any (and whether there were any is still a secret!)?

 

For inexperienced users (which I still am here), this is useless and confusing.  I decided to run it again without the -n option, to see if it DID find something to fix, and got the following report -

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 0
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

Again, no unambiguous mention of whether it actually fixed anything at all.  It did perform over 5600 writes to the drive, and it says it zeroed the log and rebuilt and reset numerous things, but doesn't say whether any of them were bad.  So I tried another -n check, since if something had been fixed I might get a simpler report with no suspicious lines.  I got the identical report I originally received, word for word.  I restarted the array, and found no 'lost+found' folder on the drive, so *I think* there wasn't anything wrong.  I have to say that while the xfs_repair programmer may have done a good job with their repair routines (can't say for sure), they did a lousy job at general usability.  Personally, I'm not happy that the tool performs so many writes even if they aren't needed.  Why not check first, then skip the writes if possible?  Don't fix what isn't broke.

 

I checked the man page and found only one thing we can do to improve this -

Exit Status

xfs_repair -n (no modify mode) will return a status of 1 if filesystem corruption was detected and 0 if no filesystem corruption was detected. xfs_repair run without the -n option will always return a status code of 0.

 

My feature request therefore is a check of the exit status when the -n option is used, and an appropriate message prominently displayed to the user, something like -

  zero return -> "No corruptions found"

  nonzero return -> "Issues found, rerun without the -n option"
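
To make the request concrete, here is a minimal sketch of what such a check could look like. It is not how the WebGUI actually works; the function names (summarize_check, check_xfs) and the example device path are hypothetical. It relies only on the exit status behaviour quoted from the man page above, and treats any nonzero status as "issues found".

```python
import subprocess


def summarize_check(returncode):
    """Map the exit status of `xfs_repair -n` to a one-line verdict.

    Per the man page excerpt quoted above: 0 means no corruption was
    detected, 1 means corruption was detected. Any other nonzero status
    is treated the same as 1 here (an assumption of this sketch).
    """
    if returncode == 0:
        return "No corruptions found"
    return "Issues found, rerun without the -n option"


def check_xfs(device, run=subprocess.run):
    """Run a read-only check on `device`; return (verdict, raw_report).

    `run` defaults to subprocess.run and is only a parameter so the
    function can be exercised without a real disk.
    """
    result = run(
        ["xfs_repair", "-n", device],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep the report in one stream
        text=True,
    )
    return summarize_check(result.returncode), result.stdout


# Example use (the device path is a placeholder, not a recommendation):
#   verdict, report = check_xfs("/dev/mdX")
#   print(report)
#   print(verdict)
```

The verdict only makes sense for -n runs; going by the same man page text, a run without -n reports 0 regardless, so the exit status says nothing about whether repairs were made.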

 

Thankfully, the reiserfsck tool does report something like "No corruptions found", or it reports issues as it finds them and provides appropriate instructions.  We need the xfs_repair tool to appear to do the same.


I had almost made up my mind to try a reiser-to-XFS conversion on my Media server, but this has given me reason to put that on hold.  The checking/repair tool obviously needs some real work if it is to be used by anyone except an expert Linux System Guru.  Or someone needs to put together a 'wrapper' script, program or GUI for the present tool that gives a real indication of the state of the file system, so that the average person can at least understand whether or not there are any issues with it.  Notice that I am not asking for a fool-proof repair system.  It could well be that if a problem is detected, some expertise has to be requested to keep that average user out of trouble!

  • 6 months later...

Resurrecting this because I had to use xfs_repair for the first time via the webUI. Initially, the drive in question had its file system set to "auto", which hid the option to check the file system. Stopping the array and changing that to xfs fixed the display issue.

 

But running with the option -nv left me just as confused as RobJ. It ran, and nothing in the report stood out to me saying "we found errors - run this again without the -n." Since the disk still would not mount, I figured I had nothing to lose by running xfs_repair -v, so I did and that ended up fixing it.
