New red ball while rebuilding



Running version 5.0.5.  Got a red ball on one of my disks (PL1331LAGRTP6H) a week or so ago.  SMART report and test came back OK, so I figured it was the SATA controller or the cable.  Moved the drive to a different controller with a different cable and went through the unassign-then-reassign procedure to start rebuilding it.
 
A few hours into the rebuild another disk (PL1331LAGSAUDH) red balled, which seems to have halted the rebuild.  Unfortunately the SMART report shows the drive is failing.  I can still access the drive, however the data looks corrupt and I can only see ~12 GB of the 3.7 TB via Samba.
 
So as of now, my array has PL1331LAGRTP6H with an orange ball (it was in the middle of rebuilding) and PL1331LAGSAUDH with a red ball that seems screwed.
 
Have a new 6 TB drive handy.  Have attached SMART reports for both drives and a syslog from the last reboot.  Sorry, should have saved the syslog when the second drive fell over.  Currently have the array stopped and it's showing 'configuration valid'.  Hope someone can help.  Cheers.

syslog

PL1331LAGRTP6H.txt

PL1331LAGSAUDH.txt

Capture.JPG

Link to comment
7 hours ago, philouza said:

I can still access the drive, however the data looks corrupt and I can only see ~12 GB of the 3.7 TB via Samba.

 

That is normal: with two invalid disks and only single parity, unRAID can't correctly emulate the missing data.
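
To illustrate why two invalid disks defeat single parity, here is a minimal Python sketch of XOR parity. The three tiny "disks" and their byte values are made up for the example; only the XOR relationship reflects how single parity works.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical data disks (illustrative values only).
disk_a = b"\x10\x20\x30"
disk_b = b"\x0f\x0e\x0d"
disk_c = b"\xaa\xbb\xcc"
parity = xor_parity([disk_a, disk_b, disk_c])

# One disk missing: XOR of the survivors and parity gives it back.
rebuilt_b = xor_parity([disk_a, disk_c, parity])

# Two disks missing: all that can be recovered is (disk_b XOR disk_c),
# which is not enough to separate the two disks again.
b_xor_c = xor_parity([disk_a, parity])
```

With one disk gone, `rebuilt_b` equals `disk_b`. With two gone, `b_xor_c` is a single combined value, so neither disk can be emulated correctly.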

 

Assuming disk12's data is unchanged since it first became disabled, you can do the following. It's possible disk12 will have some corruption, though very little, because the rebuild stopped partway through:

 

-Utils -> New Config
-re-assign all disks, double check parity is in the parity slot
-check "parity is already valid" before starting the array
-start the array

 

Now check if data on disks 11 and 12 looks OK.


 

Link to comment
On 19/08/2017 at 6:30 PM, johnnie.black said:

Now check if data on disks 11 and 12 looks OK

 

Thanks so much for the reply.  Did as you instructed and the array came back up.  Kudos.  Started a parity check and disk 11 eventually red balled again, killing the check.  Tons of write errors, so I'm confident the drive is shot.  See syslog_disk11_fail attached.  Replaced it with a new 6 TB drive and it's currently rebuilding, which seems to be going OK with the exception of these errors in the current syslog:

 

Aug 20 19:22:00 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 24537 does not match to the expected one 1
Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 345831560. Fsck?
Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [587 1500 0x0 SD]

Attached that syslog below too. Assuming I need to wait till the rebuild is finished, would I then run a reiserfsck --check against disk 11, or should I kick off a parity check first and go from there?
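
When a syslog is full of lines like the ones quoted above, a quick tally per device can show whether the errors are confined to one disk. This is a rough Python sketch, assuming the kernel messages keep the `REISERFS error (device mdN)` / `REISERFS warning` wording seen in this thread; the function name and regex are my own, not an unRAID tool.

```python
import re
from collections import Counter

# Matches "REISERFS error (device md11)" and "REISERFS warning" (no device given).
REISERFS_LINE = re.compile(r"REISERFS (error|warning)(?: \(device (\w+)\))?")

def tally_reiserfs(lines):
    """Count ReiserFS kernel messages by (device, severity)."""
    counts = Counter()
    for line in lines:
        match = REISERFS_LINE.search(line)
        if match:
            severity, device = match.group(1), match.group(2) or "unknown"
            counts[(device, severity)] += 1
    return counts

# The three lines from the post above.
sample = [
    "Aug 20 19:22:00 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 24537 does not match to the expected one 1",
    "Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 345831560. Fsck?",
    "Aug 20 19:22:00 Harvey kernel: REISERFS error (device md11): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [587 1500 0x0 SD]",
]
counts = tally_reiserfs(sample)
for (device, severity), n in sorted(counts.items()):
    print(device, severity, n)
```

To run it against a saved syslog, pass an open file object instead of `sample`.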

syslog_disk11_fail

syslog

Link to comment
1 minute ago, philouza said:

Assuming I need to wait till the rebuild is finished, would I then run a reiserfsck --check against disk 11,

 

Once the rebuild finishes, run reiserfsck. There may also be some file corruption because we made parity valid when it really wasn't, but with 2 invalid disks it was your best option. Keep the old disk11 intact; you may still be able to copy some or most of the data from it if needed.

 

Consider upgrading to unRAID v6 and using dual parity, IMO it's recommended for your array size.

Link to comment

Ok ran reiserfsck --check /dev/md11 and got the following...

 

Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
50 found corruptions can be fixed only when running with --rebuild-tree

 

Am I good just running reiserfsck --rebuild-tree /dev/md11 or should I add any options like '-S' or '--scan-whole-partition'?
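
For reference, the summary lines from `reiserfsck --check` indicate which repair mode the tool itself is asking for. A small Python sketch of that mapping follows. Only the `--rebuild-tree` wording is taken from the output quoted above; the `--fix-fixable` and "No corruptions found" wordings are assumptions from memory and may differ between reiserfsprogs versions. The function name is hypothetical.

```python
def suggest_next_step(report):
    """Map a reiserfsck --check summary to the repair mode it asks for."""
    if "only when running with --rebuild-tree" in report:
        return "--rebuild-tree"
    # Assumed wording for the milder case (not seen in this thread).
    if "when running with --fix-fixable" in report:
        return "--fix-fixable"
    # Assumed wording for a clean check (not seen in this thread).
    if "No corruptions found" in report:
        return "none"
    return "unknown"

# The output quoted in the post above.
check_output = """Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
50 found corruptions can be fixed only when running with --rebuild-tree"""

next_step = suggest_next_step(check_output)
print(next_step)
```

For the output in this thread the sketch returns `--rebuild-tree`, matching what reiserfsck itself says. It does not address whether extra options such as `-S` are needed.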

Link to comment

Normal to not see your files during the rebuild-tree?  All my shares (NFS and Samba) are empty and tons of these in the syslog...

 

Aug 21 19:32:03 Harvey kernel: REISERFS error (device md11): vs-5150 search_by_key: invalid format found in block 0. Fsck?
Aug 21 19:32:03 Harvey kernel: REISERFS error (device md11): zam-7001 reiserfs_find_entry: io error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Games Input/output error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Merkwell Input/output error
Aug 21 19:32:03 Harvey emhttp: get_filesystem_status: statfs: /mnt/user/Public Input/output error
Aug 21 19:32:03 Harvey kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one 6553
Link to comment
