One of my drives went bad, how long do I have?



Hey all,

 

My server is running with three drives: one parity drive and two data drives. From now on, I'll refer to the parity drive (1TB) as `parity`, and to the data drives as `data1` (500GB) and `data2` (500GB) respectively.

 

So the `data2` drive is biting the dust this week, sending me SMART reports of read errors all week long. The drive is still green-balled, and the SMART report shows the overall health has passed, but I am starting to suspect this drive won't be spinning for much longer. Current pending sectors are at 160 (bad) and offline uncorrectable at 17 (super bad). I've had various issues with this drive in the past, but I was such a cheapskate that I just added the drive after confirming it was semi-OK with pre-clear.

 

Now I'm not going to cheap out on storage anymore, so I purchased two WD Blue 1TB drives online. (Why Blue? Because I'm.. well.. still a cheapskate. Lol) So while my drives are in the post, I'm wondering how long my current drives can hold up. I'm not writing anything to the drives any more (the array has now been put into maintenance mode to prevent further writes). I'm also wondering about replacement strategies. Now I know I should pre-clear drives before adding them to make sure they'll hold my array up well, but because `data2` went bad I'm panicking a bit. It hasn't red-balled or anything, it's still a green-ball... And I'm also aiming to replace `data1` because it's a 500GB drive and I want to increase my storage space. (1TB to 2TB :))

 

Also, I'm out of drive bays in my server. I'm using a regular micro-tower sized PC case as my server. Is it OK if I just put the new drives on the floor of the case? Or maybe duct tape them to the floor of the case? I did look for some 5.25 to 3.5 enclosures but they're pricey.

 

To recap:

1. `data2` failed (or so I think.) How long do I have?

2. Preclear first? Or add first and preclear drives one by one later?

3. Switch `data2`, let it rebuild, then switch `data1`, let it rebuild? Or is there a non-destructive way of switching `data1`?

4. OK to put drives on case floor? Or duct tape them to floor?

 

Diags attached. They're pre-maintenance mode diagnostics.

 

derrickserver-diagnostics-20171023-1911.zip

 

 

EDIT1: At the time of writing, pending sectors are at 144 and uncorrectable at 18. Ouch.

 

All SMART extended tests fail with "Errors occured - Check SMART report". This is the last test:

 

# 1  Extended offline    Completed: read failure       00%     15050         254704688

Double ouch.
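
For anyone reading along, here is a rough sketch of how that self-test line can be picked apart and the failing LBA turned into an approximate position on the disk. The column layout is the usual smartctl self-test log (number, test, status, remaining, lifetime hours, LBA of first error). The 512-byte sector size and the helper names `parse_selftest_row` and `lba_to_gb` are my own assumptions, not something taken from the diagnostics.

```python
import re

# Assumption: 512-byte logical sectors. Check the drive's reported sector size.
SECTOR_BYTES = 512

# One row of a SMART self-test log, columns separated by runs of spaces.
LOG_ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>.+?)\s{2,}(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)$"
)


def parse_selftest_row(row):
    """Split a self-test log row into named fields (hypothetical helper)."""
    match = LOG_ROW.match(row.strip())
    if match is None:
        raise ValueError("not a self-test log row: %r" % row)
    lba = match.group("lba")
    return {
        "test": match.group("test"),
        "status": match.group("status"),
        "power_on_hours": int(match.group("hours")),
        # The LBA column is "-" when the test found no error.
        "first_error_lba": int(lba) if lba.isdigit() else None,
    }


def lba_to_gb(lba, sector_bytes=SECTOR_BYTES):
    """Approximate byte offset of an LBA, in decimal gigabytes."""
    return lba * sector_bytes / 1e9


row = "# 1  Extended offline    Completed: read failure       00%     15050         254704688"
info = parse_selftest_row(row)
print(info["status"], "at LBA", info["first_error_lba"])
print("roughly", round(lba_to_gb(info["first_error_lba"]), 1), "GB into the disk")
```

Under the 512-byte assumption, the first unreadable sector sits about 130 GB into the 500GB drive.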

Edited by Guest
Link to comment
1. `data2` failed (or so I think.) How long do I have?

  Not psychic, but if another disk fails you're likely going to lose data.

 

 

2. Preclear first? Or add first and preclear drives one by one later?

  In this case I would add it immediately, and you can't preclear after.

 

 

3. Switch `data2`, let it rebuild, then switch `data1`, let it rebuild? Or is there a non-destructive way of switching `data1`?

  Replace disk2 first, then upgrade disk1.

 

 

4. OK to put drives on case floor? Or duct tape them to floor?

  I wouldn't, but if you do at least make sure cooling is adequate.

 

 

 

 

 

 

 

Link to comment
23 hours ago, johnnie.black said:

Not psychic, but if another disk fails you're likely going to lose data.

Haha, of course! But I was asking more in terms of when the drive might go red-balled or such. When would it be completely unrecoverable?

23 hours ago, johnnie.black said:

In this case I would add it immediately, and you can't preclear after.

OK then, but I have another question: suppose I put in the first new 1TB drive right away to make sure the array won't die, and start the preclear on the second new 1TB drive. If it passes, I swap the two new drives and preclear the first one. Is this a good idea?

23 hours ago, johnnie.black said:

I wouldn't, but if you do at least make sure cooling is adequate.

Not sure if I want to say my cooling is adequate with confidence (spoiler alert: probably not). I can still assure the case has positive airflow from the front of the case, and air is blown out through the back.

 

23 hours ago, trurl said:

If you're not increasing the number of drives why do you need to do this?

Simple. My server has three bays - two 3.5s and one 2.5 cage. I don't know if the diagnostics show but the failing drive `disk2` is a 2.5 inch drive.

 

The two new drives are 3.5 inch, so I would have to duct tape both to the floor of the case.

 

I plan to keep the 500GB as a hot spare in case things go wrong (but realistically it probably couldn't replace the 1TB drives once they go into the array).
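
To spell out why the 500GB spare likely won't help: as far as I understand unRAID's rule, a replacement disk has to be at least as large as the disk it replaces and no larger than parity. A tiny sketch of that check, where `can_replace` is just a name made up for illustration and the rule itself is my assumption about how unRAID behaves:

```python
def can_replace(old_gb, new_gb, parity_gb):
    """True if new_gb could stand in for old_gb under the assumed rule:
    at least as big as the old disk, no bigger than the parity disk."""
    return old_gb <= new_gb <= parity_gb


# Sizes from this thread: 1TB parity, 500GB data disks, 1TB replacements.
print(can_replace(old_gb=500, new_gb=1000, parity_gb=1000))   # upgrade 500GB to 1TB
print(can_replace(old_gb=1000, new_gb=500, parity_gb=1000))   # 500GB spare for a 1TB disk
```

The first call comes out True and the second False, which matches the worry about the spare.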

Link to comment
12 minutes ago, ideaman924 said:

when might the drive go red-balled or such?

 

When unRAID tries to write something to it and there's a write error.

 

13 minutes ago, ideaman924 said:

suppose I put in the new 1TB drive right away to make sure the array won't die, then start the preclear on the second new 1TB drive, then if it passes, swap the two new drives, and preclear the first new 1TB drive? Is this a good idea?

 

You can, but it seems unnecessary to me. If the disk has a problem detectable by preclear, it will most likely also fail during the rebuild or on the first parity check after that.

 

14 minutes ago, ideaman924 said:

Not sure if I want to say my cooling is adequate with confidence (spoiler alert: probably not). I can still assure the case has positive airflow from the front of the case, and air is blown out through the back.

 

Check the temps after use. Ideally they should always stay below 40C, 45C tops.
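
If you want to keep an eye on it, something like this quick sketch applies those two thresholds to whatever temperatures you read off the GUI or SMART reports. The readings in the example are made up, and `classify_temp` is only an illustrative helper.

```python
IDEAL_MAX_C = 40  # ideally always below this
HARD_MAX_C = 45   # "45C tops"


def classify_temp(temp_c):
    """Bucket a disk temperature using the two thresholds above."""
    if temp_c < IDEAL_MAX_C:
        return "ok"
    if temp_c <= HARD_MAX_C:
        return "warm"
    return "too hot"


# Hypothetical readings, not taken from the posted diagnostics.
readings = {"parity": 36, "disk1": 43, "disk2": 47}
for name, temp in readings.items():
    print(name, temp, classify_temp(temp))
```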

 

Link to comment
53 minutes ago, trurl said:

And I think it's possible that unRAID would try to write to the disk if it has a read error, since it will try to calculate the data from parity and then write it back to the disk.

 

Correct, I meant that but was not very clear for people unaware of how unRAID handles a read error. Thanks for clarifying.
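
For those unfamiliar with the idea, here is a toy sketch of reconstructing unreadable data from parity, assuming single parity is a plain XOR across the data disks. Real unRAID does this across whole disks inside its driver; the three-byte blocks and `xor_blocks` are purely illustrative.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, position by position."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must all have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy contents of the same block on each data disk.
data1 = b"\x12\x34\x56"
data2 = b"\xab\xcd\xef"

# Parity is the XOR of all data disks.
parity = xor_blocks(data1, data2)

# If data2 cannot be read, XOR parity with the remaining data disk(s).
rebuilt = xor_blocks(parity, data1)
print(rebuilt == data2)
```

The rebuilt block is what would then get written back to the failing disk, and a failure of that write is what would disable it.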

Link to comment

OK, replacement disks arrived today. This is probably the most ghetto setup ever:

 

IMG_0314.thumb.jpg.8ac32f7bee7b7ff9c550f2b6c5003de9.jpg

 

The two new disks are on the left side. The dying disk was on the right side, where the array of holes is (the 2.5 enclosure). It isn't shown because it's not there anymore.

 

Rebuild started and preclear started. Once preclear completes I'll take out the other 500GB disk and start rebuilding again.

Link to comment
On 25.10.2017 at 2:02 PM, ideaman924 said:

OK, replacement disks arrived today. This is probably the most ghetto setup ever:

 

IMG_0314.thumb.jpg.8ac32f7bee7b7ff9c550f2b6c5003de9.jpg

 

The two new disks are on the left side. The dying disk was on the right side, where the array of holes is (the 2.5 enclosure). It isn't shown because it's not there anymore.

 

Rebuild started and preclear started. Once preclear completes I'll take out the other 500GB disk and start rebuilding again.

 

Your bottom disk will probably run very hot. The disk above might get hot as well because it will absorb heat from the disk below it. If you want to keep them in the bottom of the case, you are much better off putting them on their sides with some air around them. You can also strap a fan somewhere, just to create some airflow along the drives. Any fan you have lying around might do the job. At least your drives won't cook. Also, make some kind of support so they don't tip over if you move the case.

Link to comment

Order a pair of these 3.5 to 5.25 adapters or something similar. Those are all of $5 each with free shipping (at this particular moment in time) from the geek's favorite store named after not-old eggs. ;)

 

Worst case scenario, slide a couple of #2 Ticonderoga pencils between those drives. The wood will insulate, the hexagon shape will prevent rolling, and the gap will allow air through!

Link to comment
