Pducharme, posted November 10, 2017 (edited November 19, 2017 by Pducharme):

Hi,

Two days ago, I wanted to replace my smallest array disk (2TB) with a new 5TB disk, and I wanted to pre-clear it first. So I connected it to the server (live) using an "iStarUSA 5in3 drive cage" (which is supposed to support live hot-swapping). When I did that, two of my other drives (WD Red 3TB, 4.5 years old) went offline and can't be detected anymore! At first I thought it might be the drive cage, so I ejected them and connected them directly to SATA and power cables (NOT LIVE, the server was powered off). They never came back online, so I suppose they are in fact dead (and not under warranty).

I have 2 parity drives, so I was able to restart the array (unprotected). My plan has changed from replacing the 2TB with the 5TB to replacing one of the dead 3TB drives with the 5TB and removing the other one from the array.

This is what I have done so far:
- Completed the pre-clear of the new 5TB; everything is good and the drive is ready.
- Copied the data from one of the two failed drives to the other drives in the array (the data was still readable from the "virtual" disk that is emulated when a drive has failed).

My next steps would be these (PLEASE, CAN SOMEONE WHO KNOWS HOW TO DEAL WITH THIS CONFIRM IT'S OK):
1. Stop the array.
2. Shut down the server.
3. Try an old hard disk in one of the two bays of the drive cage where a failed drive was (I just want to be sure the cage won't kill a new drive!).
4. Repeat to test the 2nd bay of the 2nd drive cage.
5. If everything is fine, shut down the server.
6. Install the new 5TB in the drive cage.
7. Power up the server.

Now, how do I achieve this (what course of action is needed)? I think I need to replace the failed drive BEFORE removing the other drive? I don't want to lose the data of the second failed drive before replacing it with the 5TB drive:
- Rebuild the drive that still has data onto the 5TB.
- Remove the other drive from the array (and not replace it).
JorgeB, posted November 10, 2017:
58 minutes ago, Pducharme said: "Now, how to achieve this?"
Just assign the new disk to the slot you want to rebuild. When the rebuild is done, do a new config without the missing and now empty drive; parity will then need to be re-synced.
Pducharme (author), posted November 10, 2017:
Thanks @johnnie.black, the drive is rebuilding. It will take a painful 12 hours and 30 minutes. After that, can I just stop the array, remove disk 6 (the one I emptied), then start the array again? Will disk6 stay empty, or will all the drives get bumped up (disk7 --> disk6, disk8 --> disk7, etc.)? Thanks!
JorgeB, posted November 10, 2017:
You need to do a new config to remove the emulated disk6; you can then assign the other data drives any way you like.
trurl, posted November 10, 2017:
Are you sure you don't want the data on disk6? With dual parity, it should be possible to rebuild it also. And it wasn't really necessary to copy files from the emulated disks to other disks in the array. You could have just rebuilt to save that data. Writing to other drives in the array while it's unprotected, or just reading other drives in the unprotected array to rebuild a disk: which is more risky? I'm not sure. In the end though, you rebuilt anyway, so you have done both. I would have considered copying the data to drives not in the array (or to another system completely) instead of writing to drives in the unprotected array.
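For readers wondering how a failed disk can still be read as an "emulated" disk and later rebuilt, here is a minimal sketch of the idea for the simplified single-parity case. It is a toy model, not Unraid's actual implementation: the disk contents and the helper name `xor_blocks` are made up for illustration, and with two disks missing at once (as in this thread) the second parity drive is also needed, which relies on a different calculation that is not shown here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three toy "data disks" of equal size (hypothetical contents).
disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# Single parity is the XOR of all data disks.
parity = xor_blocks(disks)

# Pretend disk index 1 has failed: its contents can be recomputed
# from the parity plus every surviving data disk.
survivors = [d for i, d in enumerate(disks) if i != 1]
rebuilt = xor_blocks(survivors + [parity])

assert rebuilt == disks[1]
```

The same calculation serves both cases: reading the emulated disk computes the missing bytes on the fly, while a rebuild writes them to the replacement drive. Either way, every surviving disk has to be read in full, which is why the rebuild itself puts load on the remaining drives.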
JorgeB, posted November 10, 2017:
25 minutes ago, trurl said: "Are you sure you don't want the data on disk6? With dual parity, it should be possible to rebuild it also."
That's a good point. I assumed he only has one spare and wants to bring the array back to a protected state, but sometimes I assume wrong.
JorgeB, posted November 10, 2017:
3 minutes ago, johnnie.black said: "wants to bring the array to a protected state"
And technically the array would be protected after one of the failed drives is rebuilt, so if you plan to replace the other drive in the near future you could leave it like that for a while.
Pducharme (author), posted November 11, 2017:
2 hours ago, johnnie.black said: "And technically the array would be protected after one of the failed drives is rebuilt, so if you plan to replace the other drive in the near future you could leave it like that for a while."
Question: I can get a 6TB drive, but I have dual parity drives (2 x 5TB). Can I use the 6TB as one of the new parity drives, then use the older 5TB as a data drive? Or do I need to upgrade both parity drives to 6TB?
JorgeB, posted November 11, 2017:
5 hours ago, Pducharme said: "Can I use the 6TB as one of the new parity drives, then use the older 5TB as a data drive?"
You can, and you can use the parity swap procedure (with parity1 only), so the array will remain protected.
Pducharme (author), posted November 11, 2017:
8 hours ago, johnnie.black said: "You can, and you can use the parity swap procedure (with parity1 only), so the array will remain protected."
OK. My dead drive is already empty. Can I stop my array, remove the drive, start the array, and then do the parity swap? Or should I do the parity swap first and then, when it is done, take the 5TB from parity to replace the empty bad drive? Which would be fastest?
JorgeB, posted November 11, 2017:
Do the parity swap procedure. There are two steps: first parity is copied to the new parity disk, then the disabled disk is rebuilt using the old parity disk. Instructions here: https://wiki.lime-technology.com/The_parity_swap_procedure
Pducharme (author), posted November 17, 2017 (edited November 17, 2017 by Pducharme):
On 2017-11-11 at 11:22 AM, johnnie.black said: "Do the parity swap procedure, there are two steps, first parity is copied to new parity disk, then the disabled disk is rebuilt using the old parity disk, instructions here: https://wiki.lime-technology.com/The_parity_swap_procedure"
OK, I followed the process. Now it has been sitting at "Copying 100%" for a while (1 hour at 100%). Hovering over the 100% says that it is not running. I have a "Cancel" button. The disk no longer displays a blue icon, and I haven't regained control to be able to start the rebuild. Also, the array is displayed as "off-line" at the top, but the status bar says "array started" in green, and the drive (the old parity that is now in the bad disk's location) still says that its content is emulated.
So, do I have to click "Cancel" and then start the array to start the rebuild, even though it still says it is copying? The new parity 1 is a 6TB and the old one was a 5TB; maybe the 100% complete was for the first 5TB and it is now doing something with the remaining 1TB? Please advise what I should do now.
JorgeB, posted November 17, 2017 (edited November 17, 2017 by johnnie.black):
After the copy is done it will zero the rest of the new parity disk; the array will stop when it's done.
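To illustrate why zeroing the remainder is a sensible last step of the copy, here is a minimal sketch using plain XOR parity on toy byte strings. It is only a model under stated assumptions, not Unraid's code: the function names `xor_parity` and `parity_copy` and the disk contents are hypothetical. The idea is that beyond the end of the largest data disk there is nothing to XOR, so valid parity there is all zeros.

```python
def xor_parity(disks, size):
    """XOR parity over `size` bytes; space past the end of a disk counts as zeros."""
    parity = bytearray(size)
    for disk in disks:
        for i, byte in enumerate(disk):
            parity[i] ^= byte
    return bytes(parity)

def parity_copy(old_parity, new_size):
    """Copy the old parity to a larger disk, then zero the remaining space."""
    if new_size < len(old_parity):
        raise ValueError("new parity disk must be at least as large as the old one")
    return old_parity + bytes(new_size - len(old_parity))

# Toy data disks; the largest is 3 bytes, like the old (smaller) parity disk.
data_disks = [b"\x11\x22\x33", b"\x0f\xf0", b"\x55\x66\x77"]
old_parity = xor_parity(data_disks, 3)

# "Swap" to a 5-byte parity disk: copy, then zero-fill the tail.
new_parity = parity_copy(old_parity, 5)

# The result matches parity computed from scratch at the larger size.
assert new_parity == xor_parity(data_disks, 5)
```

In this model the copied-and-zeroed disk is already valid parity for the existing data disks, which would explain why the procedure can move straight on to rebuilding the disabled disk without a separate parity sync.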
Pducharme (author), posted November 17, 2017 (edited November 17, 2017 by Pducharme):
3 hours ago, johnnie.black said: "After the copy is done it will zero the rest of the new parity disk, the array will stop when it's done."
How long could it take? If the whole copy took 13 hours for 5TB, should I count on 1/5 of 13 hours for the 1TB of free space? I checked and it has still been showing "Copying, 100% completed" since 12:15pm (about 5 hours ago). Since no progress is shown for this part, is there a way to check how long it will take? I suspect it shouldn't take that long just to zero 1TB... I tried "Cancel" and it did nothing. At this point, I don't know what I can do! I need the array online before 7pm, when it will be needed! @johnnie.black, any instructions for me? You seem like the pro for these kinds of processes... I added pictures of what I can see.
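The 1/5-of-13-hours guess can be worked out as simple arithmetic. This is a rough sketch that assumes zeroing runs at the same average throughput as the parity copy, which may not hold in practice; the function name `estimate_hours` is made up for illustration.

```python
def estimate_hours(size_tb, rate_tb_per_hour):
    """Time needed to process `size_tb` terabytes at a constant rate."""
    return size_tb / rate_tb_per_hour

copied_tb = 5.0    # size of the old parity disk that was copied
copy_hours = 13.0  # observed duration of the copy
extra_tb = 1.0     # 6TB new parity minus 5TB old parity

# Average throughput observed during the copy (assumed to apply to zeroing too).
rate = copied_tb / copy_hours

zero_hours = estimate_hours(extra_tb, rate)
print(f"{zero_hours:.1f} hours")  # prints "2.6 hours"
```

By that estimate, 5 hours at 100% is roughly twice what zeroing 1TB should need, which supports the suspicion that something other than slow zeroing was going on.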
JorgeB, posted November 17, 2017:
Post your diagnostics; grab them on the console by typing diagnostics
Pducharme (author), posted November 17, 2017:
6 minutes ago, johnnie.black said: "Post your diagnostics, grab them on the console by typing diagnostics"
The server got rebooted!! Now it doesn't even see that it did the swap procedure, and I had to restart it from scratch. So it's back at 1% and will take another 13 hours, I suppose. Attached are the diagnostics files: unraid-diagnostics-20171117-1830.zip
JorgeB, posted November 17, 2017:
Rebooted while getting the diags or before that?
Pducharme (author), posted November 17, 2017 (edited November 17, 2017 by Pducharme):
No, it was before. I triggered it by mistake (not a script error, just my own human error). I hope you can spot something obvious in the diag files. The swap is at 3% now... My Plex users will be mad at me tonight, hehe.
JorgeB, posted November 18, 2017:
I found it strange that the previous attempt was stuck at 100% for so long, so I did some testing. There is a bug in the newer RCs with the parity swap procedure when a cleared disk is used as the new parity, so it will get stuck again. I recommend you cancel the procedure, downgrade to v6.3.5 and start over.
Pducharme (author), posted November 18, 2017:
1 hour ago, johnnie.black said: "I found it strange that the previous attempt was so long stuck at 100% so did some testing and there is a bug on the newer rcs with the parity swap procedure if using a cleared disk as the new parity, so it will get stuck again, I recommend you cancel the procedure, downgrade to v6.3.5 and start over."
OK, I tried to downgrade to 6.3.5. Now the server doesn't boot at all!! Here is the screenshot: [image]
Pducharme (author), posted November 18, 2017:
In the end, I had to remove the EFI folder from the flash drive, and the server is now back on 6.3.5. I started the parity swap procedure for the third time. I HOPE to get up tomorrow and be greeted by a stopped array, ready to start the rebuild of the failed drive onto the ex-parity drive. Thanks a lot for the support. Going to bed now; I'll update the thread tomorrow.
Pducharme (author), posted November 18, 2017:
The array did stop as it was supposed to. I started the rebuild. Thank you! I'll update the thread to resolved when the rebuild completes.
Pducharme (author), posted November 19, 2017:
It is finished, and everything is back to normal. Thank you guys for the help. My 52TB array is now working correctly.