BradJ (Author), Posted July 24, 2017 (edited July 24, 2017)
I followed your directions: stopped the array, ran blkdiscard /dev/sdh (after verifying that sdh is the Intel drive), unassigned the Intel drive from the array, and started the array. Now the other cache drive shows as unmountable. The cache data is not super critical, but I would like to save it, of course. Diagnostics attached.
Attachments: tower-diagnostics-20170724-1423.zip, FireShot Capture 2 - Tower_Main - http___192.168.1.100_Main.pdf
JorgeB, Posted July 24, 2017
Something weird is going on with that pool: it's complaining that it's missing a device, but the other device was supposedly unused. I believe the best option is to use btrfs restore on it; see instructions here (#2):
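For readers without the linked instructions, the btrfs restore approach mentioned above can be sketched roughly as follows. This is a minimal sketch, not the linked procedure: the function name `restore_from_btrfs` is made up for illustration, and the device and destination paths in the usage line are placeholders that do not come from this thread. `btrfs restore` only reads from the source device, so the destination must be a different, mounted filesystem with enough free space.

```shell
# Sketch only: copy files off an unmountable btrfs device without writing to it.
# restore_from_btrfs is a hypothetical helper name; both arguments are placeholders.
restore_from_btrfs() {
    local device="$1" dest="$2"
    mkdir -p "$dest" || return 1
    # -D: dry run, only lists what would be restored; nothing is written.
    btrfs restore -v -D "$device" "$dest" || return 1
    # Real run; -i keeps going past errors instead of stopping at the first one.
    btrfs restore -v -i "$device" "$dest"
}

# Example call (placeholder paths, adjust to the actual device and an array disk):
# restore_from_btrfs /dev/sdX1 /mnt/disk1/restore
```

Running the dry run first gives a quick indication of whether the tool can read the filesystem at all before any time is spent copying data.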
BradJ (Author), Posted July 24, 2017
Okay, running the restore command now. I'll format the drive(s) and re-create the array once it's done. In the meantime, should I order a new SSD to replace the Intel drive now? I'd rather not mess with an intermittent drive. What concerns me about this drive is my experience with it. I had this Intel SSD in a Windows laptop, and after a while it went intermittent and then completely failed. The drive that is in my server is the warranty replacement. I'm not sure if there is a chronic problem with this model or not. Regardless, I'd gladly replace it if you feel it might be causing the issues I am experiencing.
JorgeB, Posted July 24, 2017
If that drive was problematic before, I would consider replacing it, but you can also give it a second chance on the onboard controller; if there are more issues, then definitely replace it.
BoHiCa, Posted July 24, 2017
If the problematic drive (and replacement) are Intel 320 series, be warned. I was rudely reminded of this one earlier this week. Google "Intel 8MB Bug" for a refresher. Intel pooped the bed on this one! Best consensus advice is if you have any of the Intel 320 series SSDs (any size) in use, *immediately* back them up and take them out of service. Apparently even the firmware fix doesn't resolve the problem. Reminds me of the 20 GiB Seagate bricks from the late 1990's.
BradJ (Author), Posted July 24, 2017
3 hours ago, BoHiCa said: "If the problematic drive (and replacement) are Intel 320 series, be warned. I was rudely reminded of this one earlier this week. Google "Intel 8MB Bug" for a refresher. Intel pooped the bed on this one! Best consensus advice is if you have any of the Intel 320 series SSDs (any size) in use, *immediately* back them up and take them out of service. Apparently even the firmware fix doesn't resolve the problem. Reminds me of the 20 GiB Seagate bricks from the late 1990's."
Mine is an Intel 330 240GB SSD. Do you know if they fixed the issue in the 330 series?
BradJ (Author), Posted July 25, 2017
johnnie.black, just wanted to give you an update. I brought the cache array back online after formatting the #1 Samsung SSD and restored my data. Then I brought the Intel SSD into the cache array to have redundancy. That's when all kinds of BTRFS errors started showing up in the log. Something must be failing on this drive. New WD 512GB SSD ordered from Newegg. I'd rather not waste any more of our time on a suspect drive. One last question as I wait 4-7 days for the new drive: should I remove the Intel drive while I wait? I have all the appdata backed up from the btrfs restore, so I don't see much of a risk in just using one cache drive for a week or so. What do you recommend?
Attachment: tower-diagnostics-20170724-2219.zip
JorgeB, Posted July 25, 2017
The only btrfs errors I'm seeing for now are from a corrupt libvirt.img (VM manager); you'll need to recreate it. After that, I'd leave both SSDs in for now and keep monitoring, mostly to confirm whether there's really a problem with the Intel SSD.
BradJ (Author), Posted July 25, 2017
Ok, I deleted libvirt, rebooted, and ran a non-correcting scrub with no errors found; the log initially has no errors. I will monitor for errors over the next few days. If any errors come up, I'm just going to put in the new SSD. I'll give you an update as events unfold. Thanks again, johnnie.black!
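The non-correcting scrub and the follow-up monitoring described above could be done from the command line along these lines. This is a rough sketch under assumptions: the helper names `nonzero_btrfs_errors` and `check_pool` are invented for illustration, and /mnt/cache is assumed to be where the cache pool is mounted. The thread does not say which commands or paths were actually used.

```shell
# Keep only the btrfs per-device error counters that are non-zero.
# Input: `btrfs device stats` output, one "[device].counter  value" pair per line.
# nonzero_btrfs_errors is a hypothetical helper name.
nonzero_btrfs_errors() {
    awk 'NF >= 2 && $NF != 0'
}

# Read-only scrub followed by an error-counter check (hypothetical helper).
# The default mount point is an assumption; pass the real one as the first argument.
check_pool() {
    local mnt="${1:-/mnt/cache}"
    # -B: stay in the foreground until the scrub finishes; -r: read-only, repair nothing.
    btrfs scrub start -B -r "$mnt" || return 1
    # Empty output here means every error counter is still zero.
    btrfs device stats "$mnt" | nonzero_btrfs_errors
}

# Example call:
# check_pool /mnt/cache
```

The device stats counters persist across reboots, so re-running the check every day or so shows whether one particular device keeps accumulating errors.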
BradJ (Author), Posted August 1, 2017 (edited August 1, 2017)
It's been almost a week and all appears to be working. I don't know if you work for Lime Technology or not, but these forums wouldn't be the same without you. THANK YOU, johnnie.black!