bw1 Posted March 2, 2017 (edited)

A couple of days ago I noticed that one of my disks had redballed. I figured it was likely bad because of errors I'd seen, so I decided to replace it. The failed drive was a Hitachi 2TB; I replaced it with an HGST 4TB Coolspin and started the rebuild. I'm only seeing between 1 and 2 MB per second rebuild speed. At this rate, it will take about 25 - 30 days to rebuild the drive. That doesn't seem normal.

Here is my system info:
ASUS M4A785-M Motherboard
AMD Sempron 145 CPU
1GB DDR2 RAM
Corsair CX500 PSU
UnRAID v5.0.6
Parity: Seagate ST4000DM000 4TB
Cache: None
Array: ten 2TB WD Green, one 2TB Hitachi, two 4TB HGST Coolspin (including the new one that's being rebuilt)

The new drive being rebuilt is connected to the motherboard. I checked the data and power connections to the drives when I replaced the drive and they seemed OK.

I'm seeing this repeatedly in the log:

/usr/bin/tail -f /var/log/syslog
Mar 2 08:57:22 Tower kernel: ata2.00: configured for UDMA/33
Mar 2 08:57:22 Tower kernel: ata2: EH complete
Mar 2 08:57:22 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen
Mar 2 08:57:22 Tower kernel: ata2.00: irq_stat 0x01400000, PHY RDY changed
Mar 2 08:57:22 Tower kernel: ata2: SError: { Persist HostInt PHYRdyChg 10B8B }
Mar 2 08:57:22 Tower kernel: ata2.00: failed command: READ DMA EXT
Mar 2 08:57:22 Tower kernel: ata2.00: cmd 25/00:00:60:10:d8/00:04:1b:00:00/e0 tag 0 dma 524288 in
Mar 2 08:57:22 Tower kernel: res 50/00:00:5f:10:d8/00:00:1b:00:00/e0 Emask 0x50 (ATA bus error)
Mar 2 08:57:22 Tower kernel: ata2.00: status: { DRDY }
Mar 2 08:57:22 Tower kernel: ata2: hard resetting link
Mar 2 08:57:29 Tower kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 08:57:29 Tower kernel: ata2.00: configured for UDMA/33
Mar 2 08:57:29 Tower kernel: ata2: EH complete
Mar 2 08:57:30 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen
Mar 2 08:57:30 Tower kernel: ata2.00: irq_stat 0x01400000, PHY RDY changed
Mar 2 08:57:30 Tower kernel: ata2: SError: { Persist HostInt PHYRdyChg 10B8B }
Mar 2 08:57:30 Tower kernel: ata2.00: failed command: READ DMA EXT
Mar 2 08:57:30 Tower kernel: ata2.00: cmd 25/00:00:60:50:d8/00:04:1b:00:00/e0 tag 0 dma 524288 in
Mar 2 08:57:30 Tower kernel: res 50/00:00:5f:50:d8/00:00:1b:00:00/e0 Emask 0x50 (ATA bus error)
Mar 2 08:57:30 Tower kernel: ata2.00: status: { DRDY }
Mar 2 08:57:30 Tower kernel: ata2: hard resetting link

Perhaps an issue with the data connection? Any ideas on what to check or how to fix this?

Attachment: syslog-2017-03-02.txt

Edited March 5, 2017 by bw1 (solved)
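For anyone triaging a similar syslog, a quick tally of which ATA port the errors belong to can help before opening the case. The following is only a minimal sketch in Python: the sample text is a subset of the lines quoted in the post above, while the names `SAMPLE`, `PORT`, and `summarize` are made up for illustration and are not part of unRAID or any tool mentioned in this thread.

```python
import re
from collections import Counter

# A subset of the syslog lines quoted in the post above.
SAMPLE = """\
Mar 2 08:57:22 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen
Mar 2 08:57:22 Tower kernel: ata2.00: failed command: READ DMA EXT
Mar 2 08:57:22 Tower kernel: ata2: hard resetting link
Mar 2 08:57:29 Tower kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 08:57:30 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen
Mar 2 08:57:30 Tower kernel: ata2.00: failed command: READ DMA EXT
Mar 2 08:57:30 Tower kernel: ata2: hard resetting link
"""

# Matches "kernel: ata2: ..." and "kernel: ata2.00: ..." and captures the
# port name ("ata2") and the rest of the message.
PORT = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")


def summarize(log_text):
    """Count link resets and failed commands per ATA port."""
    resets, failures = Counter(), Counter()
    for line in log_text.splitlines():
        match = PORT.search(line)
        if not match:
            continue
        port, message = match.groups()
        if message.startswith("hard resetting link"):
            resets[port] += 1
        elif message.startswith("failed command"):
            failures[port] += 1
    return resets, failures


resets, failures = summarize(SAMPLE)
for port in sorted(set(resets) | set(failures)):
    print(port, "resets:", resets[port], "failed commands:", failures[port])
```

To run it against a real log, read the file into a string and pass that to `summarize` instead of `SAMPLE`. A port that dominates the counts is the first cable, backplane slot, or power lead to inspect.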
trurl Posted March 2, 2017

9 minutes ago, bw1 said:
Perhaps an issue with the data connection?

Probably. Your syslog has rotated. The one you posted is just full of those errors. Without more, I can't even tell which disk it is referring to. Can you get us the older logs? They are in /var/log.

Once you get this straightened out, you really should consider upgrading. It is difficult to support this old version since most people haven't used it in years.
bw1 (Author) Posted March 2, 2017

1 minute ago, trurl said:
Probably. Your syslog has rotated. The one you posted is just full of those errors. Without more, I can't even tell which disk it is referring to. Can you get us the older logs? They are in /var/log. Once you get this straightened out, you really should consider upgrading. It is difficult to support this old version since most people haven't used it in years.

I attached the one that I created when I first started the rebuild. Hope that helps.

I haven't checked here in a while. The 6.x version that I downloaded and actually used on a test server was still in beta. I'll definitely look into upgrading.

Attachment: syslog-2017-02-28.txt
trurl Posted March 2, 2017

According to that older syslog, ata2 is disk2, but I don't think we can trust that, because I assume you have rebooted since that log was taken. Have you looked in /var/log for other logs?

Did you test the new disk with preclear or anything?
bw1 (Author) Posted March 2, 2017

8 minutes ago, trurl said:
According to that older syslog, ata2 is disk2, but I don't think we can trust that, because I assume you have rebooted since that log was taken. Have you looked in /var/log for other logs? Did you test the new disk with preclear or anything?

Yes, the new disk was precleared 3 times about 3 years ago and has been in storage since then. I assume a disk won't go bad in storage.

Attachments: syslog.1, syslog.2
bw1 (Author) Posted March 2, 2017 (edited)

Well, those attached files don't look very readable! And I didn't zip them.

Edited March 2, 2017 by bw1
trurl Posted March 2, 2017

30 minutes ago, bw1 said:
Well, those attached files don't look very readable! And I didn't zip them.

I could use them OK. Looks like ata2 is disk2, but the disk you are rebuilding is disk3. Stop, shut down, and recheck the connections. The disk3 rebuild isn't going to be good if disk2 can't be read reliably, because rebuilding one disk depends on reading every other disk in the array.
bw1 (Author) Posted March 2, 2017

43 minutes ago, trurl said:
I could use them OK. Looks like ata2 is disk2, but the disk you are rebuilding is disk3. Stop, shut down, and recheck the connections. The disk3 rebuild isn't going to be good if disk2 can't be read reliably.

OK, I cancelled the rebuild, stopped, and shut down. The connections looked fine, but I pulled the first 3 data connectors from the SS-500 5-in-3 enclosure and reconnected them. Now I'm only getting 200-300 KB/s, so I think I made it worse.

Attachment: syslog-2017-03-02-2.zip
trurl Posted March 2, 2017

Now disk2 and disk4 are resetting connections, so it is worse.

Make sure both SATA and power connections are good at both ends. SATA connections should be square on the connector. If you have bundled your cables, you may be putting stress on the connection.
bw1 (Author) Posted March 2, 2017

56 minutes ago, trurl said:
Now disk2 and disk4 are resetting connections, so it is worse. Make sure both SATA and power connections are good at both ends. SATA connections should be square on the connector. If you have bundled your cables, you may be putting stress on the connection.

Thanks for your help. I'll check the connections again, but I'll have to do that later. I do have another power supply that I can try, and I also have a spare motherboard if I need to swap that out. I'll have to see whether I have more SATA cables.

BTW, when I went to shut down, I still had Windows File Explorer connected to the flash share and I was getting errors unmounting the drives. I had to shut down the desktop computer that was previously connected, restart it, and then reconnect the browser to the Tower. Then I noticed the parity drive was missing. So I definitely have some kind of connection problem.
RobJ Posted March 2, 2017

Those errors are symptomatic of a loose connection: the drive disappears and then reappears, with line corruption. That could be caused by loose connectors or bad power, and bad power is my best guess. It's possible your power supply is failing, or there are too many drives on this power rail.
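The link-level nature of these errors can be read straight off the `SErr 0x90a00` value in the first post's log, which the kernel also prints as `{ Persist HostInt PHYRdyChg 10B8B }`. As a rough illustration, here is a Python sketch that decodes that value. It covers only the four flags that appear in this thread's log; the SError register defines more bits than this, and the names `SERR_FLAGS` and `decode_serror` are invented for the example.

```python
# Bit positions for the four SError flags seen in this thread's log line.
# The register defines additional bits that are deliberately not listed here.
SERR_FLAGS = {
    9: "Persist",     # persistent communication or data integrity error
    11: "HostInt",    # internal error reported by the host adapter
    16: "PHYRdyChg",  # PHY ready signal changed state (link dropped/returned)
    19: "10B8B",      # 10b-to-8b decode error on the link
}


def decode_serror(value):
    """Return (flag names set in value, leftover bits this table does not cover)."""
    names = []
    leftover = value
    for bit in sorted(SERR_FLAGS):
        mask = 1 << bit
        if value & mask:
            names.append(SERR_FLAGS[bit])
            leftover &= ~mask
    return names, leftover


names, leftover = decode_serror(0x90A00)
print(names, hex(leftover))
```

PHYRdyChg and 10B8B both describe the physical link between the controller and the drive rather than the drive's media, which fits a cabling or power explanation better than a failing disk surface.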
bw1 (Author) Posted March 2, 2017

2 hours ago, RobJ said:
Those errors are symptomatic of a loose connection: the drive disappears and then reappears, with line corruption. That could be caused by loose connectors or bad power, and bad power is my best guess. It's possible your power supply is failing, or there are too many drives on this power rail.

I thought the CX500 was good for 15+ drives. I only have 14, and they're low-power drives. But thanks, that will be one thing I check, since I have another PSU available with a higher output.
RobJ Posted March 2, 2017

11 minutes ago, bw1 said:
I thought the CX500 was good for 15+ drives. I only have 14, and they're low-power drives. But thanks, that will be one thing I check, since I have another PSU available with a higher output.

If that is a Corsair CX500, the Corsair CX series power supplies do NOT have a good reputation! The advice has generally been to buy any Corsair power supply but the CX series. The higher-end Corsairs are decent. I'm almost shocked that you have been able to run very long with 14 drives. Bad power supplies fail sooner, and often fail to maintain correct voltages under load. You might try a PSU tester; they're fairly inexpensive.
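The "too many drives on this power rail" concern comes down to simple arithmetic: drives draw the most 12 V current while spinning up, and the sum has to fit under what the rail can deliver. Below is a minimal Python sketch of that check. Every number in the example call is a placeholder chosen for illustration, not a measured or published figure for the CX500 or for any drive in this thread; take real values from the PSU label and the drive datasheets. The function name `spinup_headroom` is also invented here.

```python
def spinup_headroom(drive_count, spinup_amps_per_drive, rail_amps, other_load_amps=0.0):
    """Return the 12 V current left over (in amps) if all drives spin up at once.

    A negative result means the assumed load exceeds the assumed rail rating.
    """
    demand = drive_count * spinup_amps_per_drive + other_load_amps
    return rail_amps - demand


# Placeholder inputs only: 14 drives, an assumed 2.0 A spin-up draw per drive,
# an assumed 34 A rail, and an assumed 5 A for the rest of the system.
print(spinup_headroom(14, 2.0, 34.0, other_load_amps=5.0))
```

Even when the headroom on paper is positive, an aging supply may no longer hold its voltages under load, which is the failure mode described in the post above.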
bw1 (Author) Posted March 5, 2017

On Thursday, March 02, 2017 at 2:29 PM, RobJ said:
If that is a Corsair CX500, the Corsair CX series power supplies do NOT have a good reputation! The advice has generally been to buy any Corsair power supply but the CX series. The higher-end Corsairs are decent. I'm almost shocked that you have been able to run very long with 14 drives. Bad power supplies fail sooner, and often fail to maintain correct voltages under load. You might try a PSU tester; they're fairly inexpensive.

Yes, it is the Corsair CX500. Like I said, when I selected it, it was one of the power supplies recommended here for up to 15 drives. But maybe it has gone bad.

I swapped the PSU out for a Seasonic X-650 and that seems to have fixed it:

Data-Rebuild in progress.
Total size: 4 TB
Current position: 326.82 GB (8%)
Estimated speed: 100.22 MB/sec
Estimated finish: 611 minutes
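The before-and-after numbers in this thread are easy to sanity-check. The following Python sketch assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB), which is an assumption on my part, though it does reproduce the 611-minute estimate shown above. The function names are hypothetical.

```python
def remaining_minutes(total_gb, done_gb, speed_mb_s):
    """Minutes left at a steady speed, using decimal GB and MB."""
    remaining_mb = (total_gb - done_gb) * 1000
    return remaining_mb / speed_mb_s / 60


def full_rebuild_days(total_gb, speed_mb_s):
    """Days needed to rebuild the whole disk at a steady speed."""
    return total_gb * 1000 / speed_mb_s / 86400


# After the PSU swap: 4 TB total, 326.82 GB done, 100.22 MB/s.
print(round(remaining_minutes(4000, 326.82, 100.22)))

# Before the swap: the same 4 TB at 2 MB/s and at 1 MB/s.
print(round(full_rebuild_days(4000, 2), 1), round(full_rebuild_days(4000, 1), 1))
```

At 1 to 2 MB/s a full 4 TB rebuild works out to somewhere between roughly 23 and 46 days, so the original "25 - 30 days" estimate was in the right range for the speeds reported.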