[SOLVED] Loss of Networking and Unresponsive - Complete System Hang



The system hangs completely and can only be recovered with a hard reset.


From the logs, it looks like I have disk issues. Unfortunately, two of the disks are very old, but they were all I had at the time. One disk (I believe it is ata7 in the FCP_Syslog) is brand new.
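In case it helps anyone reading a syslog like this, here is a minimal Python sketch for tallying error-like kernel messages per ATA port, so you can see whether one port (e.g. ata7) stands out. The regex, the `ERROR_HINTS` list, the function name `count_ata_errors`, and the sample lines are all assumptions for illustration; they are not taken from the attached logs.

```python
import re
from collections import Counter

# Matches kernel libata lines such as "kernel: ata7.00: exception Emask ..."
# and captures the port name ("ata7") plus the rest of the message.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Substrings that commonly show up in libata error/reset messages.
# This list is an assumption for illustration, not an exhaustive set.
ERROR_HINTS = ("exception", "failed command", "hard resetting link", "frozen")


def count_ata_errors(lines):
    """Return a Counter mapping ATA port name -> number of error-like lines."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line)
        if match and any(hint in match.group(2) for hint in ERROR_HINTS):
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines in the usual syslog layout.
    sample = [
        "Apr 18 03:40:01 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
        "Apr 18 03:40:01 Tower kernel: ata7: hard resetting link",
        "Apr 18 03:40:07 Tower kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
        "Apr 18 03:41:12 Tower kernel: ata3.00: failed command: READ DMA EXT",
    ]
    for port, n in count_ata_errors(sample).most_common():
        print(port, n)
```

To run it against a real file, pass the open file object instead of the sample list, e.g. `count_ata_errors(open("syslog.txt"))` (the filename here is hypothetical).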

 

When the system hangs, the hard drive indicator light stays lit, the system loses all network connectivity (except IPMI, which is handled by the BMC chip), and it is unresponsive to local commands. The local monitor also scrolls too fast to read, showing "XXX printk messages dropped".

 

I unplugged and reseated the SATA and power cables on all the drives, but the hang still happens, and only when the mover is running overnight. I've left FCP in troubleshooting mode on a few occasions and attached the logs that actually had data, along with an image of the local monitor scrolling.

 

Are my disks just too old to use, or could it be something more sinister with the motherboard?

UnRaid Log - 2017-04-18.zip

UnRaid Logs - 2017-04-21.zip

IMG_20170417_103833_013.jpg

Edited by YouAreTheOneNeo
Solved

Please provide the full diagnostics (obtained either via Tools->Diagnostics in the GUI or with the 'diagnostics' command at the command line). As well as logs, this will include other relevant information such as your configuration details and the SMART reports for all your drives. This helps immensely when trying to diagnose problems and give advice.

30 minutes ago, itimpi said:

Please provide the full diagnostics (obtained either via Tools->Diagnostics in the GUI or with the 'diagnostics' command at the command line). As well as logs, this will include other relevant information such as your configuration details and the SMART reports for all your drives. This helps immensely when trying to diagnose problems and give advice.

Are they the same as the two sets of diagnostics files I've attached in the two .zip files in my original post?

15 minutes ago, trurl said:

The disks look OK, except for 2 reallocations on parity. Probably a connection or controller issue. Since you don't have many disks, you could try connecting them all to the Intel ports and not the Marvell ports if they aren't already.

 

That parity drive has had those two reallocated sectors for a few years IIRC. I've been keeping an eye on it since I noticed, and it's never gone up, so it wasn't something I worried about.
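For anyone wanting to keep an eye on that count in a more repeatable way, here is a rough Python sketch that pulls the raw Reallocated_Sector_Ct value out of a SMART attribute table. It assumes the ten-column layout that `smartctl -A` prints for ATA drives; the function name `reallocated_sectors` and the sample table are made up for illustration and are not from the attached SMART reports.

```python
def reallocated_sectors(smart_attributes_text):
    """Return the raw Reallocated_Sector_Ct value, or None if the row is absent."""
    for line in smart_attributes_text.splitlines():
        fields = line.split()
        # Attribute rows are assumed to have ten columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None


if __name__ == "__main__":
    # Made-up excerpt in the layout of `smartctl -A` output.
    sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       2
"""
    print(reallocated_sectors(sample))
```

Saving the output of `smartctl -A /dev/sdX` to a file every so often and comparing the returned number between runs would show whether the count is actually stable.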

 

I just can't work out why it would be freezing the way it is, other than the age of the disks.

 

The cache and parity disks are attached to the onboard Intel controller; the other two are connected via the Marvell chip. Due to the restrictions in the case, it'd be a bit of a pain to move them over to the other Intel SCU connectors on the board, but it could be done. Is the Marvell chip known to cause issues?

16 hours ago, trurl said:

There have been some reports of issues with some Marvell controllers in some configurations. Don't know if it's really the problem here or not.

 

16 hours ago, johnnie.black said:

The Marvell ports (the first 4 white ports) on those ASRock boards are known to be very problematic and drop disks; never use them.

 

Ah okay, I need to rebuild my array now because disk 1 has the red x next to it. I'm assuming that if I move the disks across to different SATA ports, I'll lose all my data, unless I have a fully working array and move them one at a time with a rebuild between disk moves?
