YouAreTheOneNeo

[SOLVED] Loss of Networking and Unresponsive - Complete System Hang

12 posts in this topic

The system hangs completely and can only be recovered with a hard reset.


Viewing the logs, it looks like I have disk issues. Unfortunately, two of the disks are very old, but they were all I had at the time. One disk (I believe it is ata7 in the FCP_Syslog) is brand new.

 

When the system hangs, the hard drive indicator light is stuck on, the system loses all network connectivity (except IPMI, which is handled by the BMC chip), and it is unresponsive to local commands. The local monitor also scrolls too fast to read, showing "XXX printk messages dropped".

 

I unplugged and reseated the SATA and power cables on all the drives; however, the hang only occurs when the mover is running overnight. I've left FCP in troubleshooting mode on a few occasions and have attached the logs that actually contained data, along with an image of the local monitor scrolling.

 

Are my disks just too old to use, or could it be something more sinister with the motherboard?

UnRaid Log - 2017-04-18.zip

UnRaid Logs - 2017-04-21.zip

IMG_20170417_103833_013.jpg

Edited by YouAreTheOneNeo
Solved

Please provide the full diagnostics (obtained either via Tools->Diagnostics in the GUI or with the 'diagnostics' command from the command line). As well as the logs, this includes other relevant information such as your configuration details and the SMART reports for all your drives. This helps immensely when trying to diagnose problems and give advice.

30 minutes ago, itimpi said:

Please provide the full diagnostics (obtained either via Tools->Diagnostics in the GUI or with the 'diagnostics' command from the command line). As well as the logs, this includes other relevant information such as your configuration details and the SMART reports for all your drives. This helps immensely when trying to diagnose problems and give advice.

Are those the same as the two sets of diagnostics files I've attached in the two .zip files in my original post?


The disks look OK, except for 2 reallocated sectors on parity. This is probably a connection or controller issue. Since you don't have many disks, you could try connecting them all to the Intel ports instead of the Marvell ports, if they aren't already.
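
If it helps to narrow down which port is misbehaving, here is a rough Python sketch that tallies trouble-looking ATA lines per port from a saved syslog. The function name `count_ata_trouble` and the marker list are my own assumptions for illustration, not something from the diagnostics tool, and the marker list is far from exhaustive. Note that the kernel's ataN numbering still has to be matched to a physical connector by checking the boot messages.

```python
import re
from collections import Counter

# Matches kernel libata prefixes such as "ata7:" or "ata7.00:".
ATA_PREFIX = re.compile(r"\bata(\d+)(?:\.\d+)?:")

# Substrings treated as signs of link or controller trouble.
# This list is an assumption for illustration only.
TROUBLE_MARKERS = (
    "exception",
    "hard resetting link",
    "failed command",
    "link is slow to respond",
)


def count_ata_trouble(lines):
    """Count trouble-looking syslog lines per ATA port (e.g. 'ata7')."""
    counts = Counter()
    for line in lines:
        match = ATA_PREFIX.search(line)
        if match and any(marker in line for marker in TROUBLE_MARKERS):
            counts["ata" + match.group(1)] += 1
    return counts
```

It accepts any iterable of lines, so an open file handle for the extracted syslog works. If one port dominates the counts and it sits on the Marvell controller, that would point toward the controller rather than the disks.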

15 minutes ago, trurl said:

The disks look OK, except for 2 reallocated sectors on parity. This is probably a connection or controller issue. Since you don't have many disks, you could try connecting them all to the Intel ports instead of the Marvell ports, if they aren't already.

 

That parity drive has had those two reallocated sectors for a few years IIRC. I've been keeping an eye on it since I noticed, and the count has never gone up, so it wasn't something I worried about.

 

I just can't work out why it would be freezing the way it is, other than the age of the disks.

 

The cache and parity disks are attached to the onboard controller; the other two are connected via the Marvell chip. Due to the restrictions in the case, it'd be a bit of a pain to move them over to the other Intel SCU connectors on the board, but it could be done. Is the Marvell chip known to cause issues?

55 minutes ago, YouAreTheOneNeo said:

Is the Marvell chip known to cause issues?

There have been some reports of issues with some Marvell controllers in some configurations. I don't know whether that's really the problem here.


The Marvell ports (the first 4 white ports) on those ASRock boards are known to be very problematic and to drop disks; never use them.

16 hours ago, trurl said:

There have been some reports of issues with some Marvell controllers in some configurations. I don't know whether that's really the problem here.

 

16 hours ago, johnnie.black said:

The Marvell ports (the first 4 white ports) on those ASRock boards are known to be very problematic and to drop disks; never use them.

 

Ah okay, I need to rebuild my array now because disk 1 has the red x next to it. I'm assuming that if I move the disks across to different SATA ports, I'll lose all my data, unless I have a fully working array and move them one at a time with a rebuild between disk moves?


You can shut down and swap ports; it won't affect your data or the ability to rebuild the disabled disk.

1 minute ago, johnnie.black said:

You can shut down and swap ports; it won't affect your data or the ability to rebuild the disabled disk.

 

Okay thanks, I'll have a go when I get access to the PC later tonight.


Switching the SATA ports has led to a stable configuration, at least for the last two days. Does anyone know what the issue with those Marvell controllers is, and if it will be fixed?


Copyright © 2005-2017 Lime Technology, Inc. unRAID® is a registered trademark of Lime Technology, Inc.