[Solved] Crash after approx 24 hours


benyaki


Having an issue with a server.

The problem initially started with the USB drive not being recognized and not booting (just a halt on boot). I upgraded to 6.2.4 and tried to boot, but was still having problems. I made the USB bootable on another Mac (I only have Macs) that was running a different OS version, and it finally boots (even though no errors were given before when making the USB bootable, etc.).

 

Checked that everything was working, fixed a few settings that had been changed, and the server runs fine for approx 24 hours, then becomes unresponsive. I cannot access the webGUI and cannot telnet (it will go as far as "Connected to tower. Escape character is '^]'." but does not produce the additional line for login). The console also does not respond. One thing I just noticed: when plugging in a USB keyboard, none of its lights (caps lock, num lock, etc.) would light up, if that helps to diagnose the problem at all.

I have also tried replacing the files (upgrade again with same version).

 

UNRAID version 6.2.4

Motherboard: MSI 880-GMA-E53 (MS-7623) - upgraded to latest BIOS when the USB drive was not being recognized

CPU: AMD Sempron 140 (with only 1 core right now, did not bother unlocking the second until things start working correctly)

RAM - 2GB

Gigabit ethernet

All HDD's are running off the motherboard SATA ports. There is an additional SATA card installed but not being used at this time.

All SATA cables checked for connectivity problems, changed RAM slots to ensure no problems there (this is all before I was able to boot the system)

PSU - Antec 500w?

HDDs 5x 2.0TB WD EARX (1 parity, 4 data)

Cache 500GB

 

Plugins:

 

Sickbeard and couchpotato (PHAZE)

Nerd tools (cannot install PERL for some reason...)

See image for additional mostly stock plugins

 

Docker:

 

Plex

SABnzbd

 

Attached are some screenshots of the setup, plugins, docker, system info etc as well as a syslog (sorry, cannot capture a syslog once it stops working, this is just after it has booted up, wondering if there is something in there that someone can pick out). Have started a tail

 

syslog.txt

main.jpg.70a2693f00766bf3c9fc95193a345deb.jpg

Link to comment

It has been running fine for quite a long time on UNRAID 6 without issue. This is only a recent problem. Any suggestions, or other information that I could provide? I will try to capture a syslog after a crash if possible; however, everything seems to lock up.

You might try running it in SAFE mode without any running dockers and see if it stays up. If so then I still think the additional load from your plugins and dockers is too much for your hardware.
Link to comment

I was just looking through the syslog and came across some things that might be suspicious and causing problems. Check the items related to Plex: it notes out of memory (appears after a library update)?

I just ran a library update on one section, as well as playing a file remotely (transcoded) and ram usage stayed around 50%. Load was around 0.8 with the single core being used.

Suggestions if it is hardware? Start by unlocking the second core and upping the RAM to, say, 4GB?

 

New diagnostics attached

tower-diagnostics-20170120-0729.zip
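
The RAM and load figures above were read off by hand at one moment. If it helps, here is a minimal sketch for sampling the same two numbers repeatedly so a slow climb over 24 hours would show up. It assumes Python is installed on the server (that is an assumption, not something confirmed in this thread) and that the kernel exposes `MemAvailable` in `/proc/meminfo`. The function names are made up for illustration.

```python
import os


def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields (values in kB where the kernel reports kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB" (or a bare number).
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_percent(fields):
    """Percentage of RAM in use, computed from MemTotal and MemAvailable."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def sample():
    """Read current RAM usage (percent) and the 1-minute load average."""
    with open("/proc/meminfo") as f:
        fields = parse_meminfo(f.read())
    load1, _, _ = os.getloadavg()
    return used_percent(fields), load1
```

Calling `sample()` once a minute and appending the result to a file would give a rough timeline to compare against the time of the lockup.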

Link to comment

My suggestion would be to buy a pair of 2GB modules and have 6GB of RAM in the system.  Don't unlock that second core until you solve the lockup/crash problem.  (One change/issue at a TIME!!!)

 

You might also connect a monitor to the server and see if there is anything on the screen after the server has an issue.  Take a picture to post up.  (Make sure that it is in focus, not blurred from camera shake, and that there isn't a reflection from any light sources that conceals information!)

Link to comment

Just waiting until I can run out and pick up some more RAM

 

The system has become unresponsive and I won't have a chance to physically check on it for a day or two.

Here is a capture (I had to remote desktop to the running telnet session and copy it over, so sorry if the format is not correct. I did have to trim some of the beginning of the syslog as it was over 320KB). It looks like a memory issue to me.

 

If that solves it, I may as well grab a better processor as well - any recommendations?

 

syslog_capture.txt
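
For anyone wanting to check a saved capture like this one for the memory problem mentioned above, a rough sketch follows. It only looks for the phrases "out of memory" and "oom-killer", which is how the Linux kernel normally words these messages. The pattern and function name are my own choices for illustration, and a hit is a hint to read the surrounding lines, not proof of the cause.

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer runs.
OOM_PATTERN = re.compile(r"out of memory|oom-killer", re.IGNORECASE)


def find_oom_lines(syslog_text):
    """Return (line_number, line) pairs that mention the kernel OOM killer."""
    hits = []
    for number, line in enumerate(syslog_text.splitlines(), start=1):
        if OOM_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits
```

To use it, read the capture with `open("syslog_capture.txt").read()` and pass the text to `find_oom_lines`. The process named on each hit shows what was asking for memory at the time.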

Link to comment

If you had typed diagnostics in that telnet session, the diagnostics file would have been saved to the Flash Drive. (The diagnostics file is zipped so it is much smaller than a straight text file.)  But I suspect that you may not have physical access to the server at this point...

Link to comment

I understand, but unfortunately, as I said before, everything locks up. I cannot access the webGUI, telnet (the existing session is unresponsive: anything entered just produces a new line with no response), a new telnet session, the console, any of the plugins, SMB, etc.
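
Since nothing can be run once the box has locked up, one option is to keep copying the syslog to persistent storage while the server is still alive, so the last lines before the hang survive a reboot. A minimal sketch, assuming Python is installed on the server, that the live log is at `/var/log/syslog`, and that the flash drive is mounted at `/boot`. All three are assumptions to check on your own install, and the function names are only illustrative.

```python
import os
import time


def copy_new_bytes(src_path, dst_path, offset):
    """Append bytes of src_path past `offset` to dst_path and return the new offset."""
    size = os.path.getsize(src_path)
    if size < offset:
        offset = 0  # source was rotated or truncated, so start over
    if size == offset:
        return offset  # nothing new since the last call
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)
        data = src.read()
        dst.write(data)
        dst.flush()
        os.fsync(dst.fileno())  # force the data to the device before a possible hang
    return offset + len(data)


def mirror_forever(src_path, dst_path, interval_seconds=5):
    """Poll src_path and keep dst_path up to date until the process is stopped."""
    offset = 0
    while True:
        offset = copy_new_bytes(src_path, dst_path, offset)
        time.sleep(interval_seconds)
```

Something like `mirror_forever("/var/log/syslog", "/boot/syslog_mirror.txt")` would be the call under those assumptions. Writing to the flash drive this often adds wear, so it is probably best left running only while chasing the crash.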

Link to comment
  • 3 weeks later...
