unRAID Server Release 6.0-beta4-x86_64 Available



I just experienced a failure to stop the array from the emhttp interface.  The web interface hung and couldn't be recovered.

 

I could still log in via SSH, so I captured the tail of the syslog:

Apr 13 01:51:45 Tower emhttp: shcmd (73): /usr/local/sbin/emhttp_event stopping_svcs
Apr 13 01:51:45 Tower kernel: mdcmd (36): nocheck 
Apr 13 01:51:45 Tower kernel: md: nocheck_array: check not active
Apr 13 01:51:45 Tower emhttp_event: stopping_svcs
Apr 13 01:51:45 Tower emhttp: Stop AVAHI...
Apr 13 01:51:45 Tower emhttp: shcmd (74): /etc/rc.d/rc.avahidaemon stop |& logger
Apr 13 01:51:45 Tower logger: Stopping Avahi mDNS/DNS-SD Daemon: stopped
Apr 13 01:51:45 Tower avahi-daemon[2720]: Got SIGTERM, quitting.
Apr 13 01:51:45 Tower avahi-dnsconfd[2728]: read(): EOF
Apr 13 01:51:45 Tower avahi-daemon[2720]: Leaving mDNS multicast group on interface br0.IPv4 with address 10.2.0.100.
Apr 13 01:51:45 Tower avahi-daemon[2720]: avahi-daemon 0.6.31 exiting.
Apr 13 01:51:45 Tower emhttp: shcmd (75): /etc/rc.d/rc.avahidnsconfd stop |& logger
Apr 13 01:51:45 Tower logger: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Apr 13 01:51:45 Tower emhttp: shcmd (76): ps axc | grep -q rpc.mountd
Apr 13 01:51:45 Tower emhttp: Stop NFS...
Apr 13 01:51:45 Tower emhttp: shcmd (77): /etc/rc.d/rc.nfsd stop |& logger
Apr 13 01:51:45 Tower rpc.mountd[3213]: Caught signal 15, un-registering and exiting.
Apr 13 01:51:46 Tower emhttp: Stop SMB...
Apr 13 01:51:46 Tower emhttp: shcmd (78): /etc/rc.d/rc.samba stop |& logger
Apr 13 01:51:46 Tower kernel: lockd: couldn't shutdown host module for net ffffffff81686ec0!
Apr 13 01:51:46 Tower kernel: nfsd: last server has exited, flushing export cache
Apr 13 01:51:46 Tower emhttp: shcmd (79): rm /etc/avahi/services/smb.service &> /dev/null
Apr 13 01:51:46 Tower emhttp: Spinning up all drives...
Apr 13 01:51:46 Tower emhttp: shcmd (80): /usr/sbin/hdparm -S0 /dev/sdd &> /dev/null
Apr 13 01:51:46 Tower kernel: mdcmd (37): spinup 0
Apr 13 01:51:46 Tower kernel: mdcmd (38): spinup 1
Apr 13 01:51:46 Tower kernel: mdcmd (39): spinup 2
Apr 13 01:51:46 Tower kernel: mdcmd (40): spinup 3
Apr 13 01:51:46 Tower kernel: mdcmd (41): spinup 4
Apr 13 01:51:46 Tower kernel: mdcmd (42): spinup 5
Apr 13 01:51:47 Tower emhttp: Sync filesystems...
Apr 13 01:51:47 Tower emhttp: shcmd (81): sync
Apr 13 01:51:56 Tower emhttp: shcmd (82): /usr/local/sbin/emhttp_event unmounting_disks
Apr 13 01:51:56 Tower emhttp_event: unmounting_disks
Apr 13 01:51:56 Tower rc.fan_speed: WARNING: fan_speed called to stop with SERVICE not = disabled
Apr 13 01:51:56 Tower rc.unRAID[8121][8122]: Processing /etc/rc.d/rc.unRAID.d/ kill scripts.
Apr 13 01:51:56 Tower rc.unRAID[8121][8126]: Running: "/etc/rc.d/rc.unRAID.d/K00.sh"
Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Shutting down domain 1
Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Shutting down domain 2
Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Waiting for 2 domains
Apr 13 01:52:05 Tower kernel: br0: port 2(vif1.0) entered disabled state
Apr 13 01:52:05 Tower rc.unRAID[8121][8129]: Domain 1 has been shut down, reason code 0
Apr 13 01:52:05 Tower kernel: br0: port 2(vif1.0) entered disabled state
Apr 13 01:52:05 Tower kernel: device vif1.0 left promiscuous mode
Apr 13 01:52:05 Tower kernel: br0: port 2(vif1.0) entered disabled state
Apr 13 01:52:05 Tower logger: /etc/xen/scripts/vif-bridge: offline type_if=vif XENBUS_PATH=backend/vif/1/0
Apr 13 01:52:05 Tower logger: /etc/xen/scripts/vif-bridge: brctl delif br0 vif1.0 failed
Apr 13 01:52:05 Tower logger: /etc/xen/scripts/vif-bridge: ifconfig vif1.0 down failed
Apr 13 01:52:05 Tower logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif1.0, bridge br0.

 

I then initiated 'shutdown now' from the IPMI console.  This took me to single-user mode - I logged in and, as far as I could tell, all disks were still mounted but not accessible over NFS.  I issued a powerdown, which reported powerdown v2.06 and then hung.  No further interaction was possible, so I then powered down via IPMI.
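
(For anyone else who hits this: before resorting to the power button, it's worth copying the syslog to the flash drive so it survives the forced power-off. A minimal sketch, assuming the stock unRAID log location and that SSH is still alive:)

# copy the full syslog to the flash drive (/boot), which persists across a hard power-off
cp /var/log/syslog /boot/syslog-$(date +%Y%m%d-%H%M%S).txt
sync   # make sure the copy actually reaches the flash before pulling power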

 

Edit to add:

As might be expected, a parity check was automatically started on reboot.

Link to comment

Apr 13 01:51:56 Tower rc.unRAID[8121][8122]: Processing /etc/rc.d/rc.unRAID.d/ kill scripts.

Apr 13 01:51:56 Tower rc.unRAID[8121][8126]: Running: "/etc/rc.d/rc.unRAID.d/K00.sh"

Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Shutting down domain 1

Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Shutting down domain 2

Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Waiting for 2 domains

Apr 13 01:52:05 Tower kernel: br0: port 2(vif1.0) entered disabled state

Apr 13 01:52:05 Tower rc.unRAID[8121][8129]: Domain 1 has been shut down, reason code 0

 

It doesn't look like domain 2 shut down.
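
(If it happens again, a quick way to confirm which domains are actually still running - assuming the standard xl toolstack in this beta - is:)

# list every running domain with its ID and state
xl list
# politely ask a stuck domain to shut down again, by ID or name
xl shutdown 2
# and, as a last resort only, force it off
# xl destroy 2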

Link to comment

Just reporting a possibly strange behavior - a parity check initiated after upgrade to 6.0-beta4:

 

I upgraded an unRAID Basic install from 5.0.5 to 6.0-beta4.

- Stopped the array and shut down from the web interface (syslog attached).

- Formatted a new key with a stock 6.0-beta4 install.

- Copied over the config from the 5.0.5 key backup.

- Booted into non-Xen mode, and a parity check had started.

 

Not sure why it's checking parity - did I do something wrong?

 

I was running the powerdown script that was released before all the recent updates, and it always worked fine.  I had a few plugins and extras in 5.0.5, but nothing crazy; v6 was stock/barebones.

 

Thanks!

 

Edit: it's entirely possible this has nothing to do with the beta.  If you think so, let me know and I'll take the discussion elsewhere...

syslog-20140412-173334.zip

Link to comment

Apr 13 01:51:56 Tower rc.unRAID[8121][8122]: Processing /etc/rc.d/rc.unRAID.d/ kill scripts.

Apr 13 01:51:56 Tower rc.unRAID[8121][8126]: Running: "/etc/rc.d/rc.unRAID.d/K00.sh"

Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Shutting down domain 1

Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Shutting down domain 2

Apr 13 01:51:56 Tower rc.unRAID[8121][8129]: Waiting for 2 domains

Apr 13 01:52:05 Tower kernel: br0: port 2(vif1.0) entered disabled state

Apr 13 01:52:05 Tower rc.unRAID[8121][8129]: Domain 1 has been shut down, reason code 0

 

It doesn't look like domain 2 shut down.

 

Indeed, but I wonder why?  Dom2 is a second ArchVM with nothing added except I've been trying to experiment with xfce4/vnc.

 

Edit:

Actually, I'm not completely sure that is true!  I've just noticed that the Dom numbers don't follow the order in which the VMs are registered with the Extension Manager.  Is the domain number deterministic?
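
(For what it's worth, Xen hands out domain IDs in the order the domains are created on each boot, so they aren't tied to the registration order and can change between boots.  A quick way to check the current name-to-ID mapping, assuming the standard xl toolstack - the domain name below is just an example:)

# show the numeric ID currently assigned to each named domain
xl list
# or query one domain by name
xl domid ArchVM2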

 

Edit 2:

I've now examined the journals of the two VMs.  It seems that Dom2 is the first VM I registered, and it is running several services - of particular interest in this case is deluged.  From the journal, I learned that deluged refused to shut down.

 

Now, the system has already shut down at least 20 times under power-fail conditions.  Why was this different?  I can only guess: my desktop machine was running a deluge front end, connected to the daemon.  Under power-fail conditions, the desktop machine is set up to power down a minute before the unRAID server, so the deluge FE will always stop first.  In this case there was no power failure, so the desktop machine was still running and the deluge FE was still connected - I presume that this is what caused the back end to refuse to die and, therefore, to hold the 'Torrents' share open.

 

So, not an unRAID problem - I need to examine this from the deluge end, and it may be that a "killall -9 deluged" is going to be necessary.
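
(Something along these lines in the guest's own shutdown path might do it - a rough, untested sketch, with the service name assumed to be the stock Arch deluged unit:)

# run inside the ArchVM (or via ssh to it) before the domain is shut down
systemctl stop deluged                                # ask it nicely first
sleep 10                                              # give the front end time to let go
pgrep -x deluged > /dev/null && killall -9 deluged    # last resort if it is still holding the share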

Link to comment

Just reporting a possibly strange behavior - a parity check initiated after upgrade to 6.0-beta4:

 

...

 

You only included a v5.0.5 syslog, so I cannot tell anything about how well v6.0-beta4 booted.  There appears to be a good shutdown from v5.0.5, so there should not have been any problems on the subsequent boot.  There is, however, a small oddity in this v5 syslog, in the following two lines (probably not important, but worth mentioning):

Apr 12 17:33:28 barad-dur emhttp: mdcmd: write: Invalid argument

... and

Apr 12 17:33:29 barad-dur emhttp: mdcmd: write: Invalid argument

There are no clues as to what write was attempted, and no clue as to where.  I suppose it's possible this was a write to the super.dat file on the flash drive, which could potentially cause a parity check, but no proof here at all, and seems very unlikely.

Link to comment

Just reporting a possibly strange behavior - a parity check initiated after upgrade to 6.0-beta4:

 

...

You only included a v5.0.5 syslog, so I cannot tell anything about how well v6.0-beta4 booted.  There appears to be a good shutdown from v5.0.5, so there should not have been any problems on the subsequent boot.  There is, however, a small oddity in this v5 syslog, in the following two lines (probably not important, but worth mentioning):

Apr 12 17:33:28 barad-dur emhttp: mdcmd: write: Invalid argument

... and

Apr 12 17:33:29 barad-dur emhttp: mdcmd: write: Invalid argument

There are no clues as to what write was attempted, and no clue as to where.  I suppose it's possible this was a write to the super.dat file on the flash drive, which could potentially cause a parity check, but no proof here at all, and seems very unlikely.

 

Good point on the start-up log; I should have known better.  Of course, I don't have it because I didn't think to save it, and I haven't gotten powerdown up and running yet on 6b4, so that's not good...  The best I can do is revert to my 5.0.5 backup and test whether I can make the same behaviour occur again.  I might try it later this week if I have time, as I'm on the road till Wednesday at least.
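
(Reverting is just a matter of putting the old config back on the key - a rough sketch, with placeholder paths, assuming the backup is a straight copy of the flash's config folder:)

# keep a copy of the current beta4 config first, then restore the 5.0.5 one
cp -a /boot/config /boot/config-6b4-backup
cp -a /path/to/505-backup/config/. /boot/config/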

 

Thanks for the help!

Link to comment

Twice now, when I took a VM (Win8) without any PCI passthrough, shut it down, uncommented the passthrough line, and started the VM back up, the system became 100% unresponsive.  A running tail of the syslog on the console showed nothing, and the console was unresponsive.  The reset button didn't even work; I had to hold down the power button.

 

The system starts up fine (though the first time it took a while replaying transactions), and the Win8 VM with the PCI passthrough started up without issue.
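
(For reference, the 'passthrough line' in question is the usual pci entry in the domain's xl cfg file - the device address here is only an example:)

# commented out = no passthrough; uncommenting it is the step that preceded the hang described above
pci = [ '0000:01:00.0' ]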

Link to comment

Running smoothly for now. I really like the new Xen manager :) It would be awesome to implement xl top in the same fashion.

 

One question, though: is the issue with NFS fixed in this beta? I won't have time to try before next week...

 

I'll answer myself: it's not fixed... I keep getting "Download registered as completed, but hash check returned unfinished chunks." in rTorrent, and similar issues with CouchPotato that I didn't have with SMB...

 

Do you think that it can be fixed with NFSv4? Any news on this topic?
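
(In the meantime, one client-side thing to experiment with is mounting with attribute caching disabled - no guarantee it helps with this particular problem, just a sketch with example share and mount-point names:)

# on the client running rTorrent/CouchPotato
mount -t nfs -o vers=3,hard,intr,actimeo=0 tower:/mnt/user/Downloads /mnt/downloads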

Link to comment

Twice now, when I took a VM (Win8) without any PCI passthrough, shut it down, uncommented the passthrough line, and started the VM back up, the system became 100% unresponsive.  A running tail of the syslog on the console showed nothing, and the console was unresponsive.  The reset button didn't even work; I had to hold down the power button.

 

The system starts up fine (though the first time it took a while replaying transactions), and the Win8 VM with the PCI passthrough started up without issue.

 

This seems to be something tied to PCI passthrough. Whenever I've had to force-destroy a Win8 VM that has PCI passthrough in use because it would not terminate, starting it up again would cause a total system hang requiring the power button to shut down; the reset button was unresponsive.

 

I'm pretty sure this has nothing to do with unRAID or how it implements Xen; it's more likely something with Xen itself. Mentioning it in case someone else runs into the same issue.

Link to comment

I'm not sure whether to post here or in the Xen area.

 

My unRAID server suddenly became unreachable through the network.

 

I'm running 6.0-beta4 on the hardware detailed in my .sig.  I'm running three virtual machines - two of IronicBadger's ArchVM pre-rolled images, and another running WinXP.  This XP machine was installed new yesterday.

 

One of the ArchVM machines is running a number of services - MySQL (MariaDB), Deluged, minidlna, LogitechMediaServer.

 

Dom0 is running a few plugins: apcupsd, tftp-hpa, fan_speed, dovecot and mpop.

 

The other two VMs are virtually dormant.  I did have a vnc connection open to the WinXP machine.  The second ArchVM has xfce loaded with tigervnc server - there were no connections to this.

 

I have been running beta4 for more than two weeks, and the first ArchVM for almost as long.  tftp-hpa has been running for more than a week -  and the other plugins have been running since day 1 with beta4, and for many months on v5.0.

 

I did have ssh sessions open to Dom0 and the two ArchVMs.

 

There was an XBMC media server playing a video from a user share.

 

I was playing with the Dom0, via ssh, trying to enable the session name to appear in the title bar of my Gnome terminal.

 

All of a sudden, everything stopped - XBMC stopped playing, my Squeezeboxen blanked, my deluge transfers froze etc.  I was unable to access any shares from my Ubuntu desktop, and all the ssh sessions froze.  I couldn't get a response from the Tower emhttp interface.  Tower would respond to pings, but not any of the VMs.

 

I went to IPMI and was able to capture the attached screen image.  I was aware that fan_speed was still active because I could hear the fans speeding up - the drives must have been getting warm(er).

 

I didn't have the presence of mind to attempt any interaction via IPMI!  :-[

 

I did a soft reset from IPMI and everything came back up as normal, with a parity check running.

 

Does anyone have any idea what may have caused this?

TowerHang.jpg

Link to comment

Hi Pieter, I had a similar experience where all of a sudden all the VMs were inaccessible and I couldn't connect to the webUI; I could only ping Dom0 and ended up doing a hard reset. I now have a tail of the syslog running on another machine, just in case it happens again. Oh, and I have no plugins running at all.
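
(The remote tail is easy to set up from the second machine - something like this, with the hostname assumed; tee keeps a copy on disk so the last lines survive even if the server locks up mid-stream:)

ssh root@tower 'tail -F /var/log/syslog' | tee tower-syslog.txt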

 

 

Edit: sorry, I forgot to mention that SSH was also dead, so the only response was a ping to Dom0. I apologise for the lack of concrete evidence and realise this post is fairly useless as far as diagnosing the issue goes; I hope to have more detail if it happens again.

 

Quick thought: could it be caused by the workaround Peter and I have put in place for the DHCP delay issue caused by STP?

 

Link to comment

I too had the same problem, couldn't get logs because it had totally frozen.

The "xen: vector 0x2 is not implemented" line is what I remember from the console.

Had to hard reset to get it back running.

I thought maybe it was overheating (being in a closet), so I moved it to a corner in my living room and blew some compressed air to clean the vents and fans.

Since then, no more freezes/hangups...

Had no issues with versions 4 and 5 for a couple of years.

Link to comment

 

I had the same issue and posted a thread about it, but got no feedback. Like binhex, I had to do a hard reset and got no logs. It remains a mystery.

 

 

Sent from my iPhone using Tapatalk

 

Did you have the dhcp delay workaround in place by any chance too?

Come to think of it, I did. What I'm not positive about is whether I had just switched to Tom's method or not.

 

 

Sent from my iPhone using Tapatalk

Link to comment

Rebuilt my unRAID server today; removed some smaller drives (anything under 2TB, aside from my one smaller cache drive).  Noticed a marked increase in my parity rebuild speed (went from around 50 MB/s to 95 MB/s).  I'm wondering if this is due to going from 11 drives down to 8, using a faster drive for parity, or not using the onboard ports (only the ones on my MV8 card).

 

Got 2 VMs running on it currently (both Fedora 20-based, one running Plex, the other MediaBrowser).

 

My system consists of:

AMD FX-8320 8-core CPU

24GB RAM

1x 4TB WD Red, 1x 3TB WD Red, 1x 500GB WD VelociRaptor, and 5x 2TB WD Greens.

 

I have the 2 VMs set up with 6 vCPUs each, and each has 8GB of RAM dedicated (overkill; I'll most likely lower it at some point).  Plex is running without any issues; I have a few hiccups with MB, but it's in beta, so it's most likely on their end.

 

Thus far I've only seen one issue with b4: when I added my 2 domains to the UI, the names had spaces in them, which caused issues with the display in the Xen section.
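
(The easy workaround is to use names without spaces in each domain's cfg - for example, with a made-up name:)

# in the domain's xl cfg file
name = "fedora20-plex"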

 

 

 

Link to comment
