The whole server freezes


thany


Sorry, I forgot to mention that I've switched to AVG antivirus. I made sure to clear all traces of Avast.

 

Btw, I've found a syslog.txt ;D

I'll attach it just to be sure. But from what I can read, it doesn't look like anything wack is happening right before the crash.

 

One thing that looks off to me is that my VM called Programmatica2 (weird name, don't ask) is constantly using 10-15% CPU time on the host, but I never see that much sustained CPU use from within the VM. Could it be some misconfiguration that somehow throws off the entire host OS?

 

/edit

Also possibly helpful to know: inside that VM I'm running VeraCrypt with HW-accelerated AES encryption. HW acceleration is working fine, but who knows, it might help to know this. The VM also uses iSCSI, but that's just networking.

 

/edit2

Let's attach the file as well :)

syslog.zip

Link to comment

Decided to perform a parity check:

 

Last check completed on Thursday, 19 January 2017, 17:52 (today), finding 1095 errors.

 

Is this a problem, or is it just the result of the last crash?

 

I've also found some "interesting" warnings. One of my VMs has this in its log file:

2017-01-19 16:02:18.405+0000: starting up libvirt version: 1.3.1, qemu version: 2.5.1, hostname: unraid
LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ QEMU_AUDIO_DRV=none /usr/local/sbin/qemu -name Programmatica2 -S -machine pc-q35-2.5,accel=kvm,usb=off,mem-merge=off -cpu host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vendor_id=none -drive file=/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/etc/libvirt/qemu/nvram/8adbf5cd-39ea-9b99-c27e-f38a48c8a5bf_VARS-pure-efi.fd,if=pflash,format=raw,unit=1 -m 4096 -realtime mlock=off -smp 3,sockets=1,cores=3,threads=1 -uuid 8adbf5cd-39ea-9b99-c27e-f38a48c8a5bf -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-Programmatica2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-hpet -no-shutdown -boot strict=on -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x7.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x7 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x7.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x7.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x2 -drive file=/mnt/user/domains/Programmatica2/vdisk1.img,format=raw,if=none,id=drive-virtio-disk2,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x3,drive=drive-virtio-disk2,id=virtio-disk2,bootindex=1 -netdev tap,fd=21,id=hostnet0,vhost=on,vhostfd=22 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:cd:70:c9,bus=pci.2,addr=0x1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-Programmatica2/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0 -vnc 
0.0.0.0:0,websocket=5700 -k en-us -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pcie.0,addr=0x1 -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x4 -msg timestamp=on
Domain id=1 is tainted: high-privileges
Domain id=1 is tainted: host-cpu
char device redirected to /dev/pts/0 (label charserial0)
((null):3221): SpiceWorker-Warning **: red_worker.c:163:rendering_incorrect: rendering incorrect from now on: get_drawable
((null):3221): SpiceWorker-Warning **: red_worker.c:163:rendering_incorrect: rendering incorrect from now on: failed to get_drawable
ehci warning: guest updated active QH

 

That's my Win10 VM. Especially "Domain id=1 is tainted" and "rendering incorrect from now on" don't look good. My other VM running Linux only has the domain-tainted-warning.

 

Furthermore, in Tools -> Syslog, I found some questionable entries:

Jan 19 17:02:05 unraid kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 19 17:02:05 unraid kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT2._GTF] (Node ffff88040d850398), AE_NOT_FOUND (20150930/psparse-542)
Jan 19 17:02:05 unraid kernel: ata3.00: supports DRM functions and may not be fully accessible
Jan 19 17:02:05 unraid kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT3._GTF] (Node ffff88040d850410), AE_NOT_FOUND (20150930/psparse-542)
Jan 19 17:02:05 unraid kernel: ata4.00: supports DRM functions and may not be fully accessible
Jan 19 17:02:05 unraid kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff88040d8502a8), AE_NOT_FOUND (20150930/psparse-542)
Jan 19 17:02:05 unraid kernel: ata1.00: supports DRM functions and may not be fully accessible
Jan 19 17:02:05 unraid kernel: ata1.00: ATA-10: Crucial_CT275MX300SSD1,         1641143839B5,  M0CR031, max UDMA/133
Jan 19 17:02:05 unraid kernel: ata1.00: 537234768 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jan 19 17:02:05 unraid kernel: ata3.00: ATA-10: Crucial_CT275MX300SSD1,         1641143833A0,  M0CR031, max UDMA/133
Jan 19 17:02:05 unraid kernel: ata3.00: 537234768 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jan 19 17:02:05 unraid kernel: ata4.00: ATA-10: Crucial_CT275MX300SSD1,         164114383BE1,  M0CR031, max UDMA/133
Jan 19 17:02:05 unraid kernel: ata4.00: 537234768 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Jan 19 17:02:05 unraid kernel: ata2: SATA link down (SStatus 0 SControl 300)
Jan 19 17:02:05 unraid kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff88040d8502a8), AE_NOT_FOUND (20150930/psparse-542)
Jan 19 17:02:05 unraid kernel: ata1.00: supports DRM functions and may not be fully accessible
Jan 19 17:02:05 unraid kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT2._GTF] (Node ffff88040d850398), AE_NOT_FOUND (20150930/psparse-542)
Jan 19 17:02:05 unraid kernel: ata3.00: supports DRM functions and may not be fully accessible
Jan 19 17:02:05 unraid kernel: ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150930/psargs-359)
Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT3._GTF] (Node ffff88040d850410), AE_NOT_FOUND (20150930/psparse-542)
Jan 19 17:02:05 unraid kernel: ata4.00: supports DRM functions and may not be fully accessible
Jan 19 17:02:05 unraid kernel: ata1.00: configured for UDMA/133
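
To see how many distinct ACPI methods are actually failing in an excerpt like the one above, rather than eyeballing the repeats, a small tally can help. This is only a minimal sketch: the function name `count_failed_acpi_methods` and the sample list are made up for illustration, and the regular expression assumes the message wording stays exactly as in the lines pasted here.

```python
import re
from collections import Counter
from typing import Iterable

# Assumes the kernel message format seen in the excerpt above, e.g.
#   "ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT2._GTF] (Node ...)"
FAILED_METHOD = re.compile(r"Method parse/execution failed \[([^\]]+)\]")


def count_failed_acpi_methods(lines: Iterable[str]) -> Counter:
    """Count how often each ACPI method path appears in a failure message."""
    counts = Counter()
    for line in lines:
        match = FAILED_METHOD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the syslog excerpt (hypothetical selection).
sample = [
    r"Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT2._GTF] (Node ffff88040d850398), AE_NOT_FOUND (20150930/psparse-542)",
    r"Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT3._GTF] (Node ffff88040d850410), AE_NOT_FOUND (20150930/psparse-542)",
    r"Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff88040d8502a8), AE_NOT_FOUND (20150930/psparse-542)",
    r"Jan 19 17:02:05 unraid kernel: ata2: SATA link down (SStatus 0 SControl 300)",
    r"Jan 19 17:02:05 unraid kernel: ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff88040d8502a8), AE_NOT_FOUND (20150930/psparse-542)",
]

failures = count_failed_acpi_methods(sample)
for method, count in sorted(failures.items()):
    print(f"{method}: {count}")
```

Run against the full syslog, a tally like this would show whether the errors are limited to the same few `_GTF` methods on the SATA ports or whether other methods are involved too.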

 

And

 

Jan 19 17:48:32 unraid kernel: md: recovery thread: P corrected, sector=440961576
Jan 19 17:48:32 unraid kernel: md: recovery thread: P corrected, sector=440961584
Jan 19 17:48:32 unraid kernel: md: recovery thread: P corrected, sector=440961592

Times a whole lot.
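
For anyone wanting to compare these log lines with the error count shown by the parity check, here is a minimal sketch that counts the "P corrected" entries and reports the lowest and highest sector touched. The names `summarize_parity_corrections` and `ParitySummary` are invented for this example, and the pattern assumes the message format is exactly what appears in the three lines above.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Assumes lines shaped like:
#   "... kernel: md: recovery thread: P corrected, sector=440961576"
P_CORRECTED = re.compile(r"md: recovery thread: P corrected, sector=(\d+)")


class ParitySummary(NamedTuple):
    count: int
    first_sector: Optional[int]
    last_sector: Optional[int]


def summarize_parity_corrections(lines: Iterable[str]) -> ParitySummary:
    """Count parity corrections and report the lowest and highest sector."""
    sectors = []
    for line in lines:
        match = P_CORRECTED.search(line)
        if match:
            sectors.append(int(match.group(1)))
    if not sectors:
        return ParitySummary(0, None, None)
    return ParitySummary(len(sectors), min(sectors), max(sectors))


# The three lines quoted above.
sample = [
    "Jan 19 17:48:32 unraid kernel: md: recovery thread: P corrected, sector=440961576",
    "Jan 19 17:48:32 unraid kernel: md: recovery thread: P corrected, sector=440961584",
    "Jan 19 17:48:32 unraid kernel: md: recovery thread: P corrected, sector=440961592",
]

summary = summarize_parity_corrections(sample)
print(summary)
```

Feeding it the whole syslog (for example, the lines of the attached file) instead of `sample` would give a total to hold against the number reported after the check, assuming the log was not truncated or rotated in between.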

 

Could any of this be causing my crashes? Or should we be looking totally elsewhere?

Link to comment

Just found this topic on the same issue where between the lines I seem to read that AVG is causing similar problems. Can anyone confirm this?

 

Surely, I must not be the only person running antivirus inside a VM, right? :-\

 

Alternatively, the cause of these crashes could still be somewhere else of course. The question is where. And more importantly, how to diagnose?

 

For the record, I am running Kaspersky inside my Windows 10 VM without issue. Never tried Avast or AVG inside a VM though...

Link to comment

Are you able to leave the VM off for any period to see if that makes things more stable?

I'd rather not for too long, but in a little while I'll be able to do without it for maybe a week or so. I could try that if the problem isn't resolved by other means by then.

 

In the meantime, I've tried another thing: I uninstalled and completely wiped AVG on this VM and reverted to Defender. Surely Defender must work without issues, since it's a default Windows component. Let's see how that goes. So far, so good: no crash or freeze for a little under 4 days. But I'm not yet convinced, given the random nature of these crashes.

Link to comment
  • 2 weeks later...

I had the same problem with Avast: it froze my whole Unraid server 3 times before I understood what the problem was. It was my first try of Unraid after a XenServer setup, because I wanted Unraid's easier hardware passthrough... So 3 months ago I set up the server a 4th time and used AVG for the Windows 7 VM, and it worked without any problems.

But now AVG was uninstalled and reinstalled, and the same thing happens with the newest AVG version (AVG Free Version 1606)... It looks to me like this problem has been known for more than a year, and there is still no fix, or at least a note or warning, from Lime Tech regarding this issue? The biggest problem is that after freezing my server, this issue freezes my whole network. With Avast the network stops working immediately; today with AVG the network stopped working about 10 minutes after the freeze occurred...

I haven't bought Unraid yet. I will search for the rest of the day for a solution, but if I can't find one on this forum I will switch to a more "usable", "free" and, most importantly, STABLE solution like Xen or vSphere!

Link to comment

Windows 10 has Windows Defender (one of the better AV programs) built in, so running any additional AV on Windows 10 is totally pointless and counter-productive. 

 

In other words, uninstall all that shit.  That crap shouldn't even install on Windows 10, it's totally pointless.

 

 

Anecdote time: When I used to work in a mom'n'pop PC shop (back when those were a thing!), I used to get machines in that were "running slowly".  Now, these were fast machines for the day, ones I'd built myself.  Pentium 3 CPUs, plenty of RAM, fastest 7200rpm drives we could get.  Booted up the machine and some of them had no less than 5 different antivirus programs running, including Norton's.  No wonder the machine was slow, every file being accessed was being scanned at least 5 times...

Link to comment

Windows 10 has Windows Defender (one of the better AV programs) built in,...  ...That crap shouldn't even install on Windows 10, it's totally pointless.

If you practice "skeptical computing" you may be right, but it's still wrong to recommend it. There are plenty of tests I've read that place Windows Defender at the bottom of the list of all competitors! What if I already own a paid copy of a security suite? It should be possible to use it, right? And the problem isn't the antivirus, it's the implementation of KVM: Lime Tech needs to disable nested virtualization. See this post on the Avast forums; here's an excerpt:

I also confirmed that disabling nested virtualization makes it work. Following guidelines from here https://kashyapc.com/2012/01/14/nested-virtualization-with-kvm-intel/
Link to comment

Incompatibilities such as the one being alluded to here require careful study, and jumping to conclusions is easy to do. You documented your case pretty well until you said that your server froze and that caused your network to stop working. This seems particularly unlikely. Immediately after reading that, the snarky thought "did it also cause it to start raining" popped into my head. And then you advocate another technology (Xen) which was tried here and proved problematic.

These parting comments tend to weaken your argument IMHO. There are people here who know the VM technology well and are particularly helpful in assisting others. There may be some issues with certain antivirus products in VMs. LimeTech has extremely limited ability (besides raising tickets on the affected components) to effect a fix. To support that you'd need to provide logs, screenshots, and A-B scenario tests that help point to the problem, and share those in a coherent manner. Or search the net for people with similar problems whose reported findings still point to a common denominator / component.

 

Threats to move to another platform don't really work here. The most active forum members are community members, and receive zero compensation from LimeTech. We just like the product and enjoy participating in this forum.

 

So please try to continue the discussion without resorting to unrealistic cause/effect with no documented evidence and threats to take your business elsewhere. Stick to the facts and try to gather evidence to aid in isolation and hopefully resolution of this issue.

 

The other thing I'll mention is that LimeTech is very good about releasing new versions of the OS and other integrated components. You might monitor those to see if the nature of the problem changes with different versions.

 

Best of luck.

Link to comment

Nested Virtualization IS disabled by default in the 6.3.1 release.
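
To check this on a given host, the KVM kernel modules on Linux expose a `nested` parameter under `/sys/module/kvm_intel/parameters/` (or `kvm_amd` on AMD). Below is a minimal sketch that reads it. The helper names are made up for illustration, and it assumes the Unraid release in question exposes the parameter the same way a stock Linux kernel does, which has not been verified here.

```python
from pathlib import Path
from typing import Dict, Optional

# Kernel modules that carry a "nested" parameter on Intel and AMD hosts.
KVM_MODULES = ("kvm_intel", "kvm_amd")


def parse_nested_flag(text: str) -> Optional[bool]:
    """Interpret the parameter value; kernels report either Y/N or 1/0."""
    value = text.strip().upper()
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    return None  # unexpected content, leave it undecided


def nested_virtualization_state(sysfs_root: str = "/sys") -> Dict[str, Optional[bool]]:
    """Return the nested flag for each loaded KVM module found under sysfs_root."""
    state = {}
    for module in KVM_MODULES:
        param = Path(sysfs_root) / "module" / module / "parameters" / "nested"
        if param.is_file():
            state[module] = parse_nested_flag(param.read_text())
    return state


if __name__ == "__main__":
    found = nested_virtualization_state()
    if not found:
        print("No KVM module with a 'nested' parameter found")
    for module, enabled in found.items():
        print(f"{module}: nested={enabled}")
```

An empty result only means that neither module exposes the parameter on that machine; it says nothing about what caused the freezes.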
Link to comment

Best of luck.

Thanks! I'm sorry for leaving the reasonable discussion for a bit. There is no point in this.

No, you're still running two AV programs at the same time. But, clearly you're correct, I'm wrong.  Even though it's you that has a problem running multiple AV programs on a VM that's crashing horribly, and I have no such problems with my VMs crashing because I'm only running a single AV program. :o

 

After searching and reading the whole day (it's 2:36pm in Austria now), I guess I understand that the problem is KVM's nested virtualization. Referring to the last posts here, this is already disabled in 6.3.1, but I'm using 6.2.4. Is it normal that you have to update manually? And how did I miss that there was an update? Doesn't Unraid recommend new updates after they are released?

Could I try the latest version of Unraid on my server with the expired license, or do I have to set up the whole machine again!? Finally, I would point out that I'm a native German speaker. My English isn't that bad, but I guess this could cause some misunderstandings [apropos the rain ;)]

Link to comment

You can request an extension of your trial license. I'm sure it would not be a problem. Sorry, I'm not able to look up the link at this time, but it should be easy to find, or someone else may be able to provide a link to the instructions.

 

Good luck. Wishing you a sunny day!

Link to comment

If you check for updates on the plugins page it should tell you about the update. Also, it is a good idea to read the release thread when updating. And reading the forum in general is also worthwhile in my opinion.

 

There are stickies at the top of the Presales subforum about licensing.

 

Link to comment

So everything is back to working as intended! Here is what I did:

I rebooted the frozen Unraid (6.2.4) and checked for BIOS updates during the reboot, but it's up to date. After boot I clicked on "Expired" at the top left and followed the procedure to renew the trial license for an additional 14 days. Then I ran "Check for Updates" under the "Plugins" tab and ran the update to "UnRaid 6.3.1", completed with another reboot. After that the server started properly (the array was fine, by the way), all Dockers ran, and the problematic VM was off because autorun was disabled. I started the VM and it worked: it booted to the Windows 7 desktop right away, and the previously updated AVG (Free 1606) antivirus was and is still running!

After all the trouble I'm very happy now. I intend to uninstall AVG and try my Avast Internet Security suite next week, and I'm optimistic it will work! After that I intend to buy a Plus license and support Lime Tech and its community that way. Thanks again for the critical, constructive and fast responses from all of you.  :-*

Link to comment

8) (sun shining)

Link to comment
