Lake-end Posted August 25, 2016

Background: I have been using unRAID for a few months now. I'm on Basic with a couple of HDDs and a couple of SSDs. The SSDs are BTRFS with RAID0 for files and RAID1 for metadata. I run my main OS (Windows 10) as a VM on it as a daily driver. I also have Plex running as a Docker image, USB controller passthrough and the coretemp plugin; that is about the extent of my "customization" from vanilla unRAID 6.1.9.

My home network consists of an LTE modem and an ADSL modem connected to a load-balancing router, to which my unRAID machine connects directly. unRAID is mapped to a static IP.

What happens: This only seems to happen when I am actively using the Win10 VM, i.e. I am doing something on the computer. I have never noticed this behavior when it is sitting idle during nights etc. The Windows OS will freeze, straight up lock up, reacting to nothing. When I switch my monitor input to show the unRAID terminal, it won't accept keyboard input, nor is the "tail -f /var/log/syslog" output moving, although the cursor in the bottom left is still blinking. Since the network is completely down and it won't accept keyboard input, I am not actually sure whether the whole of unRAID has locked up or just the VM.

EDIT: Considering that pushing the power button doesn't shut the system down like it should, it seems the whole computer has locked up, unRAID and all. Normally if I press the power button, unRAID powers off nicely.

Now what is really peculiar is that when this happens, it also takes my home network down with it. I have no frigging clue how this is even possible, but indeed when my VM locks up, all other devices on the network lose their internet connection. At first I thought the cause must be an internet/network outage that somehow crashes the VM; that makes a lot more sense, right? But no: if I reboot all my network gear and try to get the connection working on other devices, it stays down until I disconnect the unRAID computer's network cable. Weird, huh? I have no clue how a locked-up unRAID could cause the network to die and keep it dead until I either remove the network cable or hard reset the unRAID computer. You might say this has just been a coincidence, but it has happened at least 3-4 times already.

I think there are two different things here, which are related but both need to be solved:

1) Why is the Windows 10 VM locking up?
2) Why does the VM or unRAID manage to kill my network?

For 1), I have a slight hunch that it might be related to memory consumption. I have a total of 16 GB of RAM in the machine, of which 12 GB is allocated to the Windows 10 VM and the rest is free for unRAID itself. The last time this lockup happened, I was playing the new Deus Ex game, which eats a fair bit of memory, and the lockup happened when I opened my Chrome browser. My Chrome has something like 30 open tabs and will consume around 6 GB of RAM when fully loaded. So I have a feeling I might have gone over that 12 GB, but I assume Windows 10 with default settings should be able to use the pagefile or something, so honestly I don't know why it would lock up.

For 2), I don't have any ideas, and I can't begin to understand how the unRAID box could kill my network. It seems to nuke all active devices in such a way that their connection gets interrupted and they can't get a working connection no matter whether I reboot them or the network gear. The only thing that seems to help is either pulling the network cable from unRAID or hard resetting it; after that, rebooting the network gear gives me a working network on all devices.

If someone smarter than me could float ideas on what is causing this and how 2) is even possible, I would REALLY appreciate it.

EDIT: Attached the diagnostics zip, let me know if something else would be helpful.

megathron-diagnostics-20160824-2330.zip
lionelhutz Posted August 25, 2016

Just saw this and can only respond quickly. I've had the same problem without running any VMs, so the VM isn't the issue. Is your system Skylake based?
lionelhutz Posted August 30, 2016

I disabled the C-states in the BIOS and my server has run without locking up for 9 days so far. The previous best was about 3-4 days before I would find it locked up. I'm not sure what impact this has on energy use, but I expect little for me since I have some dockers that are always working. I'm going to try to get the power meter on it soon to see if it does make a difference.
ijuarez Posted August 30, 2016

What is your networking scheme? This also sounds like a broadcast storm.
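One rough way to test the broadcast storm idea is to watch how fast the received-packet counter climbs on another Linux machine on the same LAN. The sketch below is only an illustration: it assumes Python is available on that machine, it uses made-up sample text in the layout of /proc/net/dev, and the counter it reads covers all received packets, not just broadcasts, so a very high rate is a hint rather than proof.

```python
def parse_rx_packets(proc_net_dev_text):
    """Map interface name -> received packet count from /proc/net/dev style text."""
    counts = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        iface, fields = line.split(":", 1)
        values = fields.split()
        counts[iface.strip()] = int(values[1])  # second receive column is "packets"
    return counts


def rx_packets_per_second(before, after, seconds):
    """Received packets per second for every interface present in both samples."""
    return {
        iface: (after[iface] - before[iface]) / seconds
        for iface in after
        if iface in before
    }


# Hypothetical samples taken 5 seconds apart (values invented for illustration).
HEADER = (
    "Inter-|   Receive                                   |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast|bytes packets\n"
)
before_text = HEADER + "  eth0: 500000 4000 0 0 0 0 0 120 300000 2500 0 0 0 0 0 0\n"
after_text = HEADER + "  eth0: 9500000 104000 0 0 0 0 0 150 300500 2510 0 0 0 0 0 0\n"

rates = rx_packets_per_second(
    parse_rx_packets(before_text), parse_rx_packets(after_text), 5
)
for iface, rate in rates.items():
    print(f"{iface}: {rate:.0f} packets/s received")
```

On a real machine the two samples would come from reading /proc/net/dev twice with a pause in between. What counts as an abnormal rate depends on the network, so compare against a reading taken while everything works.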
DarkKnight Posted August 31, 2016

ijuarez wrote: "What is your networking scheme? This also sounds like a broadcast storm."

I second this. Look at your network switch. Is it going ape-s**t? A couple of possible causes are two DHCP servers issuing IPs on the same subnet, or conflicting IPs. If your VMs are using dynamically assigned IPs and you suspend them, it's possible that the IP that was on that VM is getting reassigned, and when you wake the VM there is a conflict because it doesn't ask for a new IP. The next time it happens, do an 'ipconfig /release' and 'ipconfig /renew' on the VM, or just reboot it. See if that resolves the issue. If it does, stop suspending the VM and/or assign it a static IP.
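To make the conflicting-IP idea concrete, here is a minimal Python sketch that flags any IP address claimed by more than one MAC address. The function name and the address list are hypothetical; in practice the pairs would have to be collected by hand from each device's network settings or from the router's client list.

```python
from collections import defaultdict


def find_ip_conflicts(entries):
    """Return {ip: sorted MACs} for every IP that is claimed by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in entries:
        macs_by_ip[ip].add(mac.lower())  # normalise case so one MAC is not counted twice
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical (IP, MAC) observations, invented for illustration.
observed = [
    ("192.168.1.10", "52:54:00:AA:BB:01"),  # e.g. a VM that kept its old lease
    ("192.168.1.20", "00:11:22:33:44:55"),
    ("192.168.1.10", "00:11:22:33:44:66"),  # another device handed the same IP
]

for ip, macs in find_ip_conflicts(observed).items():
    print(f"{ip} is claimed by: {', '.join(macs)}")
```

An empty result does not rule out a conflict, since it only covers the devices that were listed.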
FrozenGamer Posted September 15, 2016

I have had something similar happen a good number of times. I don't run any VMs though. At one point it was happening because of a duplicated IP address. At this point, maybe once every 3 to 6 months I have to unplug the server to get any computers on the network to have internet access. Usually a reboot, or several reboots, solves the problem, and sometimes the server itself is having an unrelated crash (but not all of the time).
Lake-end Posted September 29, 2016 Author

lionelhutz wrote: "Just saw this and can only respond quickly. I've had the same problem without running any VM's so the VM isn't the issue. Is your system Skylake based?"

Sorry for the late reply. I do indeed have a Skylake, a 6700K. Do you also happen to have a Skylake CPU?

DarkKnight wrote: "I second this. Look at your network switch. Is it going ape-s**t? A couple possible causes are two DHCP servers issuing IPs on the same subnet or conflicting IPs. If your VMs are using dynamically assigned IPs, and you suspend them, it's possible that the IP that was on that VM is getting reassigned and when you wake the VM there is a conflict because it doesn't ask for another new IP. The next time it happens, do a 'ipconfig /release' and 'ipconfig /renew' on the VM, or just reboot it. See if that resolves the issue. If it does, stop suspending the VM and/or assign it a static IP."

The VM freezes, i.e. locks up, totally. Also, like I said earlier, unfortunately I cannot do anything on unRAID itself. Maybe I need to have a second keyboard (PS/2?) on the machine and see if I can get anything done on it. Remember, the network dies so I cannot SSH into it, and the Windows VM has claimed the keyboard. But then again, when this lockup happens, pushing the power button doesn't shut the system down like it should. In a normal situation unRAID shuts down nicely and the system powers off; in this situation nothing happens and I need to hold the power button down for it to power off.
Lake-end Posted September 29, 2016 Author

lionelhutz wrote: "I disabled the C-states in the bios and my server has run without locking up for 9 days so far. Previous best was about 3-4 days before I would find it locked up. I'm not sure what impact this has on energy use but I expect little for me since I have some dockers that are always working. I'm going to try and get the power meter on it soon to see if it does make a difference."

Are you still running without lockups?
lionelhutz Posted September 29, 2016

Yes, no issues since disabling the C-states. I'm not sure why, because I never spent any more time trying to find out, but in my case it was definitely a C-state issue.
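For anyone who wants to see which idle states the kernel currently exposes before or after changing the BIOS setting, the following Python sketch reads the standard Linux cpuidle sysfs directory for one CPU. It is a sketch under assumptions: that Python is installed on the server and that the kernel exposes cpuidle under the default path used below. The function name is made up for illustration.

```python
from pathlib import Path


def list_cstates(cpu_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return (state_dir, name, disabled) tuples for the idle states of one CPU."""
    base = Path(cpu_dir)
    states = []
    if not base.is_dir():
        return states  # cpuidle is not exposed at this path on this system
    # Sort numerically so that state10 comes after state2.
    for state in sorted(base.glob("state*"), key=lambda p: int(p.name[5:])):
        name = (state / "name").read_text().strip()
        disable_file = state / "disable"
        disabled = disable_file.exists() and disable_file.read_text().strip() == "1"
        states.append((state.name, name, disabled))
    return states


for state_dir, name, disabled in list_cstates():
    print(state_dir, name, "disabled" if disabled else "enabled")
```

If nothing is printed, the directory was not found. That alone does not show whether the BIOS setting took effect, so treat the output as one data point.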
Sirc124 Posted November 22, 2016

Just checking in: have there been any crashes since disabling the C-states? I'm having the same issue, with the crashes taking down my network.
lionelhutz Posted November 23, 2016

The server ran 24/7 from the end of September until Friday, and then it locked up again. I reset it and it did it again on Saturday. So it appeared to be good but isn't, and I still don't know why. There was a BIOS update for the motherboard, so maybe it'll help.
Lake-end Posted November 23, 2016 Author

Since my last post I haven't had any crashes. I went through my network assigning static IPs to everything and updated my BIOS; I'm not sure which one cured it.