Debian 8.4 VM: Uhhuh. NMI received for unknown reason 20 on CPU

darcysabatino · July 18, 2017

Same issue, running Debian 8. I've tried changing cores, and also tried changing CPU from passthrough to QEMU.

I've also tried changing the CPU governing mode from "Power Save" to "Performance" (using the Tips & Tweaks plugin).

No luck. I keep getting the error no matter what I do. Hope someone has an idea how to fix this!

nicr4wks · August 21, 2017

This has been driving me insane.

I read on another forum that modifying the VM's XML and changing the clock will stop the errors:

<timer name='kvmclock' present='no'/>

I've added this to one of my Debian VMs and it's been running for 2 hours without any NMI errors. Previously they were coming up every couple of minutes.
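A quick way to check that a `<timer name='kvmclock' present='no'/>` line really ended up inside the `<clock>` element is to look at the domain XML libvirt is using. This is only a sketch: the function names are made up for illustration, and it assumes the `<clock>` element has child lines and a separate closing tag rather than being self-closing.

```shell
# Print the <clock>...</clock> section of a libvirt domain XML read from stdin.
# Assumes <clock> is not written as a single self-closing tag.
show_clock_block() {
    sed -n '/<clock/,/<\/clock>/p'
}

# Succeed only if kvmclock is explicitly disabled in the XML read from stdin.
kvmclock_disabled() {
    show_clock_block | grep -q "name='kvmclock' present='no'"
}
```

For example, `virsh dumpxml Debian | show_clock_block`, where `Debian` stands in for whatever the VM is called.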

Edited August 21, 2017 by nicr4wks

Soulive · August 23, 2017

I was having the same issue with a Ryzen host and a Fedora 25 guest for the last couple of days, with symptoms similar to what others in this thread have described. I went into the host BIOS, disabled Cool'n'Quiet, and enabled SR-IOV. In Unraid I switched the CPU scaling governor to Performance. I have not had the issue all day.
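To confirm which scaling governor the host is actually running after a change like this, the cpufreq entries in sysfs can be read directly. A small sketch, with a hypothetical function name; it assumes the host exposes the standard `cpufreq/scaling_governor` files, which many VM guests do not.

```shell
# List the scaling governor of every CPU that exposes cpufreq.
# The optional argument overrides the sysfs root (handy for testing).
list_governors() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue          # skip CPUs without cpufreq
        cpu=${f#"$root"/}                # e.g. cpu0/cpufreq/scaling_governor
        printf '%s: %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
}
```

Running `list_governors` on the Unraid host prints one line per core, so a core still left on a power-saving governor is easy to spot.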

Edited August 23, 2017 by Soulive

themaxxz · September 21, 2017

Hi,

For a few days now I've been running unRAID on bare metal instead of in a VM on ESXi, and unfortunately I'm seeing the same kind of messages.

Message from syslogd@host at Sep 19 21:07:24 ...
kernel:Uhhuh. NMI received for unknown reason 21 on CPU 0.

Message from syslogd@host at Sep 19 21:07:24 ...
kernel:Do you have a strange power saving mode enabled?

Message from syslogd@host at Sep 19 21:07:24 ...
kernel:Dazed and confused, but trying to continue

These messages seem to start exactly 10 minutes after a 'warm' reboot of a CentOS 7 VM. If I do a cold reboot, I do not get the messages.

I've already tried the following changes to the VM, but none helped.

- setting acpi_pcm=off
- nmi_watchdog=0
- changing pc-q35-2.7 to pc-q35-2

unRAID hardware is:

- Supermicro X9SCM-F
- Xeon E3 1230
- 32GB ECC
- 2x M1015 in IT mode

nicr4wks · September 22, 2017

4 hours ago, themaxxz said:

Hi,

For a few days now I've been running unRAID on bare metal instead of in a VM on ESXi, and unfortunately I'm seeing the same kind of messages.

On 8/21/2017 at 4:35 PM, nicr4wks said:

I read on another forum that modifying the VM's XML and changing the clock will stop the errors:

<timer name='kvmclock' present='no'/>

I've added this to one of my Debian VMs and it's been running for 2 hours without any NMI errors. Previously they were coming up every couple of minutes.

themaxxz · September 22, 2017

8 hours ago, nicr4wks said:

Thanks for the tip, but I just tested this solution and it does not work for me.

Just to be sure that everybody understands what I mean by 'warm' and 'cold' reboot:

A 'warm' reboot is when I do a 'reboot' from inside the OS, without stopping the VM container.

A 'cold' reboot is the first boot after starting the VM container.

So I only have the issue when I 'warm' reboot my VM.

The messages start once the system has been up for about 588 seconds and then repeat about every 30 seconds.

I do think the issue is related to the 'clock' or 'timer', because I can reproduce it consistently and the messages arrive at a consistent interval, which in my opinion rules out a hardware issue.
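For anyone who wants to compare timing on their own guest, the onset and spacing of the messages can be pulled out of the kernel log. This is a rough sketch with a made-up function name; it assumes the log lines carry the usual `[ seconds.micros]` uptime prefix that `dmesg` prints when kernel timestamps are enabled.

```shell
# Read kernel log lines on stdin and report when the first NMI message
# appeared (seconds since boot) and the gap between later ones.
nmi_intervals() {
    awk '
        /NMI received for unknown reason/ {
            t = $0
            sub(/^\[ */, "", t)      # drop the leading "[" and padding
            sub(/\].*/, "", t)       # drop everything after the timestamp
            t += 0
            if (seen) printf "%.1f\n", t - prev
            else      printf "first at %.1f s\n", t
            prev = t; seen = 1
        }'
}
```

Inside the guest, `dmesg | nmi_intervals` would then show whether the messages begin near the 10-minute mark and recur at a steady interval.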

Also see http://m.blog.itpub.net/35489/viewspace-84530/

"These messages occur because the kernel ignores the Advanced Configuration and Power Interface (ACPI) timer override for the system timer"

So adding to my previous list of already tried and failed tweaks:

- adding boot option clocksource=acpi_pm
- adding boot option acpi=off => results in unbootable VM
- adding <timer name='kvmclock' present='no'/> in kvm.xml
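When testing boot options like these, it helps to confirm inside the guest that the option really reached the kernel and which clocksource is in use afterwards. A minimal sketch; the function names are hypothetical, and the sysfs path is the standard Linux clocksource location.

```shell
# Succeed if the given option appears as a whole word on the kernel command line.
# The optional second argument overrides /proc/cmdline (handy for testing).
has_boot_option() {
    opt=$1
    file=${2:-/proc/cmdline}
    tr ' ' '\n' < "$file" | grep -qxF -- "$opt"
}

# Print the clocksource the running kernel is actually using.
# The optional argument overrides the sysfs path (handy for testing).
current_clocksource() {
    cat "${1:-/sys/devices/system/clocksource/clocksource0/current_clocksource}"
}
```

For example, `has_boot_option clocksource=acpi_pm && current_clocksource` should print `acpi_pm` if the override took effect.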

nicr4wks · October 23, 2017

Can confirm the same thing; apparently I had never rebooted a Linux VM until today.

Adding the kvmclock line stops the errors on a cold-booted VM, but if the VM is rebooted from within the OS (shutdown -r), the errors return on boot.

My workaround is to shut down the guest, then start the VM back up.

darcysabatino · November 6, 2017

On 2017-10-22 at 6:36 PM, nicr4wks said:

Can confirm the same thing; apparently I had never rebooted a Linux VM until today.

Adding the kvmclock line stops the errors on a cold-booted VM, but if the VM is rebooted from within the OS (shutdown -r), the errors return on boot.

My workaround is to shut down the guest, then start the VM back up.

I confirmed this works the same for me too:

- add <timer name='kvmclock' present='no'/> to the XML

- cold boot the VM

- future restarts require a hard shutdown
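The shutdown-then-start routine can be scripted from the host so that every restart is a cold boot. A sketch only, assuming `virsh` is available on the host and the guest responds to a clean shutdown request; the function name and the default timeout are made up for illustration.

```shell
# Shut the guest down cleanly, wait until libvirt reports it as off,
# then start it again so the VM is relaunched (a "cold" boot).
cold_reboot() {
    vm=$1
    timeout=${2:-120}   # seconds to wait for the guest to power off
    virsh shutdown "$vm" || return 1
    waited=0
    while [ "$(virsh domstate "$vm")" != "shut off" ]; do
        [ "$waited" -ge "$timeout" ] && return 1   # give up, guest still running
        sleep 1
        waited=$((waited + 1))
    done
    virsh start "$vm"
}
```

Usage would be something like `cold_reboot Debian`, with `Debian` replaced by the VM's name. If the guest ignores the shutdown request, the function returns non-zero instead of forcing it off.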

chris1259 · January 18, 2018

I had this issue. For a short time I had solved it by assigning the VM a specific number of cores. Sorry to say I don't remember how many or which ones. But the issue is now completely resolved for me, regardless of which cores I select, since I upgraded from v6.3.5 to v6.4.0.
