How do I disable Nested Page Tables?


jude


I am running a Win8.1 guest with GPU passthrough of a GTX 970. The guest has 6 vCPUs from an AMD FX-8350.

 

I have been fighting with occasional audio static, popping, and severe frame rate drops. This has been mostly fixed by changing the timer section of my XML from

 

<clock offset='localtime'>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='yes'/>
  <timer name='hypervclock' present='yes'/>
</clock>

 

to

 

<clock offset='localtime'>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>

 

HPET seems to be mostly responsible for the DPC latency and the "no" setting seems to have significantly helped. If I try to include <timer name='hypervclock' present='yes'/> then the VM will not boot. If I don't include

 

<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>

 

then the frame rates drop to unusable levels and the audio stalls right out.

 

 

 

Over at bbs.archlinux.org, user nbhs posted:

 

https://bbs.archlinux.org/viewtopic.php?pid=1270311#p1270311

 

The single most important thing i found to improve the vm performance on my amd board is to disable nested page tables:

 

echo "options kvm-amd npt=0" > /etc/modprobe.d/kvm-amd.conf

 

These are some other things i did to improve the performance:

-cpu host

Using hugetlbfs instead of transparent huge pages

Set Preemption Model to Voluntary on kernel

Set Timer frequency  1000HZ on kernel

 

The following link is from the same forum and the post shows the difference in frame rates with Nested Page Tables enabled or disabled.

 

https://bbs.archlinux.org/viewtopic.php?pid=1280259#p1280259

 

Can anyone tell me how to try to test these? I am most interested in first testing disabling Nested Page Tables and then if that is successful the other mods.

 

Thanks

 

My current xml

 

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>windows81nvidia</name>
  <uuid>cc411d70-4463-4db7-bf36-d364c0cdaa8b</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>3906250</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='3'/>
    <vcpupin vcpu='4' cpuset='4'/>
    <vcpupin vcpu='5' cpuset='5'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-q35-2.1'>hvm</type>
    <loader>/usr/share/qemu/bios-256k.bin</loader>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
    <kvm>
      <hidden state='on'/>
    </kvm>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='2' cores='6' threads='1'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/mnt/disk/vmdisk/Image Media/Win8.1ProN.qcow2'/>
      <target dev='hda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/disk/vmdisk/Images/en_windows_8_1_n_x64_dvd_2707896.iso'/>
      <target dev='hda' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/disk/vmdisk/Images/virtio-win-0.1-100.iso'/>
      <target dev='hdd' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='3'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:46:29:be'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </interface>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=01:00.1,bus=pcie.0'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=00:12.2,bus=root.1,addr=00.2'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=00:12.0,bus=root.1,addr=00.1,'/>
  </qemu:commandline>
</domain>

 

 

Link to comment

My <clock> section of XML is as follows:

 

  <clock offset='localtime'>
    <timer name='hypervclock'/>
    <timer name='hpet' present='no'/>
  </clock>

 

Also, are you using the latest NVIDIA drivers or 340.52?  The 340.52 drivers were the last NVIDIA release that supported the use of Hyper-V extensions in the VM.  If you use 340.52, you can add this to your <features> section:

 

  <features>
    <acpi/>
    <apic/>
    <hap/>
    <viridian/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
  </features>

 

Before doing anything, add <hap/> to your <features> section in your XML.  That could significantly impact your performance by itself (this turns on hardware assisted paging).

 

As far as nested page tables go, the only way you can try this out is via command line.  Here's what you could try, but I don't have an AMD system, so this is "try at your own risk."

 

1)  Shut down all virtual machines.

2)  Connect to your server via SSH.

3)  Stop the libvirt service with this command:

/etc/rc.d/rc.libvirt stop

4)  Run the command from nbhs's post that you referenced:

echo "options kvm-amd npt=0" > /etc/modprobe.d/kvm-amd.conf

5)  Restart libvirt (the new module settings should load automatically):

/etc/rc.d/rc.libvirt start

6)  Test your VM!
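
To confirm whether the option actually took effect after step 5, a small check along these lines might help. This is only a sketch: the function name `npt_state` is made up for illustration, and it assumes the kvm_amd module exposes its `npt` parameter under /sys/module/kvm_amd/parameters/npt (older kernels print 0/1 there, newer ones N/Y).

```shell
# Report whether kvm_amd currently has nested page tables enabled.
# The parameter file can be passed as an argument so the function can be
# tried against a sample file; with no argument it reads the live sysfs
# entry (assumed path, present only while kvm_amd is loaded).
npt_state() {
    param_file="${1:-/sys/module/kvm_amd/parameters/npt}"
    if [ ! -r "$param_file" ]; then
        echo "unknown (kvm_amd not loaded?)"
        return 1
    fi
    case "$(cat "$param_file")" in
        1|Y|y) echo "enabled" ;;
        0|N|n) echo "disabled" ;;
        *)     echo "unknown" ;;
    esac
}
```

Running `npt_state` with no arguments after restarting libvirt should print "disabled" if the new option was picked up. If it still prints "enabled", the module was probably not reloaded; with all VMs stopped, `modprobe -r kvm_amd` followed by `modprobe kvm_amd` should apply the option, assuming nothing else is holding the module.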

 

These settings will not persist after a reboot, so if this works, you can place this line in the go script (the /boot/config/go file) before the emhttp start line:

 

echo "options kvm-amd npt=0" > /etc/modprobe.d/kvm-amd.conf
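
If you would rather not edit the go file by hand, the insertion could be scripted roughly as follows. This is a sketch under assumptions: the function name `add_npt_to_go` is hypothetical, and it assumes the go file contains a non-comment line mentioning emhttp (if none is found, the line is appended at the end instead). Back up /boot/config/go before trying it.

```shell
# Insert the modprobe.d line into the go file just before the first
# non-comment line that mentions emhttp. Safe to run more than once.
add_npt_to_go() {
    go_file="${1:-/boot/config/go}"
    line='echo "options kvm-amd npt=0" > /etc/modprobe.d/kvm-amd.conf'

    # Nothing to do if the line is already present.
    if grep -qF -- "$line" "$go_file"; then
        return 0
    fi

    tmp_file="$(mktemp)" || return 1
    awk -v new="$line" '
        # Print the new line once, ahead of the emhttp start line.
        !done && /emhttp/ && !/^[[:space:]]*#/ { print new; done = 1 }
        { print }
        # Fallback: no emhttp line found, so append at the end.
        END { if (!done) print new }
    ' "$go_file" > "$tmp_file" && cat "$tmp_file" > "$go_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

Calling `add_npt_to_go` with no arguments edits /boot/config/go; passing a path lets you try it on a copy first.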

 

Please report back what you find from doing this and let us know if this does improve performance for you!

 

As far as the other settings nbhs suggests:

 

Transparent Huge Pages are fine to use; in our testing they showed no performance difference compared with hugetlbfs.

 

We have not tested "preemption model" or "timer frequency" but again, haven't seen a reason to do so since performance in our internal testing has been fine.
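
For anyone who wants to see what their host is currently using before changing anything, the settings can be inspected with something like the sketch below. The function names `thp_mode` and `kernel_tuning_summary` are made up for illustration, and reading the kernel config this way assumes the kernel was built with /proc/config.gz support, which may not be true on every system.

```shell
# Print the active Transparent Huge Pages mode, i.e. the bracketed word
# in a line such as "always [madvise] never".
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Filter a kernel config read on standard input down to the preemption
# model and timer frequency lines.
kernel_tuning_summary() {
    grep -E '^CONFIG_(PREEMPT(_NONE|_VOLUNTARY)?|HZ)='
}
```

Typical use would be `thp_mode` on its own and `zcat /proc/config.gz | kernel_tuning_summary` for the kernel options. Changing the preemption model or timer frequency means rebuilding the kernel, so this only shows the current state.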

Link to comment

Thanks so much for the detailed answers.

 

I am using the latest Nvidia drivers as the GTX 970 and other 2nd gen Maxwell series cards will only work with the newest drivers.

 

I have been really surprised by the effect that changing the timers seems to have on the smoothness of the graphics on my system. I think that it is possible that I have a defective card and that this is what is causing some of my issues with the sound (crackles, pops and sometimes a corresponding drop in frame rates). It is however evident that the timer settings have a huge impact on the frequency and severity of these events. Particularly HPET. On my system this has to be disabled.

 

I am going to carry out some benchmarking this weekend to see how disabling Nested Page Tables affects the system. I will post the results once I have them.

 

I will also check the effect of adding the Hardware Assisted Paging flag.

 

Are there any other tweaks that you know of, or could point me towards, that I could try out? Any examples of XML files for pushing the hardware for gaming?

 

Thanks again

Link to comment

Yes.  I just implemented a number of tweaks to Windows you can make.  The biggest impacts were removing indexing and the page file.  Here is the list:

 

1. Disable system restore points.  These aren't really necessary if you are running in a VM and can take snapshots from the host OS (although the snapshot feature is a WIP, for a gaming VM, who cares about system restore anyway).

 

2. Disable the indexing service.  This does a lot of background disk IO, and most people don't need the high speed search functionality.

 

3. Turn off Windows Features that you don't need.  This won't help disk utilization much, but can greatly improve overall system performance.

 

4. Disable the paging file.  Windows does a lot of unnecessary background paging, which can cause lots of unneeded disk IO.  Be careful doing this, however, as it may cause problems for memory-hungry applications.

 

5. See if you can disable boot-time services you don't need.  Bluetooth, SmartCard, and Adaptive Screen Brightness are all things you probably don't need in a VM environment.
