High DPC Latency - Various Issues with VMs



Hi all,

 

I've been using unRAID for some time now and I think it's fantastic. However, I've been battling with my GPU passthrough VMs for quite a while and I'm ready to throw in the towel! I'm hoping you guys can point me towards a fix. I've scoured this forum and tried all sorts of suggested solutions, but I'm still having issues.

 

These two GPU passthrough VMs have different symptoms, but they are probably related to the same underlying issue, which I suspect is high DPC latency. First, a quick summary of my unRAID server:

 

M/B: Supermicro - X9SRL-F
CPU: Intel® Xeon® CPU E5-2670 0 @ 2.60GHz (16 threads)
HVM: Enabled
IOMMU: Enabled
Cache: 512 kB, 2048 kB, 20480 kB
Memory: 32 GB (max. installable capacity 512 GB)
Network: bond0: fault-tolerance (active-backup), mtu 1500 
 eth0: 1000 Mb/s, full duplex, mtu 1500 
 eth1: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.9.28-unRAID x86_64
unRAID version: 6.3.4
Video Cards: Zotac GeForce GTX 1050 and 1060
Array: 4x 3TB HDD
Cache: 250GB SSD in RAID 0 configuration (VMs reside here)
 
VMs: LibreElec for use as my HTPC and a Windows 10 system for gaming/workstation. I'm passing through my 1050 to the LibreElec VM and my 1060 to my Win 10 VM.
Home Theatre Gear: The unRAID server is connected via HDMI to my Marantz SR7010 AVR. This AVR outputs video via a 35 foot HDMI cable to my media room TV that's about 25 feet away from my server room. The cable is fine since I have a Bell PVR connected to the receiver and it has no issues. Audio is output via a bunch of speaker cables that run out to my 7.1 speaker system. Speaker wire gauge is 14 I believe so it's not an attenuation issue of any sort. As I previously mentioned, other devices aside from my unRAID server do not appear to have any issue.
 
I've applied all of the performance tweaks I'm aware of for optimizing the VMs: I've isolated CPUs from unRAID and pinned specific cores to the VMs. VM latency is much improved, but my issues are not resolved.
 
These are my syslinux configuration file settings:
 
label unRAID OS
  menu default
  kernel /bzimage
  append isolcpus=3,4,5,6,7,11,12,13,14,15 vfio-pci.ids=1912:0014 initrd=/bzroot
 
This isolates all cores except for 0, 1 and 2 (HT pairs 0,8/1,9/2,10). The isolated cores are pinned like so:
 
Win 10 VM:
 
<vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <vcpupin vcpu='6' cpuset='14'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='0-1,8-9'/>
  </cputune>
 
LibreElec VM:
 
<vcpu placement='static'>2</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='3'/>
    <vcpupin vcpu='1' cpuset='11'/>
    <emulatorpin cpuset='0-1,8-9'/>
  </cputune>
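
The isolation and pinning above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the `append` line, the 16-thread count and the pinned CPU numbers are copied from this post, while the helper names (`parse_cpuset`, `isolcpus_from_append`, `pinning_problems`) are made up for illustration and are not part of unRAID or libvirt.

```python
def parse_cpuset(text):
    """Expand a cpuset string such as '0-1,8-9' into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolcpus_from_append(append_line):
    """Return the CPUs named in the isolcpus= argument of a syslinux append line."""
    for token in append_line.split():
        if token.startswith("isolcpus="):
            return parse_cpuset(token.split("=", 1)[1])
    return set()


def pinning_problems(vcpu_pins, isolated):
    """List vCPU pins that sit outside the isolated set or are shared between VMs."""
    problems = []
    owner = {}
    for name, cpuset in vcpu_pins.items():
        for cpu in sorted(parse_cpuset(cpuset)):
            if cpu not in isolated:
                problems.append(f"{name}: CPU {cpu} is not isolated")
            if cpu in owner:
                problems.append(f"CPU {cpu} is pinned by both {owner[cpu]} and {name}")
            else:
                owner[cpu] = name
    return problems


# Values copied from the post above.
APPEND = "append isolcpus=3,4,5,6,7,11,12,13,14,15 vfio-pci.ids=1912:0014 initrd=/bzroot"
TOTAL_THREADS = 16
VCPU_PINS = {"Win 10": "4,5,6,7,12,13,14,15", "LibreElec": "3,11"}

isolated = isolcpus_from_append(APPEND)
host_cpus = sorted(set(range(TOTAL_THREADS)) - isolated)
problems = pinning_problems(VCPU_PINS, isolated)

print("left to the host:", host_cpus)
print("pinning problems:", problems)
```

With the values from this post, the host keeps CPUs 0, 1, 2, 8, 9 and 10, and the check reports no vCPU that is unisolated or shared between the two VMs.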
 
The problem...
 
I believe the root cause of my issues is high DPC latency. I've followed all performance related steps and have cut down my latency quite significantly but I'm still having these issues with my passthrough VMs:
 
LibreElec: Pink/whitish pixelation, most noticeable when the TV is dimmed (idle state). Performance-wise the VM seems to be OK, but this pixelation can pop up during media playback. Source material doesn't seem to matter; it can be from an Exodus stream or from one of my MKV/M2TS Blu-ray files. Please refer to the attached image to get an idea of the pixelation I'm referring to. FYI, I've enabled MSI interrupts. Also keep in mind this is a custom VM, not the Limetech-provided LibreElec template.
 
Win 10: Occasional video stutter plus audio pop/crackle. I've measured high DPC latency, which has been significantly reduced since enabling MSI on the 1060 and pinning CPUs. But while running LatencyMon, I still get DPC spikes of up to 5000 µs from the Nvidia display driver (nvlddmkm.sys), with an average of 1500 µs, on the 1060. I suspect that if I could run LatencyMon against the 1050 in the LibreElec VM, the result would be similar.
 
I'm not sure what else to try. Have I hit a wall with the hardware I'm using? Perhaps it's an issue with my Supermicro board? It seems like it should be fine; the only minor detail is that the PCIe x16 slots are gen 2 rather than the newer gen 3 standard. I doubt this is the problem, since gen 2 bandwidth should still be sufficient for modern video cards. I suspect some sort of hardware-level IRQ/resource issue, though I don't know where to go from there. I'm also not sure what I can do on the Nvidia GPU side to help with this DPC latency issue.
 
Any tips or suggestions you guys can provide would be great!
 
Thanks for checking out my post.
 
 
 

Screenshot 2017-05-15 at 9.35.19 PM.png

Link to comment

Any thoughts guys? I think I've narrowed down my LibreElec VM issue. It doesn't appear to be latency related but more so a refresh rate problem between my AVR and my plasma TV (an older 1080p unit). I dropped the refresh rate down to 30 Hz in the Kodi system settings and the pixelation is gone. If I connect my HDMI cable straight from my 1050 to the TV, though, there is no pixelation at all. Anyway, I have a new 4K screen coming and this AVR is geared towards a 4K display, so I'm hoping that takes care of it. I'll consider this one resolved for the time being.

 

Regarding my Win 10 VM though, I still get audio pops and random stutter. Any suggestions to fix that would be greatly appreciated.

 

Thanks again,


Chris

Link to comment

The snow you are seeing in the LibreElec VM is most likely an HDMI cable problem. Replace it with a good-quality cable, and if you are using 4K resolution, be sure it handles 18 Gbps bandwidth.

 

You should remove core 0, and maybe its thread, from the emulator pin, as unRAID likes to use this one. I'm not sure how much CPU is needed for the emulator, but try one core for each VM, and it should be one that is isolated. I'm speaking about the cores you can choose in the VM template, not physical cores.
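
As a rough illustration of the "isolated emulator core per VM" idea, the sketch below lists the isolated CPUs that are not already taken by a vCPU pin. The CPU numbers are copied from the first post; the helper name `free_isolated_cpus` and the extended isolation set in the second call are purely hypothetical.

```python
def free_isolated_cpus(isolated, vcpu_pins):
    """Return isolated CPUs that no VM has pinned a vCPU to."""
    used = set().union(*vcpu_pins.values())
    return sorted(isolated - used)


# isolcpus list and vCPU pins as posted at the top of the thread.
isolated = {3, 4, 5, 6, 7, 11, 12, 13, 14, 15}
vcpu_pins = {
    "Win 10": {4, 5, 6, 7, 12, 13, 14, 15},
    "LibreElec": {3, 11},
}

# With the posted settings every isolated CPU already carries a vCPU.
print(free_isolated_cpus(isolated, vcpu_pins))  # []

# Hypothetical: if CPUs 2 and 10 were also added to isolcpus,
# they would be available as emulator pins.
print(free_isolated_cpus(isolated | {2, 10}, vcpu_pins))  # [2, 10]
```

In other words, with the current settings an isolated emulator core would have to come either from a longer isolcpus list or from giving up a vCPU.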

Link to comment

Thanks saarg for the response!

 

With regard to the cable, I thought that was the problem as well, but I troubleshot that a while back. If I connect the HDMI cable directly to my GTX 1050, I have no issues. If it's connected to my AVR, which is then connected to my 1050, I see the issue. This is a modern AVR which supports 4K and Dolby Atmos, and I haven't had any issues with it until now. I'm going to wait and see if the issue is totally resolved once I get my new display installed. I suspect it's more on the TV end (refresh rate incompatibility) than anything else. Everything else connected to the AVR appears to work properly.

 

I made sure I bought a good in-wall HDMI cable; it's specified as an HDMI 2.0b cable and supports 18 Gbps bandwidth.

 

I'll try your emulator pin suggestion to see if that helps.

 

Any thoughts as to why my Windows 10 VM would be experiencing high DPC latency? Specifically with the Nvidia display driver?

 

Thanks again,


Chris

Link to comment

I've changed the emulator pin to use CPU 1 and also tried CPU 9, but no change. I also tried the Tips and Tweaks plugin and changed the CPU governor to performance. Still getting audio pop/crackle.

 

I've also tried upping the sound properties to the 24-bit (studio quality) setting, but that had no effect.

 

Bottom line: I'm not sure why the DPC latency continuously spikes to over 1500 µs for the Nvidia display driver. That's got to be the root cause behind this, and I'm not sure what else I can try at this point...

Link to comment

Did you isolate core 1 and 9 before testing?

Are you running the VMs at the same time? If you are, try running only one and see if it helps.

Did you verify that the MSI fix was applied?

Have you tried using DDU to remove the graphics drivers and then reinstalling them?

Is it a clean win 10 install or upgraded?

Edited by saarg
Link to comment

Yes, I did isolate cores 1 and 9. My syslinux config file looks like this: append isolcpus=3,4,5,6,7,11,12,13,14,15 vfio-pci.ids=1912:0014 initrd=/bzroot

 

My Win 10 VM XML config looks like this:

 

<domain type='kvm'>
  <name>WS1</name>
  <uuid>777e0bd2-fb1c-4d3d-f34d-031dd62047af</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>13107200</memory>
  <currentMemory unit='KiB'>13107200</currentMemory>
  <memoryBacking>
    <nosharepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <vcpupin vcpu='6' cpuset='14'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='1'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.7'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/cache/VMs - SSD/WS1/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='nec-xhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:11:97:7b'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes' xvga='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>

 

Not sure if you can see anything that stands out there.
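
One thing that can be checked mechanically is whether the pins in this XML line up with the isolcpus list. The sketch below is only an illustration under stated assumptions: it embeds a trimmed copy of the `<cputune>` section from the XML above rather than reading the real file, uses Python's standard `xml.etree.ElementTree`, and the `expand` helper is a made-up name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <cputune> section from the domain XML above.
DOMAIN_XML = """
<domain type='kvm'>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <vcpupin vcpu='6' cpuset='14'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='1'/>
  </cputune>
</domain>
"""


def expand(cpuset):
    """Expand a cpuset string such as '0-1,8-9' into a set of CPU numbers."""
    cpus = set()
    for part in cpuset.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


# isolcpus value from the syslinux append line quoted in this post.
isolated = expand("3,4,5,6,7,11,12,13,14,15")

root = ET.fromstring(DOMAIN_XML.strip())
vcpu_cpus = set()
for pin in root.findall("./cputune/vcpupin"):
    vcpu_cpus |= expand(pin.get("cpuset"))
emulator_cpus = expand(root.find("./cputune/emulatorpin").get("cpuset"))

print("vCPU pins outside isolcpus:", sorted(vcpu_cpus - isolated))
print("emulator pins outside isolcpus:", sorted(emulator_cpus - isolated))
```

Run against the values posted here, all eight vCPU pins fall inside the isolcpus list, while the emulator pin (CPU 1) does not appear in it.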

 

The audio popping/high DPC latency occurs regardless of whether only the Win 10 VM is running or all of my VMs are.

 

Yes, I made sure the MSI fix was applied by using that MSI util app I found somewhere in this forum.

 

I haven't tried DDU to reinstall the graphics card drivers. I can do this shortly.

 

This was a clean Win 10 install; the VM is only a month or so old.


Thanks again

Edited by swiguy
Link to comment

I went with SeaBIOS since I was having issues getting the VM to display properly via OVMF. I actually tried creating a new Win 10 VM using OVMF (the primary display device was VNC, and the 1060 booted as the secondary), and I had the exact same sound issue.

 

Should there be no real difference between the two BIOS versions?

Link to comment

I'll give that a try... I'm just starting to feel like my hardware combo isn't compatible in some way or another. I would wager a guess that if I took my components out from my Supermicro motherboard and installed them into another, maybe slightly newer motherboard I wouldn't see this issue.

 

Maybe I can use this as an excuse to start my Ryzen build....

Link to comment