Unraid 6.3.5 RX 480 Passthrough Crashes Unraid OS


snowmirage


I'm hoping someone can give me some ideas on what else I can try here.

I'm running an EVGA SR-2 motherboard and trying to pass through an AMD RX 480 to a Windows 10 VM. I'm also passing an NVIDIA GTX 980 to a second Windows 10 VM.

Passing the 980 works. To get there I needed to use the Q35-2.7 machine type and SeaBIOS.

Passing the RX 480, even in that same slot, not only stops the VM from booting but crashes the entire Unraid OS. I can't reach Unraid via the web GUI or SSH, and the local console stops responding to keyboard input, so I have to hard reset the whole system.

I was able to run tail -f /var/log/syslog and saw these messages when it crashes (instantly, as soon as I power on the VM with the RX 480 passed through).



I had to take a picture of these with my cell phone and then type them out here, as the syslog appears to clear itself when the system reboots.

Tower kernel: pcieport 0000:00:07.0: can't find device of ID0000
Tower kernel: DMAR: DRHD: handling fault status reg 100
Tower kernel: pcieport 0000:00:07.0: AER: Uncorrected (Fatal) error received: id=0000

The above messages repeat at least 6 times.
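
For anyone comparing logs: a minimal Python sketch for counting how often each PCIe port reports an AER error, assuming the lines have the same shape as the ones I typed out above. The function name `count_aer_errors` and the regular expression are my own illustration, not part of any Unraid tool, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Matches lines shaped like the one quoted above, e.g.
#   Tower kernel: pcieport 0000:00:07.0: AER: Uncorrected (Fatal) error received: id=0000
# The pattern is an assumption based on that single sample line.
AER_LINE = re.compile(
    r"pcieport (?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AER: (?P<severity>.+?) error received"
)


def count_aer_errors(lines):
    """Count AER reports per (reporting port address, severity)."""
    counts = Counter()
    for line in lines:
        match = AER_LINE.search(line)
        if match:
            counts[(match.group("addr"), match.group("severity"))] += 1
    return counts
```

To use it, pass any iterable of log lines, for example an open file handle on a saved copy of the syslog. The address it reports is the port that raised the error, which is not necessarily the card being passed through.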

I'll attach the diag from the system as well.

What I have tried

Passing through both the RX 480 and its audio device
Passing through only the RX 480 without its audio device
Making sure both the RX 480 and its audio device are the only devices in its IOMMU group (I do not have ACS enabled)
Tried using the same slot as the GTX 980 that did pass through successfully
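
The IOMMU group check above can also be done from the command line. This is a rough Python sketch that reads the standard Linux sysfs layout under /sys/kernel/iommu_groups; the helper names `iommu_groups` and `group_of` are made up for illustration, and the `root` parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU not enabled, or the path does not exist here
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return groups


def group_of(address, groups):
    """Return the group number that contains the PCI address, or None."""
    for number, devices in groups.items():
        if address in devices:
            return number
    return None
```

Calling `group_of("0000:08:00.0", iommu_groups())` returns the group number for that address (the address here is only an example), and looking that number up in the dictionary shows every device sharing the group.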

I don't know what device is being referenced in that crash, and I feel like there should be other errors that I'm not able to see. But it isn't the RX 480 that I'm passing through to the VM.

I feel like I'm very close to having this working. On a previous motherboard it was super easy to pass through any AMD card I wanted, while any NVIDIA card was a nightmare. On this board that has flipped, and only NVIDIA is playing nice.

Any advice would be greatly appreciated.

tower-diagnostics-20170919-1419.zip


Looks like it is a root port.  (middle one below)
 

IOMMU group 3
	[8086:340a] 00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
IOMMU group 4
	[8086:340e] 00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
IOMMU group 5
	[8086:342d] 00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 13)


But it's in its own group with nothing else?


I'll give ACS a try and see if that changes anything.

 


I remembered that at some point while trying to get the GTX 980 to work, I added this to the syslinux config:

 

 vfio_iommu_type1.allow_unsafe_interrupts=1 




I disabled that, and now when I try to start a VM with the RX 480 attached I get the following. But the system didn't crash.
 

Sep 19 15:36:44 Tower kernel: vgaarb: device changed decodes: PCI:0000:08:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
Sep 19 15:36:44 Tower kernel: br0: port 2(vnet0) entered blocking state
Sep 19 15:36:44 Tower kernel: br0: port 2(vnet0) entered disabled state
Sep 19 15:36:44 Tower kernel: device vnet0 entered promiscuous mode
Sep 19 15:36:44 Tower kernel: br0: port 2(vnet0) entered blocking state
Sep 19 15:36:44 Tower kernel: br0: port 2(vnet0) entered forwarding state
Sep 19 15:36:44 Tower kernel: vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
Sep 19 15:36:44 Tower kernel: br0: port 2(vnet0) entered disabled state
Sep 19 15:36:44 Tower kernel: device vnet0 left promiscuous mode
Sep 19 15:36:44 Tower kernel: br0: port 2(vnet0) entered disabled state
Sep 19 15:36:45 Tower kernel: vgaarb: device changed decodes: PCI:0000:08:00.0,olddecodes=io+mem,decodes=io+mem:owns=none
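
To confirm whether the option is actually in effect after a reboot, the current value can be read back from sysfs. A small Python sketch, assuming the vfio_iommu_type1 module is loaded and exposes the parameter at the usual /sys/module path; the function name `unsafe_interrupts_enabled` is hypothetical.

```python
from pathlib import Path

# Standard sysfs location for a loaded module's parameters.
PARAM = "/sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts"


def unsafe_interrupts_enabled(param_path=PARAM):
    """Return True or False for the current setting, or None if it cannot be read."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        return None  # module not loaded, or the path is not present
    # Boolean module parameters normally read back as "Y" or "N".
    return value in ("Y", "1")
```

A result of `False` would match the "No interrupt remapping support" message above, since the kernel then refuses to attach the group on a platform without interrupt remapping.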


It doesn't make sense to me that it's able to pass through an LSI HBA card and a GTX 980, but trying an RX 480 crashes the system... hmm.


Thanks, I suspect I've read through those same hits.

Most of what I was able to find before my initial post seemed to point to either:

A few bugs from back in 2015, which I suspect managed to trickle their way down into the updated versions of Unraid since then,

Or bad implementations / bugs in the motherboard hardware / BIOS, in which case I'm a bit SOL, as the latest BIOS release from EVGA for this board was years ago now.

Thankfully, the 980 I'm passing through seems to be working great. I bit the bullet last night and managed to snag a second one for just over $200 on eBay.

I was really hoping to get that RX 480 working in this crazy build, but I guess that's the price I pay for using older hardware, even if it is one of the most interesting motherboards ever built.

Thanks for the help.

