Call Traces error found by Fix Common Problems


Marv


Hi Everyone,

 

I just scanned my system with Fix Common Problems for the first time today since upgrading to v6.3.1.

Unfortunately I get the following error reported by the plugin:

"Call Traces found on your server" -  "Your server has issued one or more call traces. This could be caused by a Kernel Issue, Bad Memory, etc. You should post your diagnostics and ask for assistance on the unRaid forums."

 

I haven't had any problems with my server so far and don't really know where to start here.

 

So here are my diagnostics. Hopefully someone can take a look at this.

 

unmarv-diagnostics-20170212-1355.zip

Link to comment

Since nobody has responded I'll have a go. The two call traces seem to be due to a page allocation failure, possibly related to KVM. The first thing I'd do is select Memtest from the boot menu and run it for a good long time to make sure your RAM is ok because there's no point in continuing if it's bad. Then I'd try running with no VMs running and the VM service disabled in Settings. Your libvirt log has entries saying that your libreELEC VM is running tainted code so best to make sure the system works well as a basic NAS before re-enabling the complicated stuff.

 

Link to comment

Your libvirt log has entries saying that your libreELEC VM is running tainted code so best to make sure the system works well as a basic NAS before re-enabling the complicated stuff.

Just FYI, since that "tainted" message implies that there is actually something wrong here:

 

....is tainted: High-Privileges means that QEMU is running as root: perfectly normal under unRaid

....is tainted: Host-CPU means that QEMU is passing through the host CPU instead of emulating one. Once again, perfectly normal under unRaid

 

But, everything you suggest is 100% valid.
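
For anyone skimming a libvirt log for these entries, here is a minimal Python sketch that pulls out the flag after "is tainted:" and attaches the explanation given above. The function name `explain_taint` and the `TAINT_NOTES` table are made up for illustration, only the two flags mentioned in this post are covered, and the sample strings are the fragments quoted above rather than complete log lines.

```python
import re

# Explanations taken from this post; any other taint flag is deliberately
# left unexplained rather than guessed at.
TAINT_NOTES = {
    "high-privileges": "QEMU is running as root (normal under unRaid)",
    "host-cpu": "QEMU passes the host CPU through instead of emulating one "
                "(normal under unRaid)",
}


def explain_taint(line):
    """Return (flag, note) for a log line containing 'is tainted:', else None."""
    match = re.search(r"is tainted:\s*([\w-]+)", line, re.IGNORECASE)
    if match is None:
        return None
    flag = match.group(1).lower()
    note = TAINT_NOTES.get(flag, "not covered here; look it up before ignoring it")
    return flag, note


for sample in ("....is tainted: High-Privileges", "....is tainted: Host-CPU"):
    print(explain_taint(sample))
```

Lines without a taint message return `None`, so the function can be run over a whole log file line by line.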

Link to comment


 

Thanks for your reply.

I'll run the Memtest later today and report back.

Link to comment

So I ran the Memtest for 23 hours finding 0 errors.

I also didn't get the reported call traces error anymore.

 

Should I just keep the server running as normal and check the log from time to time and hope it won't come back or is there something else I can do now?

Link to comment

The VM Service was turned on and my LibreELEC VM was running as well for most of the time.

 

What I noticed is that it seems that the LibreELEC VM uses slightly more RAM since upgrading to v6.3.0/1

I assigned 1024MB to the VM and after some time 90-95% is used.

Before upgrading the peak was most of the time around 80%.

 

Is it possible that the error occurred because the VM was running out of memory?

Link to comment

Certainly, if the VM runs out of memory it will be a problem, but I don't know if it will be the same problem as you've been experiencing. However, your problem is one of memory allocation, so they could indeed be connected. There's an easy way to find out - see how it behaves now that you've allocated more memory.

 

Link to comment

Thanks for your support, John.

 

So I've been running with the VM Service disabled and got no errors since.

But I just remembered that the first time when the 'call trace' was issued I had lags while playing a movie with Kodi.

The second time the error came up was also while playing music.

 

So I'll just leave VMs disabled for now and see how it goes.

What do you suggest when I reenable it and the error comes up again while using the LibreELEC vm?

 

Link to comment

If that turns out to be the case then the problem is likely to be VM related. I would give the other VM - SteamOS - a good testing to see if it's affected too. If not, then it fairly conclusively points to LibreELEC. The important thing at this stage is to find out if the server has any issues when running as a basic NAS.

Link to comment

The error hasn't come up yet.

I also enabled the VM service again but without using LibreELEC (SteamOS isn't installed atm).

I won't have time to play around with LibreELEC before Sunday, so I'll just let the server run like this and see what happens when using Kodi again.

 

Link to comment
13 minutes ago, Marv said:

I'm back again :(

 

Only good thing is I'm 99% sure it only happens when my LibreELEC VM is running.

 

Anything else I can do here?

unmarv-diagnostics-20170228-2226.zip

Can you post the output of this command: 

cat /proc/interrupts

The call trace (caused by IRQ 16 being disabled) may be benign in your case, but we can't tell for certain until we see which modules are using that interrupt.
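
If it helps, here is a rough Python sketch of how to pick the handlers for one IRQ out of that output. It assumes the column layout this kernel prints (per-CPU counters, then the interrupt chip, then the trigger type, then a comma-separated list of handlers); older kernels lay the columns out differently. The function name `handlers_by_irq` and the trimmed `SAMPLE` text are only for illustration.

```python
def handlers_by_irq(text):
    """Map each numbered IRQ to the list of handler names sharing it."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())      # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue                    # NMI, LOC, ... are summary rows
        tail = rest.split()[n_cpus:]    # drop the per-CPU counters
        names = " ".join(tail[2:])      # drop chip and trigger columns
        result[int(label)] = [n.strip() for n in names.split(",") if n.strip()]
    return result


# Trimmed illustrative input in the same layout as `cat /proc/interrupts`.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 16:         19          5          0         15  IR-IO-APIC  16-fasteoi   ehci_hcd:usb1
 23:         26          5          0         13  IR-IO-APIC  23-fasteoi   ehci_hcd:usb2
 27:       3354       1209       1072        643  IR-PCI-MSI 327680-edge      xhci_hcd
NMI:          0          0          0         19   Non-maskable interrupts
"""

print(handlers_by_irq(SAMPLE).get(16))
```

On the server itself, the same function can be fed `open("/proc/interrupts").read()` instead of `SAMPLE`. More than one name in the list for IRQ 16 would mean the interrupt is shared between devices.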

Link to comment

I guess I need to run the command after the error has occurred, right?

Because I powered down the server after posting my diagnostics.

I'll have to wait for another 'call trace' then.

 

Is it possible btw that the error has something to do with my Logitech keyboard passed through via USB3.0 to my VM?

Link to comment

The command will still work after the fact


Link to comment

Here you go

 

           CPU0       CPU1       CPU2       CPU3
  0:         28          0          0          0  IR-IO-APIC   2-edge      timer
  1:          2          0          0          0  IR-IO-APIC   1-edge      i8042
  8:          5          0          1          0  IR-IO-APIC   8-edge      rtc0
  9:          0          0          0          0  IR-IO-APIC   9-fasteoi   acpi
 12:          4          0          0          0  IR-IO-APIC  12-edge      i8042
 16:         19          5          0         15  IR-IO-APIC  16-fasteoi   ehci_hcd:usb1
 18:          0          0          0          0  IR-IO-APIC  18-fasteoi   i801_smbus
 23:         26          5          0         13  IR-IO-APIC  23-fasteoi   ehci_hcd:usb2
 24:          0          0          0          0  DMAR-MSI   0-edge      dmar0
 25:          0          0          0          0  DMAR-MSI   1-edge      dmar1
 27:       3354       1209       1072        643  IR-PCI-MSI 327680-edge      xhci_hcd
 28:      13695       5714       2984       1956  IR-PCI-MSI 512000-edge      ahci[0000:00:1f.2]
 29:        681         98         84         67  IR-PCI-MSI 1572864-edge      ahci[0000:03:00.0]
 30:       5145       1417        769        657  IR-PCI-MSI 409600-edge      eth0
 31:       1437        334         95        142  IR-PCI-MSI 442368-edge      vfio-msi[0](0000:00:1b.0)
 32:       3232        268        158        368  IR-PCI-MSI 524288-edge      vfio-msi[0](0000:01:00.0)
NMI:          0          0          0         19   Non-maskable interrupts
LOC:     106321      72012      69739     127602   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0         19   Performance monitoring interrupts
IWI:          0          0          0          0   IRQ work interrupts
RTR:          3          0          0          0   APIC ICR read retries
RES:      12242      10457       8108      12691   Rescheduling interrupts
CAL:       3000       3157       2693       2556   Function call interrupts
TLB:       2600       2698       2124       2128   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
DFR:          0          0          0          0   Deferred Error APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:          2          2          2          2   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0          0          0   Posted-interrupt notification event
PIW:          0          0          0          0   Posted-interrupt wakeup event

 

Link to comment
