Hoopster Posted January 12, 2012
See my last post in this thread on page 5 for why I consider this issue solved.
I have a 4-port x4 PCIe SATA card inserted in the PCIe x16 slot of my mini-ITX MB. Preclears on any drive attached to this controller are reading at about 2.6 MB/s. The MB is a Biostar TH61-ITX. On the MB SATA ports, the pre-read runs at over 100 MB/s.
The PCIe SATA controller is based on the Marvell 88SX7042 chipset. I have tried both the SYBA SY-PEX40048 card and the Rosewill RC-218. They are absolutely identical in every detail and both yield the same results. Both are reported by numerous people as working with unRAID through the sata_mv driver in Linux kernel 2.6 and above.
As it is now, I am limited to a 4-drive array as the MB has only 4 SATA ports. I need the 4 more the controller card offers, but not at this performance level.
All drives, including those attached to the MB ports, are showing up as UDMA/133. I thought that was an IDE standard. All drives are SATA and the MB BIOS is set to AHCI.
In the syslog, the drive ST2000DL003-9VT166_6YD1Q84W (sda) is the one attached to the PCIe controller. The other three are attached to MB SATA ports. I note in the syslog (near the end) that IRQ #16 is being disabled and it appears to have something to do with the SATA drives.
Running unRAID v5b14. Syslog attached.
Any ideas? Is my MB PCIe slot at fault?
Syslog_-_Jan_12_13.doc
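To put the two read speeds in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the ST2000DL003 is a 2 TB drive (inferred from the model number, not stated in the post) and that the pre-read runs at a constant rate over the whole disk, which real drives do not quite do. The function name `pass_hours` is made up for this illustration.

```python
def pass_hours(capacity_bytes, mb_per_s):
    """Hours needed to read the whole disk once at a constant rate (MB = 10**6 bytes)."""
    seconds = capacity_bytes / (mb_per_s * 1e6)
    return seconds / 3600


CAPACITY = 2e12  # assumption: 2 TB drive

fast = pass_hours(CAPACITY, 100)   # rate seen on the motherboard ports
slow = pass_hours(CAPACITY, 2.6)   # rate seen on the PCIe controller

print(f"100 MB/s: {fast:.1f} hours")
print(f"2.6 MB/s: {slow:.0f} hours ({slow / 24:.1f} days)")
```

Under those assumptions a single pre-read pass takes a few hours at 100 MB/s and well over a week at 2.6 MB/s, which is why the controller is unusable in this state.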
dgaschk Posted January 12, 2012
Search for irqpoll in this forum.
Hoopster (Author) Posted January 13, 2012
No joy! I tried adding both irqpoll and irqfixup to syslinux.cfg by modifying the line "append initrd=bzroot rootdelay=10 irqpoll" to try each option. The only difference is abysmal boot times (dare I say, almost Windows-like?) and a noticeable performance lag. Right before the login prompt appears, it says "disabling IRQ #16" regardless of what I add to syslinux.cfg.
Does my syslog indicate anything else I might try?
prostuff1 Posted January 13, 2012
Hoopster said: No joy! I tried adding both irqpoll and irqfixup to syslinux.cfg by modifying the line "append initrd=bzroot rootdelay=10 irqpoll" to try each option. The only difference is abysmal boot times (dare I say, almost Windows-like?) and a noticeable performance lag. Right before the login prompt appears, it says "disabling IRQ #16" regardless of what I add to syslinux.cfg. Does my syslog indicate anything else I might try?
See if you can find an update to the BIOS for your board. This is ultimately the fault of the BIOS for not handling the IRQs quite right.
Hoopster (Author) Posted January 13, 2012
The BIOS on the MB was released June 1, 2011. There is a new BIOS for the MB released December 5, 2011; however, it lists support for 22nm CPUs as the only change. The only method for updating the BIOS seems to be a Windows utility. I guess I'll have to install Windows on one of my spare disks and boot to that to install a new BIOS, although I am not sure this new BIOS has any IRQ-related fixes.
Would I be better off swapping the motherboard for an Intel DQ67EP or DH67CF?
Hoopster (Author) Posted January 13, 2012
Well, the "disabling IRQ #16" issue is definitely the root of the problem. After reading through every post on the forums I could find about IRQ errors, I noticed that one user had set the PCI ROM option in his BIOS to "EFI compatible." Mine had that option plus "legacy," and it was set to "legacy."
Setting it to "EFI compatible" helped for a couple of minutes. The "disabling IRQ #16" message did not appear right before the login prompt when unRAID booted. Through a telnet session, I was able to start a preclear on the disk attached to the PCIe SATA controller and it was pre-reading at >100 MB/s. This is the first time I had ever seen that! I was doing a happy dance!
My joy was short-lived, however. I terminated the telnet session and switched over to the command prompt on the unRAID box (I have a keyboard attached to it and an input on my desktop monitor so I can switch with a button press) only to be greeted by the "disabling IRQ #16" message. Preclear pre-reads are now back to 2.6 MB/s.
Hoopster (Author) Posted January 13, 2012
Hoopster said: No joy! I tried adding both irqpoll and irqfixup to syslinux.cfg by modifying the line "append initrd=bzroot rootdelay=10 irqpoll" to try each option. The only difference is abysmal boot times (dare I say, almost Windows-like?) and a noticeable performance lag. Right before the login prompt appears, it says "disabling IRQ #16" regardless of what I add to syslinux.cfg. Does my syslog indicate anything else I might try?
prostuff1 said: See if you can find an update to the BIOS for your board. This is ultimately the fault of the BIOS for not handling the IRQs quite right.
Well, I installed Windows 7 on one of the disks and downloaded and installed the latest BIOS for the Biostar TH61-ITX MB. I set anything in the BIOS I thought might make a difference and rebooted. Although IRQ #16 was not immediately disabled, it was eventually disabled after 3-4 minutes.
I noticed in the syslog that IRQ #16 seems to be assigned to the NIC as well as all four ports on the PCIe SATA controller (if I am reading it correctly). When IRQ #16 is disabled, I still have network connectivity, but performance on the SATA drives on the controller goes in the tank, as noted in preclear attempts.
Time to give up on this MB?
prostuff1 Posted January 13, 2012
Or contact Biostar for a new BIOS and see if they will send you one that is not on their website.
Hoopster (Author) Posted January 13, 2012
prostuff1 said: Or contact Biostar for a new BIOS and see if they will send you one that is not on their website.
I think Hades may freeze over before I ever get anything from Biostar. Unfortunately, I have tried to contact them three different ways, and they have not responded to any of my inquiries from more than a week ago. I'll try again, but I am not hopeful.
I may have to go with one of the Intel motherboards. At least they have decent support and you can actually speak with someone there, not just fill out an "E Support" form that never gets answered or send an email that ends up in the "do not respond" bin. Biostar "support" is nonexistent.
Hoopster (Author) Posted January 13, 2012
This issue is really bothering me. Just for fun, I booted into Windows, installed the driver for the PCIe SATA controller and updated it to the latest available version I could find (3.6.3). The controller installed with no problems. All disks attached to the controller appeared in Device Manager (unRAID recognizes them as well). The same IRQ (#16) was assigned to it by the BIOS as in unRAID. I checked all devices in Device Manager and there were no IRQ conflicts in Windows and all devices were reported as working properly. I waited 10 minutes and checked them all again. Still no problems.
I shut down Windows and booted into unRAID. Six minutes after booting, I got this in the syslog (with the familiar "disabling IRQ #16" on the monitor):
Jan 13 16:38:22 MediaNAS kernel: irq 16: nobody cared (try booting with the "irqpoll" option) (Errors)
Jan 13 16:38:22 MediaNAS kernel: Pid: 0, comm: swapper Not tainted 3.1.1-unRAID #1 (Errors)
Jan 13 16:38:22 MediaNAS kernel: Call Trace: (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c104fa8c>] __report_bad_irq+0x1f/0x95 (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c104fc39>] note_interrupt+0x137/0x1a8 (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c104e776>] handle_irq_event_percpu+0xef/0x100 (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c1050152>] ? handle_edge_irq+0xcb/0xcb (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c104e7ab>] handle_irq_event+0x24/0x3b (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c1050152>] ? handle_edge_irq+0xcb/0xcb (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c10501bb>] handle_fasteoi_irq+0x69/0x82 (Errors)
Jan 13 16:38:22 MediaNAS kernel: <IRQ> [<c1003566>] ? do_IRQ+0x37/0x90
Jan 13 16:38:22 MediaNAS kernel: [<c130c669>] ? common_interrupt+0x29/0x30 (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c11ddd89>] ? acpi_idle_enter_bm+0x22a/0x25e (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c12734f0>] ? cpuidle_idle_call+0x75/0xbd (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c1001a5f>] ? cpu_idle+0x39/0x5a (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c12fbd40>] ? rest_init+0x58/0x5a (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c145172d>] ? start_kernel+0x28c/0x291 (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c14510b0>] ? i386_start_kernel+0xb0/0xb7 (Errors)
Jan 13 16:38:22 MediaNAS kernel: handlers:
Jan 13 16:38:22 MediaNAS kernel: [<c1243aca>] usb_hcd_irq (Drive related)
Jan 13 16:38:22 MediaNAS kernel: [<f848bf28>] mv_interrupt
Jan 13 16:38:22 MediaNAS kernel: Disabling IRQ #16
Jan 13 16:41:42 MediaNAS kernel: NTFS driver 2.1.30 [Flags: R/W MODULE]. (System)
I am by no means an expert on Linux/unRAID, but as I read through the forums I see that several of us have the same problem, all with different MBs and different SATA controllers with different chipsets. We also all seem to be running unRAID 5 beta.
Again, I don't know enough to be making any declarations, but is there any way this is an unRAID/Linux kernel issue with certain hardware combinations, or do those of us with the problem all just happen to have different combinations of MB BIOS and SATA chipsets that throw these errors?
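The useful part of that trace is the short "handlers:" list, which names the drivers registered on the IRQ at the moment the kernel gave up on it. As a minimal sketch (not an unRAID tool), the Python below pulls that list out of syslog text. The sample is a shortened copy of the lines in the post; the regular expressions assume the exact message layout shown there and may need adjusting for other kernel versions. The function name `disabled_irq_handlers` is hypothetical.

```python
import re

# Shortened sample taken from the syslog excerpt in the post above.
SYSLOG = """\
Jan 13 16:38:22 MediaNAS kernel: irq 16: nobody cared (try booting with the "irqpoll" option) (Errors)
Jan 13 16:38:22 MediaNAS kernel: Call Trace: (Errors)
Jan 13 16:38:22 MediaNAS kernel: [<c104fa8c>] __report_bad_irq+0x1f/0x95 (Errors)
Jan 13 16:38:22 MediaNAS kernel: handlers:
Jan 13 16:38:22 MediaNAS kernel: [<c1243aca>] usb_hcd_irq (Drive related)
Jan 13 16:38:22 MediaNAS kernel: [<f848bf28>] mv_interrupt
Jan 13 16:38:22 MediaNAS kernel: Disabling IRQ #16
"""

NOBODY_CARED = re.compile(r"kernel: irq (\d+): nobody cared")
HANDLER = re.compile(r"kernel: \[<[0-9a-f]+>\] (\w+)\s*(?:\(.*\))?$")
DISABLED = re.compile(r"kernel: Disabling IRQ #(\d+)")


def disabled_irq_handlers(text):
    """Map each disabled IRQ number to the handler names listed before it was disabled."""
    result = {}
    tracking = False      # True once a "nobody cared" line has been seen
    in_handlers = False   # True between "handlers:" and "Disabling IRQ"
    handlers = []
    for line in text.splitlines():
        if NOBODY_CARED.search(line):
            tracking, in_handlers, handlers = True, False, []
            continue
        if not tracking:
            continue
        if line.rstrip().endswith("kernel: handlers:"):
            in_handlers = True
            continue
        done = DISABLED.search(line)
        if done:
            result[int(done.group(1))] = handlers
            tracking, in_handlers = False, False
            continue
        if in_handlers:
            match = HANDLER.search(line)
            if match:
                handlers.append(match.group(1))
    return result


print(disabled_irq_handlers(SYSLOG))
```

For the excerpt above this reports the USB host controller handler and the Marvell SATA handler on IRQ 16, which matches the sharing that shows up later in /proc/interrupts.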
WeeboTech Posted January 14, 2012
Just out of curiosity, do a cat /proc/interrupts and post the results.
Hoopster (Author) Posted January 14, 2012
Results of cat /proc/interrupts attached.
Interrupts.txt
Hoopster (Author) Posted January 14, 2012
Hmmm...it looks like usb1 and sata_mv are sharing IRQ #16. Is that a problem?
WeeboTech Posted January 14, 2012
And here's mine for comparison. Notice how the interrupts are spread. I wonder if something in your BIOS is not enabled. Take a peek at the APIC options and toggle them in the BIOS. I can't tell you this is the answer, but it may lead towards one.
root@atlas ~ #cat /proc/interrupts
            CPU0       CPU1
  0:         30          0   IO-APIC-edge      timer
  1:          1          1   IO-APIC-edge      i8042
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          1          2   IO-APIC-edge      i8042
 16:   13188396   13209542   IO-APIC-fasteoi   uhci_hcd:usb3, arcmsr
 17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 18:        187        187   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5, uhci_hcd:usb8
 22:          0          0   IO-APIC-fasteoi   uhci_hcd:usb7
 23:      10379      10343   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6
 24:          0          0   IO-APIC-fasteoi   sata_sil24
 48:    9947472    9945937   IO-APIC-fasteoi   sata_mv
 52:    9011659    9007260   IO-APIC-fasteoi   sata_mv
 77:  210992701  210972655   PCI-MSI-edge      eth0
 79:   22357730   22368824   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
LOC:  310342449  310344342   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:    2468511   14446408   Rescheduling interrupts
CAL:        384        639   Function call interrupts
TLB:    2473632    2311483   TLB shootdowns
TRM:          0          0   Thermal event interrupts
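Spotting shared lines in that output by eye is easy to get wrong, so here is a small Python sketch that lists every numbered IRQ with more than one device attached. It assumes the column layout of the listing above (one count per CPU, then the controller type, then a comma-separated device list); newer kernels print extra columns, so treat the parsing as illustrative. The sample rows are copied from the listing, and the function name `shared_irqs` is made up for this example.

```python
# A few rows copied from the /proc/interrupts listing above.
SAMPLE = """\
           CPU0       CPU1
 16:   13188396   13209542   IO-APIC-fasteoi   uhci_hcd:usb3, arcmsr
 17:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 18:        187        187   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5, uhci_hcd:usb8
 48:    9947472    9945937   IO-APIC-fasteoi   sata_mv
 77:  210992701  210972655   PCI-MSI-edge      eth0
NMI:          0          0   Non-maskable interrupts
"""


def shared_irqs(text):
    """Return {irq_number: [device names]} for IRQs that list more than one device."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():               # skip NMI, LOC and other summary rows
            continue
        # ncpu counts, one controller-type field, then the device list
        fields = rest.split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue
        devices = [name.strip() for name in fields[ncpu + 1].split(",")]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


for irq, devices in shared_irqs(SAMPLE).items():
    print(irq, devices)
```

On a live system the text would come from reading /proc/interrupts instead of the sample string. Sharing by itself is normal on IO-APIC lines; it only becomes a problem when one of the devices raises interrupts that no registered handler claims.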
Hoopster (Author) Posted January 14, 2012
OK, I just discovered something interesting. If I start a preclear from the command prompt after booting unRAID without accessing the unRAID server from a browser, pre-reads run at over 100 MB/s and keep running with no problems. I had it running successfully for over two hours and IRQ #16 is NEVER disabled.
I also have unMenu and unRAID Web installed. I went to my Windows machine and started unMenu (medianas:8080); immediately, I got the "disabling IRQ #16" message and the pre-read performance tanked to 2.6 MB/s.
Although I had considered it to be a separate issue to be resolved after clearing up this IRQ mess, any time I started unRAID Web (medianas:89), I would get a whole bunch of PHP undefined variable/constant messages on the unRAID console. Shortly thereafter, the "disabling IRQ #16" message would appear (but it may only be when I access unMenu via unRAID Web). I never associated the two until this morning's test.
Apparently, the disabling of IRQ #16 occurs when I access the unRAID box via unMenu (or maybe any web service). What can I look for to correct this?
WeeboTech Posted January 14, 2012
Reads like there is an incompatibility with the network driver. Did you look into any BIOS changes for the APIC? Seems all interrupts are serviced by one core.
Does it all work with an earlier version of unRAID in the 5.x series?
Joe L. Posted January 14, 2012
Hoopster said: OK, I just discovered something interesting. If I start a preclear from the command prompt after booting unRAID without accessing the unRAID server from a browser, pre-reads run at over 100 MB/s and keep running with no problems. I had it running successfully for over two hours and IRQ #16 is NEVER disabled. I also have unMenu and unRAID Web installed. I went to my Windows machine and started unMenu (medianas:8080); immediately, I got the "disabling IRQ #16" message and the pre-read performance tanked to 2.6 MB/s. Although I had considered it to be a separate issue to be resolved after clearing up this IRQ mess, any time I started unRAID Web (medianas:89), I would get a whole bunch of PHP undefined variable/constant messages on the unRAID console. Shortly thereafter, the "disabling IRQ #16" message would appear (but it may only be when I access unMenu via unRAID Web). I never associated the two until this morning's test. Apparently, the disabling of IRQ #16 occurs when I access the unRAID box via unMenu (or maybe any web service). What can I look for to correct this?
Or maybe any network activity over a certain level. Web services do not equate to interrupts, other than that the network card issues them. Perhaps if too many arrive too quickly from your NIC at the chipset on the MB, it loses an interrupt.
If you've already tried a different NIC, I'd look for a different motherboard.
Joe L.
Hoopster (Author) Posted January 14, 2012
I haven't tried any earlier beta versions, but I do need to be running v5 beta because my NIC is the dreaded Realtek 8111E. Installing a PCIe NIC is not an option since I need the one and only PCIe slot for the SATA controller.
I will check the BIOS for any APIC-related options.
Do you have a suggestion for which earlier beta to try? Do I need to do anything special when moving from one v5 beta version to another?
Hoopster (Author) Posted January 14, 2012
Hoopster said: OK, I just discovered something interesting. If I start a preclear from the command prompt after booting unRAID without accessing the unRAID server from a browser, pre-reads run at over 100 MB/s and keep running with no problems. I had it running successfully for over two hours and IRQ #16 is NEVER disabled. I also have unMenu and unRAID Web installed. I went to my Windows machine and started unMenu (medianas:8080); immediately, I got the "disabling IRQ #16" message and the pre-read performance tanked to 2.6 MB/s. Although I had considered it to be a separate issue to be resolved after clearing up this IRQ mess, any time I started unRAID Web (medianas:89), I would get a whole bunch of PHP undefined variable/constant messages on the unRAID console. Shortly thereafter, the "disabling IRQ #16" message would appear (but it may only be when I access unMenu via unRAID Web). I never associated the two until this morning's test. Apparently, the disabling of IRQ #16 occurs when I access the unRAID box via unMenu (or maybe any web service). What can I look for to correct this?
Joe L. said: Or maybe any network activity over a certain level. Web services do not equate to interrupts, other than that the network card issues them. Perhaps if too many arrive too quickly from your NIC at the chipset on the MB, it loses an interrupt. If you've already tried a different NIC, I'd look for a different motherboard.
I am limited on motherboard selection because only a mini-ITX MB fits in my Lian-Li case. I have looked at the Intel DQ67EP as a possible replacement. It has the Intel 82579LM gigabit NIC, but I understand a v5 beta version is required to support this NIC as well, correct?
WeeboTech Posted January 14, 2012
I really dislike the Realtek NICs. I have a mini-ITX CMB-673 and whenever the torrent machine gets really busy, the network goes offline.
If it were my machine, I would experiment with the APIC options in the BIOS or go with the Intel motherboard. I might also consider the Supermicro X7SPA-HF or X7SPA-HF-D525, but that's me. For my file server, I like reliable, tried and true.
I don't have any recommendations on which beta because I'm not running 5.x on any machines. (Can ya believe it?)
Hoopster (Author) Posted January 14, 2012
Before I completely abandon the Biostar MB (although that is likely) and go with the Intel DQ67EP or Supermicro X9SCV-Q or QV4 MB (I need a socket 1155 MB for my i3 2100 CPU), here is a map of any BIOS settings I think I may be able to tweak. There is nothing in the BIOS that is APIC-specific and I don't know enough about many of these to know if a different setting would help or not. The current setting is in []. Any suggestions?
Advanced Settings
  PCI Subsystem
    PCI ROM Priority [EFI Compatible] Legacy Compatible
    PCI Latency Timer [32 Bus Clocks] Various settings between 32 - 248 Clocks
    VGA Palette Snoop [Disabled] Enabled
    No Snoop [Enabled] Disabled
    Max Payload [Auto] Various settings from 128 - 4096 Bytes
    Max Read Requests [Auto] Various settings from 128 - 4096 Bytes
    PCI-E Link Settings
      ASPM Support [Auto] Enabled Disabled
  ACPI Settings
    EuP Support [Enabled] Disabled
    ACPI Sleep [S3] S1 Disabled
    Lock Legacy Resources [Disabled] Enabled
  CPU Config
  SATA Config
    SATA Mode [AHCI] IDE Disabled
    S.M.A.R.T. [Enabled] Disabled
    Aggressive Link Power Management [Enabled] Disabled
  USB Config
    Legacy USB Support [Enabled] Disabled
    USB 3.0 [Enabled] Disabled
    XHCI Handoff [Enabled] Disabled
    EHCI Handoff [Enabled] Disabled
    Mass Storage (lists USB drives)
  Smart Fan
  Super IO
  H/W Monitor
Chipset
  North Bridge
    A bunch of onboard graphics settings
    PCI-E Port [Enabled] Disabled
    Detect Non-Compliant Device [Enabled] Disabled
    PEX16_1 Gen X [Auto] Gen 1 Gen 2 (the PCIe SATA controller is Gen 1, but Auto should cover it)
  South Bridge
    Azalia HD Audio [Disabled] Enabled
    High-Precision Timer [Enabled] Disabled
    USB 1.0/2.0 [Enabled] Disabled
    EHCI 1 [Enabled] Disabled
    EHCI 2 [Enabled] Disabled
  Onboard PCI-E
    Launch PXE opROM [Disabled] Enabled
    Launch Storage opROM [Disabled] Enabled
    Onboard PCI-E Giga LAN [Enabled] Disabled
    Rear USB 3.0 [Enabled] Disabled
    Front USB 3.0 [Disabled] Enabled
Boot
  Gate A20 Active [Upon Request] Enabled Disabled
  Option ROM Messages [Force BIOS] Keep Current
  INT 19 Capture [Disabled] Enabled
  UEFI Boot [Disabled] Enabled
  A bunch of boot priority settings
dgaschk Posted January 15, 2012
Are the drives set to AHCI? Is APIC a typo?
WeeboTech Posted January 15, 2012
dgaschk said: Are the drives set to AHCI? Is APIC a typo?
According to the capture, AHCI is enabled:
SATA Config
SATA Mode [AHCI] IDE Disabled
APIC is not a typo; it stands for Advanced Programmable Interrupt Controller.
http://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller
WeeboTech Posted January 15, 2012
Jan 12 13:46:26 MediaNAS logger: /etc/rc.d/rc.inet1: /sbin/route add -net 127.0.0.0 netmask 255.0.0.0 lo (Network)
Jan 12 13:46:26 MediaNAS logger: /etc/rc.d/rc.inet1: /sbin/ifconfig eth0 192.168.0.191 broadcast 192.168.0.255 netmask 255.255.255.0 (Network)
Jan 12 13:46:26 MediaNAS kernel: r8169 0000:02:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168e-3.fw (-2) (Network)
Jan 12 13:46:26 MediaNAS kernel: r8169 0000:02:00.0: eth0: link down (Network)
Jan 12 13:46:26 MediaNAS kernel: r8169 0000:02:00.0: eth0: link down (Network)
Jan 12 13:46:26 MediaNAS logger: /etc/rc.d/rc.inet1: /sbin/route add default gw 192.168.0.1 metric 1 (Network)
Is this normal for a Realtek on this motherboard?
WeeboTech Posted January 15, 2012
Hoopster said: No joy! I tried adding both irqpoll and irqfixup to syslinux.cfg by modifying the line "append initrd=bzroot rootdelay=10 irqpoll" to try each option. The only difference is abysmal boot times (dare I say, almost Windows-like?) and a noticeable performance lag. Right before the login prompt appears, it says "disabling IRQ #16" regardless of what I add to syslinux.cfg. Does my syslog indicate anything else I might try?
Try any and/or all of these options. Do a search on the net to learn how they work. I think I had to use them on one of my older motherboards, along with irqpoll.
acpi=off
noapic
nolapic
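Kernel options like these go on the "append" line of syslinux.cfg quoted earlier in the thread. The Python sketch below shows one way to add options to that line without duplicating ones already present. Only the append line itself comes from the thread; the surrounding "label" and "kernel" lines in the sample are assumed for illustration, and `add_kernel_options` is a hypothetical helper, not part of unRAID. Whether any of these options helps on this board is exactly what the thread is still trying to find out.

```python
# Sample config: only the "append" line is quoted in the thread;
# the label and kernel lines are assumed for illustration.
SAMPLE_CFG = """\
label unRAID OS
  kernel bzimage
  append initrd=bzroot rootdelay=10 irqpoll
"""


def add_kernel_options(cfg_text, options):
    """Add each option to every 'append' line unless it is already there."""
    out = []
    for line in cfg_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "append":
            missing = [opt for opt in options if opt not in words]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out) + "\n"


updated = add_kernel_options(SAMPLE_CFG, ["irqpoll", "acpi=off", "noapic", "nolapic"])
print(updated)
```

Trying one option per boot, rather than all at once, makes it easier to tell which one changed the behaviour. Keeping a copy of the original syslinux.cfg is also sensible, since some of these options can stop a board from booting cleanly.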