New Hardware w/ 5 Parity Check Errors



I just rebuilt my unRAID server on some new hardware. (I will do a full hardware review in the next few days, because it was an interesting project.) The hardware is:

 

  • Supermicro 4U Server
    • CSE-846A-R1200B Chassis
    • X9DRI-F Motherboard
    • 2x E5-2670 2.6GHz 8-Core 8.0 GT/s / 20MB Smart Cache CPUs
    • 8x 8GB PC3-10600R Server Memory
    • 24x 3.5" Trays
    • SAS2-846EL1 Backplane
    • LSI 9266-8i 1GB Cache with Battery Backup (I got mine working with unRAID)
    • 2x 1200W PSU
    • 11 Array Drives:
      • 2 Parity (4TB)
      • 9 Data
    • 500GB SSD Cache
    • 500GB Unassigned Drive

 

After I got everything up and running, I did a non-correcting parity check. Speeds were fine, but I did get 5 errors which slightly concerned me. Before I moved everything, I did a parity check with my old hardware and everything was fine, so I'm not too concerned, but I figured I would ask the community.

 

I will upload diagnostics soon, but I believe these are the relevant lines in the syslog:

 

Aug 17 00:50:00 unRaid kernel: md: recovery thread: check P Q ...
Aug 17 00:50:00 unRaid kernel: md: using 1536k window, over a total of 3907018532 blocks.
Aug 17 00:50:02 unRaid sSMTP[21501]: Creating SSL connection to host
Aug 17 00:50:03 unRaid sSMTP[21501]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 00:50:05 unRaid sSMTP[21501]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=615
Aug 17 00:50:28 unRaid kernel: docker0: port 3(veth1e2a0ef) entered disabled state
Aug 17 00:50:28 unRaid kernel: veth9aadbfd: renamed from eth0
Aug 17 00:50:29 unRaid kernel: docker0: port 3(veth1e2a0ef) entered disabled state
Aug 17 00:50:29 unRaid kernel: device veth1e2a0ef left promiscuous mode
Aug 17 00:50:29 unRaid kernel: docker0: port 3(veth1e2a0ef) entered disabled state
Aug 17 00:50:29 unRaid kernel: docker0: port 3(veth59177d3) entered blocking state
Aug 17 00:50:29 unRaid kernel: docker0: port 3(veth59177d3) entered disabled state
Aug 17 00:50:29 unRaid kernel: device veth59177d3 entered promiscuous mode
Aug 17 00:50:29 unRaid kernel: docker0: port 3(veth59177d3) entered blocking state
Aug 17 00:50:29 unRaid kernel: docker0: port 3(veth59177d3) entered forwarding state
Aug 17 00:50:29 unRaid kernel: eth0: renamed from vethd759761
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth59177d3) entered disabled state
Aug 17 00:53:43 unRaid kernel: vethd759761: renamed from eth0
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth59177d3) entered disabled state
Aug 17 00:53:43 unRaid kernel: device veth59177d3 left promiscuous mode
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth59177d3) entered disabled state
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth5d084d4) entered blocking state
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth5d084d4) entered disabled state
Aug 17 00:53:43 unRaid kernel: device veth5d084d4 entered promiscuous mode
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth5d084d4) entered blocking state
Aug 17 00:53:43 unRaid kernel: docker0: port 3(veth5d084d4) entered forwarding state
Aug 17 00:53:43 unRaid kernel: eth0: renamed from vethec1c6e2
Aug 17 00:58:32 unRaid kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Aug 17 00:58:42 unRaid kernel: unregister_netdevice: waiting for lo to become free. Usage count = 1
Aug 17 01:31:10 unRaid kernel: md: recovery thread: PQ incorrect, sector=488609960
Aug 17 01:36:01 unRaid sSMTP[12956]: Creating SSL connection to host
Aug 17 01:36:02 unRaid sSMTP[12956]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 01:36:04 unRaid sSMTP[12956]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=665
Aug 17 01:36:04 unRaid sSMTP[13103]: Creating SSL connection to host
Aug 17 01:36:05 unRaid sSMTP[13103]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 01:36:07 unRaid sSMTP[13103]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=646
Aug 17 01:36:07 unRaid sSMTP[13240]: Creating SSL connection to host
Aug 17 01:36:07 unRaid sSMTP[13240]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 01:36:09 unRaid sSMTP[13240]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=646
Aug 17 02:49:42 unRaid kernel: md: recovery thread: PQ incorrect, sector=1383333928
Aug 17 03:01:52 unRaid kernel: md: recovery thread: PQ incorrect, sector=1509687296
Aug 17 03:28:58 unRaid kernel: md: recovery thread: PQ incorrect, sector=1791754264
Aug 17 05:06:02 unRaid sSMTP[10272]: Creating SSL connection to host
Aug 17 05:06:02 unRaid sSMTP[10272]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 05:06:04 unRaid sSMTP[10272]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=646
Aug 17 05:36:02 unRaid sSMTP[10392]: Creating SSL connection to host
Aug 17 05:36:02 unRaid sSMTP[10392]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 05:36:04 unRaid sSMTP[10392]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=646
Aug 17 05:41:07 unRaid kernel: perf: interrupt took too long (3154 > 3133), lowering kernel.perf_event_max_sample_rate to 63000
Aug 17 06:06:25 unRaid root: /etc/libvirt: 926.3 MiB (971321344 bytes) trimmed
Aug 17 06:06:25 unRaid root: /var/lib/docker: 20.2 GiB (21720010752 bytes) trimmed
Aug 17 06:06:25 unRaid root: /mnt/cache: 380.7 GiB (408758845440 bytes) trimmed
Aug 17 08:19:22 unRaid kernel: mdcmd (59): spindown 4
Aug 17 08:19:23 unRaid kernel: mdcmd (60): spindown 5
Aug 17 08:19:24 unRaid kernel: mdcmd (61): spindown 6
Aug 17 10:33:06 unRaid kernel: md: recovery thread: PQ incorrect, sector=5764546632
Aug 17 11:14:01 unRaid kernel: mdcmd (62): spindown 2
Aug 17 11:14:02 unRaid kernel: mdcmd (63): spindown 7
Aug 17 12:01:16 unRaid kernel: mdcmd (64): spindown 6
Aug 17 13:23:48 unRaid kernel: md: sync done. time=45227sec
Aug 17 13:23:48 unRaid kernel: md: recovery thread: completion status: 0
Aug 17 13:24:01 unRaid sSMTP[12469]: Creating SSL connection to host
Aug 17 13:24:02 unRaid sSMTP[12469]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Aug 17 13:24:07 unRaid sSMTP[12469]: Sent mail for [email protected] (221 2.0.0 Bye) uid=0 username=xxx outbytes=697

I can see the errors, the lines with:

unRaid kernel: md: recovery thread: PQ incorrect

I started the parity check around 12:50 AM and it finished at ~1:24 PM. Four of the errors were logged in the first three hours, and the fifth came fairly late in the check, at 10:33 AM.
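
For anyone who wants to pull the same information out of their own syslog, here is a minimal Python sketch that extracts the "PQ incorrect" lines and estimates how far into the check each one occurred. The function names are made up for illustration, and the position estimate assumes that the "blocks" figure in the md line counts 1 KiB blocks (two 512-byte sectors each); treat that as an assumption, not something confirmed in this thread.

```python
import re

# Lines copied from the syslog excerpt posted in this thread.
SYSLOG = """\
Aug 17 00:50:00 unRaid kernel: md: using 1536k window, over a total of 3907018532 blocks.
Aug 17 01:31:10 unRaid kernel: md: recovery thread: PQ incorrect, sector=488609960
Aug 17 02:49:42 unRaid kernel: md: recovery thread: PQ incorrect, sector=1383333928
Aug 17 03:01:52 unRaid kernel: md: recovery thread: PQ incorrect, sector=1509687296
Aug 17 03:28:58 unRaid kernel: md: recovery thread: PQ incorrect, sector=1791754264
Aug 17 10:33:06 unRaid kernel: md: recovery thread: PQ incorrect, sector=5764546632
"""

ERROR_RE = re.compile(r"^(\w+ +\d+ [\d:]+) .*PQ incorrect, sector=(\d+)")
TOTAL_RE = re.compile(r"over a total of (\d+) blocks")


def parity_errors(text):
    """Return (timestamp, sector) pairs for every 'PQ incorrect' line."""
    errors = []
    for line in text.splitlines():
        match = ERROR_RE.match(line)
        if match:
            errors.append((match.group(1), int(match.group(2))))
    return errors


def total_sectors(text):
    """Return the array size in 512-byte sectors, or None if not logged.

    Assumption: the md 'blocks' count is in 1 KiB units (2 sectors each).
    """
    match = TOTAL_RE.search(text)
    return int(match.group(1)) * 2 if match else None


if __name__ == "__main__":
    total = total_sectors(SYSLOG)
    for stamp, sector in parity_errors(SYSLOG):
        print(f"{stamp}  sector {sector}  ({sector / total:.1%} of the way through)")
```

To run it against a real log, replace the `SYSLOG` string with the contents of the syslog file from the diagnostics.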

 

Ideas?

Edited by CrimsonTyphoon

If you had zero parity errors and shut down safely before migrating the disks, then you should have zero parity errors now. Over the years there have been a few occurrences of people having parity errors that they couldn't clear even after running a correcting parity check. They were repeatable, usually a small number and always on the same sectors. I don't remember what the fix was, if there was one. I think it must have ultimately been a controller problem.

 

I don't recall it ever being reported with dual parity.

 

Have you done a memtest?
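
One way to test the "repeatable, always on the same sectors" pattern described above is to compare the sector numbers logged by two separate parity checks. The following is only a sketch: `error_sectors` and `compare_runs` are hypothetical helper names, and the only data in the block is the first check's excerpt from this thread.

```python
import re

SECTOR_RE = re.compile(r"PQ incorrect, sector=(\d+)")

# Error lines from the first check, copied from the syslog in this thread.
RUN1 = """\
Aug 17 01:31:10 unRaid kernel: md: recovery thread: PQ incorrect, sector=488609960
Aug 17 02:49:42 unRaid kernel: md: recovery thread: PQ incorrect, sector=1383333928
Aug 17 03:01:52 unRaid kernel: md: recovery thread: PQ incorrect, sector=1509687296
Aug 17 03:28:58 unRaid kernel: md: recovery thread: PQ incorrect, sector=1791754264
Aug 17 10:33:06 unRaid kernel: md: recovery thread: PQ incorrect, sector=5764546632
"""


def error_sectors(syslog_text):
    """Collect the sector number from every 'PQ incorrect' line."""
    return {int(s) for s in SECTOR_RE.findall(syslog_text)}


def compare_runs(first, second):
    """Split the error sectors of two checks into repeated and one-off sets."""
    a, b = error_sectors(first), error_sectors(second)
    return {
        "repeated": sorted(a & b),      # same sector flagged in both checks
        "only_first": sorted(a - b),    # flagged only in the first check
        "only_second": sorted(b - a),   # flagged only in the second check
    }


if __name__ == "__main__":
    # No second check has been posted yet, so the excerpt is compared with
    # itself here purely to show the shape of the result.
    print(compare_runs(RUN1, RUN1))
```

Pass the syslog text of the next parity check as the second argument. A non-empty "repeated" list would match the recurring-sector pattern, while errors that land on different sectors each time would point more toward an intermittent fault such as memory.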

39 minutes ago, johnnie.black said:

Is the LSI controller working in RAID mode?

 

Yes.

 

Despite the many conflicting reports about the SAS2208, I was able to get it working in JBOD mode by simply executing a command in the CLI. However, I was just informed that the server has crashed with an unresponsive terminal. Thank god I have IPMI and can reboot it manually. Time to pull a new syslog.

 

Will update after new logs.

4 minutes ago, johnnie.black said:

A RAID controller is a possible reason for the 5 parity errors. It may also make any hardware change more problematic; HBAs are preferred.

 

I have an AOC-SAS2LP-MV8 that I was using with my old hardware. I think I'm going to swap it in tonight and see what happens with a new parity check. However, I don't think it will work with more than 8 drives, so I am in a little pickle here... :-/. Both cards have 8087 connectors, so I can simply swap it in and see what happens.

 

I looked at the new syslog; nothing interesting to report, I'm afraid. I'll upload it shortly.
