[SOLVED] HELP: Upgraded Parity - Now Super Slow Writes To Array


DZMM


I've just spent the last couple of days upgrading my parity drive, but now that it's finished I can barely write to the array: in mc I'm getting speeds of 30KB/s if I'm lucky, or the files don't move at all. Even a Fix Common Problems scan is taking forever (a normal one, not an extended test). Stopping/starting dockers takes an age. Collecting diagnostics never completed via the webui, and even via SSH it took around 15 minutes to complete.

 

VMs work fine, I assume because they are on an unassigned disk.

 

I've tried rebooting which didn't help.

 

What I did was:

 

  1. precleared new parity disk
  2. changed config and selected new parity disk and added old one to array
  3. rebuilt parity - tried as much as possible to not do new writes to array

 

Is this just because I've got a new parity drive and I need to give the system time to 'settle'?  When I click on the parity disk to start a SMART extended self-test the page doesn't load, but it does for my other disks.

 

Help please.  My drive specs are in my sig.

highlander-diagnostics-20171017-1047.zip

Edited by DZMM
Link to comment
39 minutes ago, DZMM said:

I just spotted at the bottom of my webui it's stuck saying 'starting services', so something's not loading properly?

 

That happens sometimes and it's no cause for concern, just a display problem.

 

You appear to have an old browser window open spamming your log with wrong csrf tokens. The diagnostics also show some strange activity on your disks; can you post a screenshot of the Main page with the reads/writes toggle set to speed?

 

 

Link to comment
2 minutes ago, johnnie.black said:

 

That happens sometimes and it's no cause for concern, just a display problem.

 

You appear to have an old browser window open spamming your log with wrong csrf tokens. The diagnostics also show some strange activity on your disks; can you post a screenshot of the Main page with the reads/writes toggle set to speed?

 

 

I spotted the csrf errors and killed a browser session I had on my phone and also removed the preclear and advanced buttons plugins just in case.    All was fine and I was getting decent transfers in mc, so I thought I'd reboot just to make sure and it's slowed to a crawl again.

 

At the time of the screenshot I'm trying to move 1TB of files from disk 5 to 6 which is moving at 435KB/s in mc....

main_speed.png

Link to comment

Now it's full of these:

 

Oct 17 12:06:54 Highlander nginx: 2017/10/17 12:06:54 [error] 8091#8091: *9112 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 172.30.12.13, server: , request: "POST /plugins/advanced.buttons/AdvancedButtons.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "172.30.12.2", referrer: "http://172.30.12.2/Shares/Share?name=plex_sync"
Oct 17 12:06:54 Highlander nginx: 2017/10/17 12:06:54 [error] 8091#8091: *8598 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 172.30.12.13, server: , request: "POST /plugins/advanced.buttons/AdvancedButtons.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "172.30.12.2", referrer: "http://172.30.12.2/Shares/Share?name=plex_sync"
Oct 17 12:06:55 Highlander nginx: 2017/10/17 12:06:55 [error] 8091#8091: *9112 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 172.30.12.13, server: , request: "POST /plugins/advanced.buttons/AdvancedButtons.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "172.30.12.2", referrer: "http://172.30.12.2/Shares/Share?name=plex_sync"
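To see at a glance which client and request are behind entries like these, the lines can be tallied by client IP and request path. This is only a rough Python sketch: it assumes the syslog format shown above, and the function name `tally_fastcgi_errors` is made up for illustration.

```python
import re
from collections import Counter

# Pulls the client IP and the request path out of nginx error lines
# in the format shown above (assumed; other nginx configs may differ).
_ERR = re.compile(r'client: (?P<client>[\d.]+),.*?request: "\w+ (?P<path>\S+)')


def tally_fastcgi_errors(lines):
    """Count nginx [error] lines per (client IP, request path)."""
    counts = Counter()
    for line in lines:
        # Skip anything that is not an nginx error entry.
        if "nginx" not in line or "[error]" not in line:
            continue
        match = _ERR.search(line)
        if match:
            counts[(match.group("client"), match.group("path"))] += 1
    return counts
```

Passing it an open log file (for example `tally_fastcgi_errors(open("/var/log/syslog"))`, where the path is an assumption) and printing `most_common(10)` on the result would show the noisiest client first.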

 

Maybe Plex sync? It's doing a parity check, so speed will already be lower; if it's also doing a Plex sync, it's going to be really slow.
 

Link to comment

172.30.12.13 is my wife's laptop, and plex_sync is a share I've bind-mounted to move background syncs off my small drive onto my large array, e.g. when I'm syncing large amounts of videos to plex cloud sync:

 

mount --bind /mnt/user/plex_sync/ "/mnt/cache/appdata/plex/Library/Application Support/Plex Media Server/Cache/Transcode/Sync+"

I think my wife's laptop must have been viewing this share in maybe an old browser session (it mentions advanced buttons again).  Rebooted her laptop and server and all seems ok now.

 

I can't believe a few browser tabs would cause these problems - have to remember to never log in on a PC I can't get access to!

 

Thanks for the help

highlander-diagnostics-20171017-1235.zip

Link to comment

I stopped all dockers that download, move files etc. and the transfers and the parity speed 'shot' up to around 60MB/s.  I then spotted that kodi was scanning music files (about 30k) and when I stopped this it went up to the expected speed of 120MB/s.

 

I then resumed one putty transfer and the parity speed went down to around 1MB/s, while the transfer (disk 4-->5) went at around 20MB/s.  I stopped the transfer and the parity speed went back up to 100+ MB/s.

 

Something's not right, as disk activity never seemed to impact parity checks before.  I'm going to let it finish the parity check before I start writing again, as it's throwing up a lot of errors - 2000 errors and it's only got to 35GB checked.  I'll report back tomorrow with an update on what happens when the errors are corrected, unless you have any ideas now.

 

I've also started the SMART test on all my disks.

Link to comment
13 minutes ago, trurl said:

Parity checks, disk I/O, and SMART tests are all going to impact each other. Don't see how it could be otherwise. Maybe you just never tried this much I/O during parity check.

 

Agree, especially since your parity disk is now a shingled drive

Edited by johnnie.black
Link to comment
11 minutes ago, trurl said:

 

 

Parity checks, disk I/O, and SMART tests are all going to impact each other. Don't see how it could be otherwise. Maybe you just never tried this much I/O during parity check.

Probably - that's why I wondered in my first post if I needed to let the system 'settle'.

 

I've stopped all array activity that I can while it does the parity check - only array activity will be tonight when the family watch a few shows, which will be just reads not writes.  I'm curious to see how many errors the parity check finds - 2021 so far at 48GB checked

Link to comment
50 minutes ago, DZMM said:

Probably - that's why I wondered in my first post if I needed to let the system 'settle'.

 

I've stopped all array activity that I can while it does the parity check - only array activity will be tonight when the family watch a few shows, which will be just reads not writes.  I'm curious to see how many errors the parity check finds - 2021 so far at 48GB checked

You shouldn't be getting any parity errors if you rebuilt parity after changing the configuration.

Link to comment
Just now, trurl said:

You shouldn't be getting any parity errors if you rebuilt parity after changing the configuration.

I've just restarted the check (only lost an hour) as  all the activity I was doing wasn't helping - it hasn't found any errors yet, so hopefully it'll be ok.

 

I'm just hoping all is well after the check finishes as it's been hard not writing to the server for the last couple of days!

Link to comment

Update:

 

Parity check completed in 19hr 17 min at an average of 115MB/s which I think is about right.  It threw up 569 errors which I don't think it was supposed to after a parity rebuild?

59e71c652c347_FireShotCapture31-Highlander_Main-http___172_30_12.2_Main.thumb.png.e2f35a01872c00a03b10a0389ff822e2.png

I'm now trying to move some files between disks to do some organising. I was doing this before with my last parity drive with turbo write on and it was flying; now it's barely moving.  Look at my disk history - it keeps peaking and then dropping off.  

 

Even creating diagnostics took about 90 seconds.

 

Any ideas what's wrong?  I've got turbo write on.  Do I have a duff parity disk and need to return it, or did something go wrong with the initial build (hence the errors)?  It did the parity check in the expected time, so the disk seems to be meeting its performance envelope.

59e71d99b197b_FireShotCapture32-Highlander_Stats-http___172_30_12.2_Stats.thumb.png.5990d318adfc4b19d0deb28b57d895c7.png

highlander-diagnostics-20171018-1024.zip

Edited by DZMM
Link to comment
32 minutes ago, DZMM said:

Update: reads are just as bad.

 

That means the issue is not parity related.

 

The diagnostics you posted again show multiple reads and writes on several disks at the same time: all disks except parity were reading and there were simultaneous writes to disks 2, 4, 5 and 6 (and parity, obviously). Very low performance is expected with so much simultaneous activity.

Link to comment
13 minutes ago, johnnie.black said:

 

The diagnostics you posted again show multiple reads and writes on several disks at the same time: all disks except parity were reading and there were simultaneous writes to disks 2, 4, 5 and 6 (and parity, obviously). Very low performance is expected with so much simultaneous activity.

I disagree with the last bit.  Before adding the new parity I had more transfers than above going on at the same time as I've been organising my shares AND deluge + nzbget downloading AND people watching shows, movies etc and no problems at all. 

 

Check out the latest diagnostics - I've just rebooted (I couldn't shutdown using the menu or SSH BTW - had to use the case power button) as I wanted to check my cables.  Now 'all' I've got going on is a parity check and deluge & nzbget trying to start, and the parity check can only manage 198.8KB/s - over 460 days to complete....
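For reference, the ETA follows directly from disk size divided by speed. A quick Python sketch, assuming an 8TB parity disk in decimal units and a constant speed for the whole pass (the function name is just illustrative):

```python
def parity_eta_seconds(disk_bytes, speed_bytes_per_s):
    """Time for one full pass over the parity disk at a constant speed."""
    return disk_bytes / speed_bytes_per_s


PARITY_BYTES = 8e12  # assumption: 8TB drive, decimal terabytes

slow = parity_eta_seconds(PARITY_BYTES, 198.8e3)  # 198.8KB/s, decimal KB assumed
normal = parity_eta_seconds(PARITY_BYTES, 115e6)  # 115MB/s, the average of the completed check

print(f"at 198.8KB/s: {slow / 86400:.0f} days")    # about 466 days
print(f"at 115MB/s:   {normal / 3600:.1f} hours")  # about 19.3 hours
```

Both figures line up with what the webui reported: over 460 days at the current speed, versus the 19hr 17min the earlier check actually took.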

 

My machine never had any slowdowns like this before - yes, individual transfers would slow down if I had a lot of concurrent ones, but the overall speed would still be high and the machine would never slow to a halt - particularly when I've not even initiated anything, like in the diags attached, where the machine has just rebooted and has just auto-started dockers and a parity check, i.e. is in an 'idle' state.  My old X300 6TB parity checks never went over 17 hours or so for an ETA, regardless of what I was doing.

 

This is the 2nd 8TB archive I've had in a week (first one was DOA).  Everyone else seems to have no problems with them, but it's looking like I'm the odd one out and I need to try and get a refund or try a 3rd.

FireShot Capture 35 - Highlander_Main - http___172.30.12.2_Main.png

highlander-diagnostics-20171018-1131.zip

Edited by DZMM
Link to comment

It's difficult to quantify what's normal speed with simultaneous writes. In my experience unRAID doesn't deal well with simultaneous writes, and I avoid them at all costs since I've always experienced low performance when doing them. Also don't forget your new parity is shingled: while that usually has no noticeable impact on sequential writes, much lower write performance is expected for random writes, and if your normal usage includes a lot of random and simultaneous writes, a shingled parity may not be the best option for you.

Link to comment

I don't think my problem is as simple as that, even though everything seems to point to the parity drive since I've just changed it. Surely the archive drive can handle a parity check plus deluge running at 1MB/s, like in the last diags?  That's all that was occurring (and nzbget initialising).

 

There are other problems going on.  Similar to not being able to shut down using commands, I've just tried to stop the array to remove the parity (might as well, as I can't rely on it at the moment) and retest, but I can't.  I've had to use the case power button to stop it, and set the array to not auto-start so I could change the config.

Link to comment
2 hours ago, johnnie.black said:

It's difficult to quantify what's normal speed with simultaneous writes. In my experience unRAID doesn't deal well with simultaneous writes, and I avoid them at all costs since I've always experienced low performance when doing them. Also don't forget your new parity is shingled: while that usually has no noticeable impact on sequential writes, much lower write performance is expected for random writes, and if your normal usage includes a lot of random and simultaneous writes, a shingled parity may not be the best option for you.

 

It doesn't normally - I'm moving a lot of files between drives and shares around to take advantage of the extra array drive (old parity) allowing me to group files better to reduce spin-ups.  Normal write activity is normally just deluge/nzbget running at around 2MB/s (crappy ADSL2+ connection) - my VMs are on an unassigned disk so activity there rarely touches the array.

 

I'm going to finish the big move job while I've disconnected the 8TB archive drive, and then I'll rebuild parity afterwards when I've finished moving files in a day or so.  If I find the write (and even read) performance to be bad once normal usage resumes, I'll either move the archive drive to the array and get a faster N300 for Parity, or make more use of my cache pool for writes.  The crappy site I bought the drive from charges a 20% restocking fee so I'm stuck with it now - another reason to use Amazon next time!

Link to comment

Ok, I finished manually rebalancing my drives without the parity drive installed and then I re-added the parity. 

 

The build went fine but my write performance is still bad.   At the moment, the only major activity is:

  1. Kodi scanning music library on disk 4 - R
  2. Deluge_VPN downloading to disk 3 - RW
  3. nzbget downloading to disk 3 - RW
  4. wife watching tv on kodi while she works - RW (write is on cache for livetv buffer)

I think maybe the problem is 2 and 3 where I have over 300 deluge torrents in the queue and 3078 queued nzb items i.e. lots of small constant writes and checks going on.  

 

What I've decided to do is use my cache drive for all writes and stop being so stingy about TBW, and I've modified @Squid's script to only run the mover at a custom threshold and to activate turbo mode when it does.  This should eliminate any parity write speed issues and allow me to continue expanding with Seagate Archives, until something else becomes more cost-effective.  I think when I put the next one in though, I'll make it the parity and move the current one to the array just in case this one is duff.
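The idea behind the threshold check can be sketched roughly as below. This is not @Squid's actual script, just an illustration in Python: the 70% default, the function names and the step labels are all made up, and the real commands for switching write mode and starting the mover are deliberately left out. Restoring the previous write mode afterwards is also my assumption.

```python
import shutil


def cache_used_percent(path):
    """Percentage of the filesystem at `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def should_run_mover(used_percent, threshold_percent=70.0):
    """True once cache usage reaches the threshold (70 is a made-up default)."""
    return used_percent >= threshold_percent


def plan(used_percent, threshold_percent=70.0):
    """Return the ordered steps a scheduled job would take (labels only)."""
    if not should_run_mover(used_percent, threshold_percent):
        return ["skip"]
    # Turbo write is only switched on for the duration of the move.
    return ["enable turbo write", "run mover", "restore previous write mode"]
```

A scheduled job would call something like `plan(cache_used_percent("/mnt/cache"))` and act on the returned steps, so the array only sees one large sequential batch of writes instead of constant small ones.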

 

 

 

Link to comment
