[SOLVED] Write speed dropped from 100 Mbps to 1 Mbps - UGH



I'm not sure what happened, but I was moving along fine with writes around 100 Mbps and now I'm crawling :/

 

I have a cache drive and am attaching/including other info.

 

Any thoughts?

 

root@Tower:~# ethtool eth0

Settings for eth0:

        Supported ports: [ TP ]

        Supported link modes:  10baseT/Half 10baseT/Full

                                100baseT/Half 100baseT/Full

                                1000baseT/Full

        Supported pause frame use: No

        Supports auto-negotiation: Yes

        Advertised link modes:  10baseT/Half 10baseT/Full

                                100baseT/Half 100baseT/Full

                                1000baseT/Full

        Advertised pause frame use: No

        Advertised auto-negotiation: Yes

        Speed: 1000Mb/s

        Duplex: Full

        Port: Twisted Pair

        PHYAD: 2

        Transceiver: internal

        Auto-negotiation: on

        MDI-X: on (auto)

        Supports Wake-on: pumbg

        Wake-on: g

        Current message level: 0x00000007 (7)

                              drv probe link

        Link detected: yes

root@Tower:~# ifconfig eth0

eth0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500

        ether 18:03:73:33:53:c5  txqueuelen 1000  (Ethernet)

        RX packets 25750839  bytes 6922097620 (6.4 GiB)

        RX errors 0  dropped 5  overruns 0  frame 0

        TX packets 37747556  bytes 44072732949 (41.0 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        device interrupt 20  memory 0xf7f00000-f7f20000

 

root@Tower:~# ifconfig eth0

eth0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500

        ether 18:03:73:33:53:c5  txqueuelen 1000  (Ethernet)

        RX packets 25751424  bytes 6922191667 (6.4 GiB)

        RX errors 0  dropped 5  overruns 0  frame 0

        TX packets 37748172  bytes 44072789163 (41.0 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        device interrupt 20  memory 0xf7f00000-f7f20000
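
For anyone comparing the two ifconfig snapshots above: the byte counters can be subtracted to see how much traffic moved between the runs. This is only a rough sketch. The time between the two runs was not recorded, so the interval used below is a made-up placeholder, and the helper names are hypothetical.

```python
import re

def parse_ifconfig_bytes(text):
    """Return (rx_bytes, tx_bytes) from one block of ifconfig output."""
    rx = re.search(r"RX packets \d+\s+bytes (\d+)", text)
    tx = re.search(r"TX packets \d+\s+bytes (\d+)", text)
    return int(rx.group(1)), int(tx.group(1))

def rate_mb_per_s(delta_bytes, interval_s):
    """Convert a byte delta over interval_s seconds to MB/s (decimal megabytes)."""
    return delta_bytes / interval_s / 1_000_000

# Counter lines copied from the two ifconfig runs in the post.
first = """
        RX packets 25750839  bytes 6922097620 (6.4 GiB)
        TX packets 37747556  bytes 44072732949 (41.0 GiB)
"""
second = """
        RX packets 25751424  bytes 6922191667 (6.4 GiB)
        TX packets 37748172  bytes 44072789163 (41.0 GiB)
"""

rx1, tx1 = parse_ifconfig_bytes(first)
rx2, tx2 = parse_ifconfig_bytes(second)
rx_delta = rx2 - rx1
tx_delta = tx2 - tx1

# ASSUMPTION: the real gap between the two runs is unknown; 10 s is a placeholder.
ASSUMED_INTERVAL_S = 10

print(f"RX delta: {rx_delta} bytes, TX delta: {tx_delta} bytes")
print(f"RX rate if the gap was {ASSUMED_INTERVAL_S} s: "
      f"{rate_mb_per_s(rx_delta, ASSUMED_INTERVAL_S):.4f} MB/s")
```

To get a real rate, note the time of each ifconfig run (or run them a fixed number of seconds apart) and pass that interval instead of the placeholder.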

 

root@Tower:~# hdparm -i /dev/sdc

 

/dev/sdc:

 

Model=TOSHIBA MD04ACA500, FwRev=FP2A, SerialNo=55B1K1N9FS9A

Config={ Fixed }

RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=0

BuffType=unknown, BuffSize=unknown, MaxMultSect=16, MultSect=16

CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=9767541168

IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}

PIO modes:  pio0 pio1 pio2 pio3 pio4

DMA modes:  sdma0 sdma1 sdma2 mdma0 mdma1 mdma2

UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5

AdvancedPM=yes: unknown setting WriteCache=enabled

Drive conforms to: Unspecified:  ATA/ATAPI-3,4,5,6,7

 

* signifies the current active mode

 

root@Tower:~# hdparm -tT /dev/sdc

 

/dev/sdc:

Timing cached reads:  22610 MB in  2.00 seconds = 11317.68 MB/sec

Timing buffered disk reads:  44 MB in  3.13 seconds =  14.07 MB/sec

root@Tower:~#
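
The hdparm -tT numbers above can be pulled out and checked against a floor when testing several drives. A minimal sketch follows; the 50 MB/sec threshold is an arbitrary assumption for illustration, not a spec for this drive, and the function names are hypothetical. Note that hdparm reports megabytes per second, while the link speed from ethtool (1000Mb/s) is in megabits, which works out to at most 125 MB/sec on the wire.

```python
import re

# Matches lines such as:
#   Timing buffered disk reads:  44 MB in  3.13 seconds =  14.07 MB/sec
TIMING_LINE = re.compile(r"Timing (.+?):.*?=\s*([\d.]+) MB/sec")

def parse_hdparm_timings(text):
    """Map each 'Timing ...' label in hdparm -tT output to its MB/sec figure."""
    return {m.group(1): float(m.group(2)) for m in TIMING_LINE.finditer(text)}

def is_slow(timings, floor_mb_s=50.0):
    """True if buffered disk reads fall below floor_mb_s (an assumed threshold)."""
    return timings["buffered disk reads"] < floor_mb_s

# Output copied from the post.
sample = """
/dev/sdc:
Timing cached reads:  22610 MB in  2.00 seconds = 11317.68 MB/sec
Timing buffered disk reads:  44 MB in  3.13 seconds =  14.07 MB/sec
"""

timings = parse_hdparm_timings(sample)
print(timings)
print("buffered reads look slow:", is_slow(timings))

# Gigabit Ethernet ceiling in megabytes per second (1000 megabits / 8).
link_ceiling_mb_s = 1000 / 8
print("gigabit link ceiling:", link_ceiling_mb_s, "MB/sec")
```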

syslog.txt

Edited by jcarmi04
Link to comment

Eh, don't think I was copying using Cache previously, but copied directly to disk via the second method and still sitting at 1 Mbps. Any other thoughts or tests I should try? I've tested cabling and pretty much everything I can think of. I can do mc copies at 40-100 Mbps, so unRAID disk to disk seems okay, just not over the LAN. I tried a Windows to Windows copy over LAN and got 100 Mbps...

Link to comment
  • 1 month later...
  • 2 weeks later...

John_M: I had done A LOT of searches for Windows, unRAID, NAS, smb, etc solutions and just did a re-check again...got NUTHIN.

 

Couple new notes:

 

1. Tried a Win 10 to Win 10 transfer on my LAN and maintained 100Mbps.

 

2. Retried a disk to disk transfer on unRAID using mc and am getting very poor copy speeds (after the first file, when Windows usually gets that error, the speeds hover between 1-10Mbps).

 

3. Most interesting (possibly): Tried an FTP transfer and it apparently suffers from the same failure, but it was persistent (unlike Windows, which must just time out) - see snippet of transfer log info below. I previously thought it was a SAMBA issue, but now am wondering what's up with my unRAID box :/

 

Status: Logged in

Status: Logged in

Status: Starting upload of C:\As Good As It Gets.mkv

Status: Retrieving directory listing of "/mnt/user/Movies"...

Status: Starting upload of C:\Armageddon.mkv

Status: Retrieving directory listing of "/mnt/user/Movies"...

Command: TYPE I

Response: 200 Switching to Binary mode.

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",134,195).

Command: LIST

Response: 150 Here comes the directory listing.

Response: 226 Directory send OK.

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",237,46).

Command: REST 1081856000

Response: 350 Restart position accepted (1081856000).

Command: STOR As Good As It Gets.mkv

Error: Connection timed out after 20 seconds of inactivity

Error: File transfer failed after transferring 786,432 bytes in 20 seconds

Command: TYPE I

Response: 200 Switching to Binary mode.

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",167,247).

Command: REST 5599938828

Response: 350 Restart position accepted (5599938828).

Command: STOR Armageddon.mkv

Error: Connection timed out after 20 seconds of inactivity

Error: File transfer failed after transferring 786,432 bytes in 20 seconds

Status: Disconnected from server

Status: Disconnected from server

Status: Connecting to "LOCAL IP:PORT"...

Status: Connection established, waiting for welcome message...

Status: Connecting to "LOCAL IP:PORT"...

Status: Connection established, waiting for welcome message...

Status: Insecure server, it does not support FTP over TLS.

Status: Insecure server, it does not support FTP over TLS.

Status: Logged in

Status: Logged in

Status: Starting upload of C:\As Good As It Gets.mkv

Status: Retrieving directory listing of "/mnt/user/Movies"...

Status: Starting upload of C:\Armageddon.mkv

Status: Retrieving directory listing of "/mnt/user/Movies"...

Status: File transfer successful, transferred 914,521,076 bytes in 27 seconds

Command: TYPE I

Response: 200 Switching to Binary mode.

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",173,250).

Command: LIST

Response: 150 Here comes the directory listing.

Response: 226 Directory send OK.

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",219,103).

Command: REST 1081856000

Response: 350 Restart position accepted (1081856000).

Command: STOR As Good As It Gets.mkv

Response: 150 Ok to send data.

Error: Connection timed out after 20 seconds of inactivity

Error: File transfer failed after transferring 837,550,080 bytes in 55 seconds

Status: Starting upload of C:\As Good As It Gets.mkv

Status: Retrieving directory listing of "/mnt/user/Movies"...

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",225,93).

Command: REST 0

Response: 350 Restart position accepted (0).

Command: LIST

Response: 150 Here comes the directory listing.

Response: 226 Directory send OK.

Command: PASV

Response: 227 Entering Passive Mode ("LOCAL IP",52,174).

Command: REST 1916796928

Response: 350 Restart position accepted (1916796928).

Command: STOR As Good As It Gets.mkv

Error: Connection timed out after 20 seconds of inactivity

Error: File transfer failed after transferring 786,432 bytes in 20 seconds

Status: Disconnected from server

Status: Connecting to "LOCAL IP:PORT"...

Status: Connection established, waiting for welcome message...

Status: Insecure server, it does not support FTP over TLS.

Status: Logged in

Status: Starting upload of C:\As Good As It Gets.mkv

Status: Retrieving directory listing of "/mnt/user/Movies"...

Status: File transfer successful, transferred 2,284,476,619 bytes in 72 seconds

Status: Retrieving directory listing of "/mnt/user/Movies"...

Status: Directory listing of "/mnt/user/Movies" successful
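
To put numbers on the log above, the "bytes in N seconds" lines can be converted to MB/s. This is a small sketch; the function name and regex are mine, and only four representative lines from the log are included as sample input.

```python
import re

# Matches both "transferred X bytes in N seconds" and
# "after transferring X bytes in N seconds".
TRANSFER = re.compile(r"transferr(?:ed|ing) ([\d,]+) bytes in (\d+) seconds")

def transfer_rates(log_text):
    """Return a list of (bytes, seconds, MB/s) for every transfer summary line."""
    rates = []
    for m in TRANSFER.finditer(log_text):
        size = int(m.group(1).replace(",", ""))
        secs = int(m.group(2))
        rates.append((size, secs, size / secs / 1_000_000))
    return rates

# Lines copied from the FTP client log in the post.
sample_log = """
Error: File transfer failed after transferring 786,432 bytes in 20 seconds
Status: File transfer successful, transferred 914,521,076 bytes in 27 seconds
Error: File transfer failed after transferring 837,550,080 bytes in 55 seconds
Status: File transfer successful, transferred 2,284,476,619 bytes in 72 seconds
"""

rates = transfer_rates(sample_log)
for size, secs, mb_s in rates:
    print(f"{size:>13,} bytes in {secs:>3} s = {mb_s:8.3f} MB/s")
```

The successful transfers work out to roughly 30 MB/s or more, while every stalled attempt stops at exactly 786,432 bytes before the 20 second inactivity timeout.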

Link to comment

trurl/All: Log file attached. I've got notes from a few tests I ran today below and it appears more to do with my unRAID server's ability to refresh/buffer/something, as you can see from my initial notes. Happy to do more specific testing if anyone has any ideas.

 

Started testing at 8:33a ET. Copied a 6 GB .mkv file to the server from my Win10 PC and it transferred fine (folder 1). After copying finished, waited 1 minute and tried the same file again, which failed (folder 2). Waited 5 minutes and repeated the test at increasing 1 minute intervals until I determined that the wait needed between copies of the same 6 GB file for success is approximately 3 minutes. This success only holds when copying 1 large file; when copying several, the failures (timeouts, Windows error) show up on the second file.

 

Folder #s

1-8:33a-copied successfully

2-1 min later-failed

3-wait 5 mins then copied successfully

4-2 min later-failed

5-wait 5 mins then copied successfully

6-3 min later-copied successfully

7-3 min later-copied successfully

8-3 min later tried to copy 5 .mkv files (25 GB)-1 file copied, per usual, and the rest failed

 

9-Same share (possibly different disk) to same share-1 file copied successfully

10-3 min later-Same share (possibly different disk) to same share-1 file copied successfully - started at 500 Mbps

 

11-Different share (probably a different disk) to that original share-1 file (7 GB) copied successfully

12-3 min later-Different share (probably a different disk) to that original share-1 file (7 GB) copied successfully

 

13-Since I forgot above, 3 min later-Same share (possibly different disk) to same share-5 files-failed

 

14-MC-Same share (possibly different disk) to same share-1 file copied successfully

15-3 min later-MC-Same share (possibly different disk) to same share-1 file copied successfully

 

MULT-copied 5 files from Win10 PC same share (possibly different disk) to same share-failed

tower118-syslog-20161227-1014.zip
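
The wait-then-copy test above could be automated roughly as follows. This is only a sketch under assumptions: it assumes Python 3 on the client and that the share is reachable as an ordinary path (mapped drive or mount), and the function names are hypothetical. It treats any OSError from the copy as a failure, which may not line up exactly with the Windows Explorer error.

```python
import os
import shutil
import time

def run_wait_test(src, dst_dir, waits_s, copy=shutil.copyfile, sleep=time.sleep):
    """Copy src into dst_dir once per entry in waits_s, pausing first each time.

    Returns a list of (wait_s, succeeded, seconds_taken) tuples.
    """
    results = []
    for attempt, wait_s in enumerate(waits_s, start=1):
        sleep(wait_s)
        dst = os.path.join(dst_dir, f"attempt{attempt}_{os.path.basename(src)}")
        start = time.monotonic()
        try:
            copy(src, dst)
            ok = True
        except OSError:
            ok = False  # timeouts and share errors surface as OSError
        results.append((wait_s, ok, time.monotonic() - start))
    return results

def shortest_successful_wait(results):
    """Smallest wait that led to a successful copy, or None if none succeeded."""
    good = [wait_s for wait_s, ok, _ in results if ok]
    return min(good) if good else None

# Example call (paths are placeholders, not taken from the post):
# results = run_wait_test(r"C:\test.mkv", r"\\Tower\Movies", [60, 120, 180, 300])
# print(shortest_successful_wait(results))
```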

Link to comment

I fought with a problem(s) of occasional very slow reads from the array.  Never had an issue with slow copies to the array.  But after playing, googling, experimenting, cursing with/at the problem, I think I finally have it fixed.

 

The first one was that I had an out-of-date 'Network Adapter' driver installed in my Win7 computer. (This is actually the last thing that I found...)

 

The second one was that I had a revisit by the 'Green Bar of Death' in Windows Explorer.  (This was a problem back around 2009-2010.)  I decided to try a replacement file manager and picked Explorer++.  (Mainly because it does not really install itself into Windows.  It is a stand-alone program that you simply click on the executable and it runs...)  I suspect that MS snuck a security update into Windows Explorer in the past few months which broke it again. 

 

While your problem is not identical, you may want to investigate these possibilities. 

 

 

Link to comment

Thanks, Frank1940. I actually have tried two separate Win10 PCs and it's present on both. Plus, with the failures doing MC copies I have to feel this is some type of unRAID issue, yes? I have tried using Teracopy and FTP transfers; while that does not necessarily mimic Explorer++, it does replicate the problem using a slightly different method with the same (crummy) outcome. The only benefit of using FTP vs all other methods was that FTP persisted in copying (retries) and the files eventually copied, whereas Win10 and unRAID both "timed out".

Link to comment

Had a breakthrough, though, can't speak to why I was failing. I thought my issue might be related to RAM or possibly a failing Cache Drive. In the main directory that I tried copying the files to, I turned use of Cache Drive to "Yes" (think I turned it off as part of troubleshooting) and it appears to be holding 100 Mbps + transferring over my network. I'm not sure what would've changed ... and still not sure why I was failing with Cache Drive set to "No". However, will mark as resolved for the time being and revisit if issue(s) rear again.

 

Thanks, all!

Link to comment

I wouldn't abandon trying to isolate what's happening.  Using the cache is likely just masking the issue ... your writes are always to the cache, so you don't "see" the real issue when mover runs -- and is likely writing VERY slow to the array.

 

I didn't read all of the details here, but it sounds like you have a failing parity drive, which is resulting in VERY slow writes to the array.

 

You could try a couple of things ...

 

(a)  replace the parity drive and see if that resolves things

 

(b)  if you don't mind running "at risk" (i.e. unprotected), you could do a New Config without assigning a parity drive, and see if writes to the disk shares are at full speed.

 

Link to comment

garycase: Glad you wrote, cause I think you're right (Parity). I looked a little more and can't even tell - for certain - if my files are being copied from my Cache to a permanent copy with a Parity backup.

 

If I complete a New Config what might result other than proving it's my Parity? I'm imagining I'll need to replace the Parity, yes? Or, will running a New Config possibly "fix" whatever is wrong?

 

Ultimately I'd like to reuse or replace the drive, so want to dig into what's wrong as best as I can...

Link to comment

If you do a New Config and simply do not include a parity drive, then your system should speed up a LOT.  It won't impact any of your data -- it simply will mean you're not fault-tolerant (i.e. you're running "at risk").

 

If that indeed speeds things up as I expect, that will pretty much confirm that the issue is your parity drive.  I'd then get a new drive; assign it as parity, and let the system do a parity sync -- and things should then be back to normal.  [Just toss the old parity drive in that case]

 

If things don't speed up, then the issue is something else -- and you'll need to do a bit more testing to try & isolate it.  But I think it's very likely you'll find that the New Config w/out parity will fix the issue.
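
One way to compare write speed with and without parity assigned is to time a synced write straight to a disk share on the server, which takes the network out of the picture. This is a rough sketch, assuming Python 3 is available on the machine running it; the function name, the default sizes, and the example path are illustrative assumptions.

```python
import os
import time

def write_speed_mb_per_s(path, size_mb=256, chunk_mb=4):
    """Write about size_mb of random data to path, fsync it, and return MB/s.

    The test file is removed afterwards. fsync is included so the figure
    reflects data reaching the disk rather than only the RAM cache.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    written = 0
    start = time.monotonic()
    try:
        with open(path, "wb") as f:
            for _ in range(size_mb // chunk_mb):
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.monotonic() - start
    finally:
        if os.path.exists(path):
            os.remove(path)
    return written / elapsed / 1_000_000

# Example call (the path is a placeholder for one of your disk shares):
# print(write_speed_mb_per_s("/mnt/disk1/speedtest.tmp"))
```

Running it once with parity assigned and once after the New Config without parity gives two numbers to compare; if they are about the same, parity is probably not the bottleneck.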

 

Link to comment

Alright, so ran a New Config anddddd....nothing. Did not assign a Parity, tried to copy to the primary share that had been failing (a share composed of 5 disks) and got the same Windows error. I then reassigned my Parity, performed a Parity Check and got no errors from it.

 

At this point I thought perhaps it was a single drive of my 11 disks (13 total in the unRAID server if counting Parity and Cache) - specifically one of that share. Results all over the map:

 

-Movie Share composed of Disks 1, 5, 8, 10, and 11. Tried copying to the Share again and failed. Created an individual share per disk and Disks 1, 5, 8, and 11 failed with the Windows error. Disk 10 copied successfully though.

 

-Copies to Disks 2, 3, and 6 failed as well.

 

-Copies to Disks 4, 7, and 9 were successful.

 

-I then retried a few disks and the only result that changed was disk 6 had a successful copy.

 

Any thoughts??

Link to comment
