sooperjoe

  1. Don't do that. 20.20.20.5 and .10 are not private addresses, you will cause yourself issues by using internet routable addresses like that. https://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spaces

     OK, I changed them to 10.10.20.10 and .5. I also disabled the NIC and updated to 6.2 stable, and everything seems to be working. Thanks for your help.
  2. I have an HP ProLiant system running unRAID, which was previously connected to a switch using the 4 onboard gigabit ports. Recently, I wanted to improve speeds between my main PC and the server, so I bought a cheap set of two 10GbE NICs on eBay (HP CONNECTX2). I installed one in the server and one in my PC, and connected them directly.

     I set up the new network interface in unRAID with an IP of 20.20.20.10, and gave the new network card in my PC an IP of 20.20.20.5 in Windows. This is outside of the main network's range (10.10.10.x). The connection seems to be fine: I can connect to the unRAID server from my PC, and I get pretty much the write speeds that I was expecting (350-410MB/s writing to cache). The read speeds don't seem any better, but maybe I can find some tweaks to improve that.

     The problem I'm having is that the unRAID server seems to be trying to use the new interface for all of its own traffic, like checking for updates for Plugins and Dockers. All of my Plugins show "no update" in the status column, and the Dockers all show that an update is available, but the update fails when I try it.

     So is there a better way for me to set up the 10GbE interface in unRAID that would solve these problems, or is there a way for me to tell unRAID to use the onboard interfaces for plugin/docker updates?

     Windows Settings: http://i.imgur.com/S9OJEIl.png
     unRAID Network Settings: http://i.imgur.com/nJcvXKn.png

     Thanks for your time!
  3. Sorry, I forgot to mention this in the post. I have tried running the scrub command in the webui. If the problem has not occurred yet, it completes the full scrub (50GBish of the drive) without errors. If I try it after the problem has occurred (when everything is broken and the drive has gone into read-only mode), it just says aborted after 00:00:00, which I guess is to be expected.

     If I simply copy all of the contents of the cache drive to some other location, then reformat it (or reformat it as XFS, as the documentation suggests) and copy everything back, what are the chances that all of my VMs and Dockers will be fine? The docs make it seem like that procedure is not super reliable.
  4. Hi, I'm currently running v6.2beta21, but this has been a recurring issue for at least the past few beta versions, if I remember correctly.

     What happens is that the machine will run fine for 12-24 hours or sometimes even longer, but then I will come back and try to watch a movie on Plex, or start work on the VM I use as a development server, and everything will be dead. The sites hosted on the VM all show an error in the browser that I don't see anywhere else, my SSH windows to that machine are all errored out, and Plex shows "Media cannot be located" or something along those lines whenever I try to open a show or movie.

     The problem, to me, appears to be a failed read or write operation on the cache drive (where all of the VMs, Plex, etc. are held), which then results in the drive being put into read-only mode. This seems like a general hardware issue, but the drive shows 0 errors and always has. This has happened probably 100 times by now, and I have never seen anything other than 0 errors in the cache drive row, or for any other drive. When the system is up and working, it's flawless.

     So my first thought was that it's an issue with the HBA card, the cables, or even the hot-swap bays. I moved the drives around in the bays so that the cache drive was in the place of one of the 4TB data drives that had been in the system for months with no issues. When I restarted, everything was normal, but the issue still happened at what seems like the same rate. The problem doesn't show up as a drive error, even though that seems to be what is happening.

     If you guys don't think that it's the drive, I could try taking the SSD out of the hot-swap bay and plugging it straight into a SATA cable. I just don't see how it could be I/O, because this always seems to happen when I'm not near my PC, and when I am near my PC I am regularly running a lot of workload through that drive; it hosts databases, etc.
     What are the chances that it has an I/O error when I'm not using it, compared to when it's doing thousands of reads/writes per minute while I am working?

     Here is what I think is the "important" part of the syslog (I've also included the full thing at the bottom because I could definitely be wrong). This is where it seems the error starts.

     May 28 06:51:51 TOWER kernel: sd 1:0:3:0: [sde] tag#0 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: sas_address(0x4433221104000000), phy(4)
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: enclosure_logical_id(0x500605b0079b2610),slot(4)
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: handle(0x000b), ioc_status(success)(0x0000), smid(19)
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: request_len(0), underflow(0), resid(0)
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: tag(65535), transfer_count(0), sc->result(0x00000000)
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
     May 28 06:51:51 TOWER kernel: mpt2sas_cm0: [sense_key,asc,ascq]: [0x06,0x29,0x00], count(18)
     May 28 06:51:51 TOWER kernel: blk_update_request: 13 callbacks suppressed
     May 28 06:51:51 TOWER kernel: blk_update_request: I/O error, dev sde, sector 0
     May 28 06:51:51 TOWER kernel: btrfs_dev_stat_print_on_error: 13 callbacks suppressed
     May 28 06:51:51 TOWER kernel: BTRFS error (device sde1): bdev /dev/sde1 errs: wr 788, rd 9, flush 1, corrupt 0, gen 0
     May 28 06:51:51 TOWER kernel: BTRFS: error (device sde1) in write_all_supers:3620: errno=-5 IO failure (errors while submitting device barriers.)
     May 28 06:51:51 TOWER kernel: BTRFS info (device sde1): forced readonly
     May 28 06:51:51 TOWER kernel: ------------[ cut here ]------------
     May 28 06:51:51 TOWER kernel: WARNING: CPU: 6 PID: 754 at fs/btrfs/tree-log.c:2936 btrfs_sync_log+0x7a3/0x9c5()
     May 28 06:51:51 TOWER kernel: BTRFS: Transaction aborted (error -5)
     May 28 06:51:51 TOWER kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables vhost_net vhost macvtap macvlan xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat md_mod tun bonding coretemp kvm_intel kvm ata_piix mpt3sas bnx2 hpsa raid_class scsi_transport_sas ipmi_si pcc_cpufreq acpi_cpufreq
     May 28 06:51:51 TOWER kernel: CPU: 6 PID: 754 Comm: qemu-system-x86 Tainted: G I 4.4.6-unRAID #1
     May 28 06:51:51 TOWER kernel: Hardware name: HP ProLiant DL380 G6, BIOS P62 03/30/2010
     May 28 06:51:51 TOWER kernel: 0000000000000000 ffff880224e27ca0 ffffffff813688da ffff880224e27ce8
     May 28 06:51:51 TOWER kernel: 0000000000000b78 ffff880224e27cd8 ffffffff8104a28a ffffffff812f65d3
     May 28 06:51:51 TOWER kernel: ffff8807087d6800 ffff8800979b7800 00000000fffffffb ffff8801f95a9800
     May 28 06:51:51 TOWER kernel: Call Trace:
     May 28 06:51:51 TOWER kernel: [<ffffffff813688da>] dump_stack+0x61/0x7e
     May 28 06:51:51 TOWER kernel: [<ffffffff8104a28a>] warn_slowpath_common+0x8f/0xa8
     May 28 06:51:51 TOWER kernel: [<ffffffff812f65d3>] ? btrfs_sync_log+0x7a3/0x9c5
     May 28 06:51:51 TOWER kernel: [<ffffffff8104a2e6>] warn_slowpath_fmt+0x43/0x4b
     May 28 06:51:51 TOWER kernel: [<ffffffff812f65d3>] btrfs_sync_log+0x7a3/0x9c5
     May 28 06:51:51 TOWER kernel: [<ffffffff812d3626>] btrfs_sync_file+0x23a/0x29e
     May 28 06:51:51 TOWER kernel: [<ffffffff812d3626>] ? btrfs_sync_file+0x23a/0x29e
     May 28 06:51:51 TOWER kernel: [<ffffffff8112d146>] vfs_fsync_range+0x87/0x99
     May 28 06:51:51 TOWER kernel: [<ffffffff8112d16f>] vfs_fsync+0x17/0x19
     May 28 06:51:51 TOWER kernel: [<ffffffff8112d19d>] do_fsync+0x2c/0x45
     May 28 06:51:51 TOWER kernel: [<ffffffff8112d3ad>] SyS_fdatasync+0xe/0x12
     May 28 06:51:51 TOWER kernel: [<ffffffff8161a0ae>] entry_SYSCALL_64_fastpath+0x12/0x6d
     May 28 06:51:51 TOWER kernel: ---[ end trace 800bc7cd3c709081 ]---
     May 28 06:51:51 TOWER kernel: BTRFS: error (device sde1) in btrfs_sync_log:2936: errno=-5 IO failure
     May 28 06:51:53 TOWER shfs/user: shfs_create: open: /mnt/cache/Config/mongonew/diagnostic.data/metrics.interim.temp (30) Read-only file system
     May 28 06:52:23 TOWER kernel: loop: Write error at byte offset 3397902336, length 4096.
     May 28 06:52:23 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6636528
     May 28 06:52:23 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:54:31 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 06:55:29 TOWER shfs/user: shfs_write: write: (30) Read-only file system

     Then there are a bunch of entries of that same "Read-only file system" error and then some more stuff:

     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3515129856, length 4096.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6865488
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3515260416, length 512.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6865743
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3515390976, length 1024.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6865998
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3516403712, length 4096.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6867976
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3516534272, length 512.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6868231
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3516664832, length 1024.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6868486
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3515129856, length 4096.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6865488
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3515260416, length 512.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6865743
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3515390976, length 1024.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6865998
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: loop: Write error at byte offset 3516403712, length 4096.
     May 28 06:59:12 TOWER kernel: blk_update_request: I/O error, dev loop0, sector 6867976
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 11, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 12, rd 0, flush 0, corrupt 0, gen 0
     May 28 06:59:12 TOWER kernel: BTRFS: error (device loop0) in btrfs_commit_transaction:2124: errno=-5 IO failure (Error while writing out transaction)
     May 28 06:59:12 TOWER kernel: BTRFS info (device loop0): forced readonly
     May 28 06:59:12 TOWER kernel: BTRFS warning (device loop0): Skipping commit of aborted transaction.
     May 28 06:59:12 TOWER kernel: ------------[ cut here ]------------
     May 28 06:59:12 TOWER kernel: WARNING: CPU: 0 PID: 10860 at fs/btrfs/transaction.c:1746 cleanup_transaction+0x8f/0x24c()
     May 28 06:59:12 TOWER kernel: BTRFS: Transaction aborted (error -5)
     May 28 06:59:12 TOWER kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables vhost_net vhost macvtap macvlan xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat md_mod tun bonding coretemp kvm_intel kvm ata_piix mpt3sas bnx2 hpsa raid_class scsi_transport_sas ipmi_si pcc_cpufreq acpi_cpufreq
     May 28 06:59:12 TOWER kernel: CPU: 0 PID: 10860 Comm: btrfs-transacti Tainted: G W I 4.4.6-unRAID #1
     May 28 06:59:12 TOWER kernel: Hardware name: HP ProLiant DL380 G6, BIOS P62 03/30/2010
     May 28 06:59:12 TOWER kernel: 0000000000000000 ffff8806f079fcd0 ffffffff813688da ffff8806f079fd18
     May 28 06:59:12 TOWER kernel: 00000000000006d2 ffff8806f079fd08 ffffffff8104a28a ffffffff812bff7b
     May 28 06:59:12 TOWER kernel: ffff880208c7f000 ffff88047b8b8a20 ffff88020855caa0 00000000fffffffb
     May 28 06:59:12 TOWER kernel: Call Trace:
     May 28 06:59:12 TOWER kernel: [<ffffffff813688da>] dump_stack+0x61/0x7e
     May 28 06:59:12 TOWER kernel: [<ffffffff8104a28a>] warn_slowpath_common+0x8f/0xa8
     May 28 06:59:12 TOWER kernel: [<ffffffff812bff7b>] ? cleanup_transaction+0x8f/0x24c
     May 28 06:59:12 TOWER kernel: [<ffffffff8104a2e6>] warn_slowpath_fmt+0x43/0x4b
     May 28 06:59:12 TOWER kernel: [<ffffffff812bff7b>] cleanup_transaction+0x8f/0x24c
     May 28 06:59:12 TOWER kernel: [<ffffffff81075fd3>] ? wait_woken+0x6d/0x6d
     May 28 06:59:12 TOWER kernel: [<ffffffff81075b24>] ? __wake_up+0x3f/0x46
     May 28 06:59:12 TOWER kernel: [<ffffffff812c11cd>] btrfs_commit_transaction+0x9c6/0x9e1
     May 28 06:59:12 TOWER kernel: [<ffffffff812bcbaa>] transaction_kthread+0xfa/0x1cd
     May 28 06:59:12 TOWER kernel: [<ffffffff812bcbaa>] ? transaction_kthread+0xfa/0x1cd
     May 28 06:59:12 TOWER kernel: [<ffffffff812bcab0>] ? btrfs_cleanup_transaction+0x45e/0x45e
     May 28 06:59:12 TOWER kernel: [<ffffffff8105f870>] kthread+0xcd/0xd5
     May 28 06:59:12 TOWER kernel: [<ffffffff8105f7a3>] ? kthread_worker_fn+0x137/0x137
     May 28 06:59:12 TOWER kernel: [<ffffffff8161a3ff>] ret_from_fork+0x3f/0x70
     May 28 06:59:12 TOWER kernel: [<ffffffff8105f7a3>] ? kthread_worker_fn+0x137/0x137
     May 28 06:59:12 TOWER kernel: ---[ end trace 800bc7cd3c709082 ]---
     May 28 06:59:12 TOWER kernel: BTRFS: error (device loop0) in cleanup_transaction:1746: errno=-5 IO failure
     May 28 06:59:12 TOWER kernel: BTRFS info (device loop0): delayed_refs has NO entry
     May 28 07:06:00 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:06:00 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:06:00 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:06:00 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:06:00 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:06:00 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:10:20 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:10:20 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:10:20 TOWER shfs/user: shfs_write: write: (30) Read-only file system
     May 28 07:10:20 TOWER shfs/user: shfs_write: write: (30) Read-only file system

     I tried to attach the full log from boot to the error, but it's way too large, so I uploaded it here: https://mega.nz/#!XMtyWbRR!TO_UHT1biwgpugU4Vj99lXcCy-ropyEPR00eEWXta1U. I actually rebooted the array after this, but none of that shows. Also, when I try to download the syslog, I just get a .zip with an empty txt file, so I copied this from the syslog page of the webui.

     Some other info that might be useful:

     - The cache is a single drive, a Samsung 830 Series SSD.
     - There are 4 array drives (3 data, 1 parity). All are 4TB WD Red: 2 Pros, 2 non-Pros.
     - The system itself is an HP ProLiant w/ 2 E5540s and 28GB of RAM.
     - The HBA is an LSI 9200-8e, which goes out via 2 SFF-8088 cables into a little SFF-8088 to SFF-8087 adapter in an expansion slot of a separate chassis, and then to two of these hot-swap bays: http://www.amazon.com/Rosewill-5-25-Inch-3-5-Inch-Hot-swap-SATAIII/dp/B00DGZ42SM
     - The SFF/SATA cables and adapters are cheap Chinese things, but again, what are the chances of these problems being related to I/O and yet not happening when an actual workload is going through them? Also, all of the data drives are in the same bays and have never had an issue.

     Thanks for your time,
     Joe
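
For anyone digging through a syslog like the one in the last post, the kernel lines already carry btrfs's own per-device error counters (for example wr 788, rd 9, flush 1 on sde1) and the moment each filesystem was forced read-only. The following is a minimal Python sketch for pulling those out of pasted log text. The function name `summarize` and the `SAMPLE` list are made up for illustration, the regular expressions are assumptions based only on the message format visible in the excerpt, and the four sample lines are copied from it.

```python
import re

# Assumed format, taken from the excerpt above:
#   "BTRFS error (device X): bdev /dev/X errs: wr N, rd N, flush N, corrupt N, gen N"
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
# Assumed format: "<Mon DD HH:MM:SS> ... BTRFS info (device X): forced readonly"
READONLY_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) .*"
    r"BTRFS info \(device (?P<dev>[^)]+)\): forced readonly"
)
FIELDS = ("wr", "rd", "flush", "corrupt", "gen")


def summarize(lines):
    """Return (latest error counters per device, first read-only time per device)."""
    counters = {}
    readonly_at = {}
    for line in lines:
        m = ERRS_RE.search(line)
        if m:
            # Later lines overwrite earlier ones, so the last counters seen win.
            counters[m["dev"]] = {f: int(m[f]) for f in FIELDS}
            continue
        m = READONLY_RE.search(line)
        if m:
            # Keep only the first time a device was forced read-only.
            readonly_at.setdefault(m["dev"], m["ts"])
    return counters, readonly_at


# Four lines copied from the syslog excerpt in the post.
SAMPLE = [
    "May 28 06:51:51 TOWER kernel: BTRFS error (device sde1): bdev /dev/sde1 errs: wr 788, rd 9, flush 1, corrupt 0, gen 0",
    "May 28 06:51:51 TOWER kernel: BTRFS info (device sde1): forced readonly",
    "May 28 06:59:12 TOWER kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 12, rd 0, flush 0, corrupt 0, gen 0",
    "May 28 06:59:12 TOWER kernel: BTRFS info (device loop0): forced readonly",
]

if __name__ == "__main__":
    counters, readonly_at = summarize(SAMPLE)
    for dev, errs in counters.items():
        print(dev, errs, "read-only since", readonly_at.get(dev, "n/a"))
```

To run it against a full log, the lines of the saved syslog file can be passed in place of `SAMPLE`, e.g. `summarize(open(path))` with `path` pointing at wherever the log was saved.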
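
On the addressing advice in the 10GbE thread (posts 1 and 2 above): whether an address falls in a private range can be checked with Python's standard `ipaddress` module. This is only a small sketch. The helper names `describe` and `same_subnet` are made up here, the /24 prefix is an assumption (the posts never state a netmask), and 10.10.10.1 is a hypothetical stand-in for a host on the main 10.10.10.x network.

```python
import ipaddress


def describe(addr):
    """Label an IPv4/IPv6 address as private or publicly routable."""
    ip = ipaddress.ip_address(addr)
    return "private" if ip.is_private else "publicly routable"


def same_subnet(a, b, prefix=24):
    """True if b lies in the network containing a (prefix length is an assumption)."""
    net = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in net


if __name__ == "__main__":
    # The original and the corrected addresses from the thread.
    for addr in ("20.20.20.5", "20.20.20.10", "10.10.20.5", "10.10.20.10"):
        print(addr, "->", describe(addr))

    # The two ends of the direct 10GbE link share a subnet...
    print(same_subnet("10.10.20.5", "10.10.20.10"))
    # ...and stay separate from the main LAN (10.10.10.1 is hypothetical).
    print(same_subnet("10.10.20.10", "10.10.10.1"))
```

Keeping the point-to-point link on its own private subnet, distinct from the main LAN, is what the corrected 10.10.20.x addresses achieve.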