Woke up to a red X



Diagnostics already include the syslog, so there is no reason to post it separately unless it covers a significantly different timeframe than the diagnostics.

 

Moving files off the drive is not necessarily the best response to a disabled disk. It just makes the system work all of the disks a lot harder to emulate the disabled disk. You are going to have to rebuild the disk in any case. The rebuild will contain all of the files, and the rebuild process will work all of the disks again.

 

If there are any critical files on the disk that you don't have backed up, then it does make sense to copy them from the disabled disk to another system.

 

 

Link to comment
13 minutes ago, johnnie.black said:

Looks like the typical SASLP problem. The disk is probably fine, but since it dropped offline you'll need to reboot and grab new diags to get a SMART report.

On the main array tab that disk is showing 6 errors.

13 minutes ago, trurl said:

Diagnostics already include the syslog, so there is no reason to post it separately unless it covers a significantly different timeframe than the diagnostics.

 

Moving files off the drive is not necessarily the best response to a disabled disk. It just makes the system work all of the disks a lot harder to emulate the disabled disk. You are going to have to rebuild the disk in any case. The rebuild will contain all of the files, and the rebuild process will work all of the disks again.

 

If there are any critical files on the disk that you don't have backed up, then it does make sense to copy them from the disabled disk to another system.

 

 

Attached the syslog just to be thorough.

Link to comment

Disk looks fine, you can rebuild to it.

 

With regard to the SASLP issue, it can (and probably will) happen again in the future. These sometimes help:

 

-disable VT-d if you don't need it

-look for a board BIOS update

-use a different PCIe slot if available

 

If none of these help, consider replacing it with an LSI controller.

Link to comment

Well after your suggestion from this post

I moved one controller to the PCI 3 slot. Not satisfied, I bought an HP220 controller for that slot and removed one SASLP controller. My speeds have since doubled compared with my earlier parity checks.

How do you know it is from the SASLP and not from the HP220 controller?

 

Right now my data rebuild is running at 105 MB/s.

Edited by Harro
Link to comment
8 minutes ago, Harro said:

How do you know it is from the SASLP and not from the HP220 controller?

 

Because it's in the syslog:

 

Quote

Apr 13 04:40:43 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Apr 13 04:40:43 Tower kernel: sas: trying to find task 0xffff88011d09ff00
Apr 13 04:40:43 Tower kernel: sas: sas_scsi_find_task: aborting task 0xffff88011d09ff00
Apr 13 04:40:43 Tower kernel: sas: sas_scsi_find_task: task 0xffff88011d09ff00 is aborted
Apr 13 04:40:43 Tower kernel: sas: sas_eh_handle_sas_errors: task 0xffff88011d09ff00 is aborted
Apr 13 04:40:43 Tower kernel: sas: ata11: end_device-1:2: cmd error handler
Apr 13 04:40:43 Tower kernel: sas: ata9: end_device-1:0: dev error handler
Apr 13 04:40:43 Tower kernel: sas: ata10: end_device-1:1: dev error handler
Apr 13 04:40:43 Tower kernel: sas: ata11: end_device-1:2: dev error handler
Apr 13 04:40:43 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 13 04:40:43 Tower kernel: ata11.00: failed command: READ NATIVE MAX ADDRESS EXT
Apr 13 04:40:43 Tower kernel: ata11.00: cmd 27/00:00:00:00:00/00:00:00:00:00/40 tag 12
Apr 13 04:40:43 Tower kernel:         res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Apr 13 04:40:43 Tower kernel: ata11.00: status: { DRDY }
Apr 13 04:40:43 Tower kernel: ata11: hard resetting link
Apr 13 04:40:45 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1435:mvs_I_T_nexus_reset for device[2]:rc= 0
Apr 13 04:40:45 Tower kernel: mvsas 0000:02:00.0: Phy3 : No sig fis
Apr 13 04:40:49 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1870:Release slot [0] tag[0], task [ffff8803db862900]:
Apr 13 04:40:49 Tower kernel: sas: sas_ata_task_done: SAS error 8a

 

Mvsas is the driver used by both the SASLP and the SAS2LP.
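If you want to check a syslog for this yourself, a rough sketch in Python is below. It only looks for the two kinds of mvsas lines visible in the excerpt above (messages tagged with the driver's source path, and messages prefixed with `mvsas` plus a PCI address). The function name `mvsas_evidence` and the regular expressions are my own assumptions for illustration, not part of any Unraid tool.

```python
import re

# Kernel lines emitted from the mvsas driver source path,
# e.g. "kernel: drivers/scsi/mvsas/mv_sas.c 1435:..."
MVSAS_SRC = re.compile(r"kernel: drivers/scsi/mvsas/\S+")
# Kernel lines prefixed with the driver name and a PCI address,
# e.g. "kernel: mvsas 0000:02:00.0: ..."
MVSAS_PCI = re.compile(r"kernel: mvsas (\d{4}:[0-9a-f]{2}:[0-9a-f]{2}\.\d)")

def mvsas_evidence(lines):
    """Return (number of mvsas lines, sorted PCI addresses seen)."""
    count = 0
    pci = set()
    for line in lines:
        m = MVSAS_PCI.search(line)
        if m:
            pci.add(m.group(1))
            count += 1
        elif MVSAS_SRC.search(line):
            count += 1
    return count, sorted(pci)

# Three lines copied from the syslog excerpt in this thread.
sample = [
    "Apr 13 04:40:43 Tower kernel: ata11: hard resetting link",
    "Apr 13 04:40:45 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1435:mvs_I_T_nexus_reset for device[2]:rc= 0",
    "Apr 13 04:40:45 Tower kernel: mvsas 0000:02:00.0: Phy3 : No sig fis",
]
print(mvsas_evidence(sample))
```

A non-zero count next to the ATA errors suggests the affected port hangs off a controller using mvsas. The PCI address it reports can then be matched against the controller cards in the system.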

Link to comment
  • 2 weeks later...
On 4/13/2017 at 9:56 AM, johnnie.black said:

 

Because it's in the syslog:

 

 

Mvsas is the driver used by both the SASLP and the SAS2LP.

I updated plugins (Dyn etc.) and after the update I have run into this again, on the same drive this thread was created about. My question is: what would be a good controller to replace this SUPERMICRO AOC-SASLP-MV8 PCI-Express x4 Low Profile SAS RAID Controller? I am only running 4 ports off this controller, so a 4-port card would be OK.

Apr 24 12:03:25 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Apr 24 12:03:25 Tower kernel: sas: trying to find task 0xffff88014d7b7700
Apr 24 12:03:25 Tower kernel: sas: sas_scsi_find_task: aborting task 0xffff88014d7b7700
Apr 24 12:03:25 Tower kernel: sas: sas_scsi_find_task: task 0xffff88014d7b7700 is aborted
Apr 24 12:03:25 Tower kernel: sas: sas_eh_handle_sas_errors: task 0xffff88014d7b7700 is aborted
Apr 24 12:03:25 Tower kernel: sas: ata12: end_device-2:3: cmd error handler
Apr 24 12:03:25 Tower kernel: sas: ata9: end_device-2:0: dev error handler
Apr 24 12:03:25 Tower kernel: sas: ata10: end_device-2:1: dev error handler
Apr 24 12:03:25 Tower kernel: sas: ata11: end_device-2:2: dev error handler
Apr 24 12:03:25 Tower kernel: sas: ata12: end_device-2:3: dev error handler
Apr 24 12:03:25 Tower kernel: ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 24 12:03:25 Tower kernel: ata12.00: failed command: READ NATIVE MAX ADDRESS EXT
Apr 24 12:03:25 Tower kernel: ata12.00: cmd 27/00:00:00:00:00/00:00:00:00:00/40 tag 5
Apr 24 12:03:25 Tower kernel:         res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Apr 24 12:03:25 Tower kernel: ata12.00: status: { DRDY }
Apr 24 12:03:25 Tower kernel: ata12: hard resetting link
Apr 24 12:03:28 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1435:mvs_I_T_nexus_reset for device[3]:rc= 0
Apr 24 12:03:28 Tower kernel: sas: sas_ata_task_done: SAS error 8a
Apr 24 12:03:28 Tower kernel: ------------[ cut here ]------------

 

Link to comment
