dAigo Posted November 15, 2016 Share Posted November 15, 2016 Partition format: GUI shows "Partition format: unknown", while it should be "GPT: 4K-aligned". SMART Info Back in 6.2 Beta 18, unRAID started support for NVMe devices as cache/pool-disk. Beta 22 got the latest version of smartmontools (6.5) Smartmontools supports NVMe starting from version 6.5. Please note, that currently NVMe support is considered as experimental. While the CLI output of smartctl definitly improved, GUI still has no SMART info. Which is ok, for an "experimental feature". It even gives errors in the console, while browsing through the Disk-Info of the GUI, which was reported HERE, but got no answer it seems. I have not seen any hint in the 6.3 notes, so I asume, nothing will change. Probably low priority. I think the issue is a wrong smartctl command issued through the WebGUI. In my case, the NVMe disk is the cache disk and the gui identifies it as "nvme0n1", which is not wrong, but does not work with smartmontools... root@unRAID:~# smartctl -x /dev/nvme0n1 smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Number: INTEL SSDPE2MW400G4 Serial Number: CVCQ5130003F400CGN Firmware Version: 8EV10171 PCI Vendor/Subsystem ID: 0x8086 IEEE OUI Identifier: 0x5cd2e4 Controller ID: 0 Number of Namespaces: 1 Namespace 1 Size/Capacity: 400,088,457,216 [400 GB] Namespace 1 Formatted LBA Size: 512 Local Time is: Tue Nov 15 20:21:12 2016 CET Firmware Updates (0x02): 1 Slot Optional Admin Commands (0x0006): Format Frmw_DL Optional NVM Commands (0x0006): Wr_Unc DS_Mngmt Maximum Data Transfer Size: 32 Pages Supported Power States St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat 0 + 25.00W - - 0 0 0 0 0 0 Supported LBA Sizes (NSID 0x1) Id Fmt Data Metadt Rel_Perf 0 + 512 0 2 1 - 512 8 2 2 - 512 16 2 3 - 4096 0 0 4 - 4096 8 0 5 - 4096 64 0 6 - 4096 128 0 === START OF SMART DATA SECTION === Read NVMe SMART/Health Information failed: NVMe Status 0x02 Don't ask me why, but if you use the "NVMe character device (ex: /dev/nvme0)" instead of the "namespace block device (ex: /dev/nvme0n1)" there is a lot more information. Like "SMART overall-health self-assessment test result: PASSED" root@unRAID:~# smartctl -x /dev/nvme0 smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Number: INTEL SSDPE2MW400G4 Serial Number: CVCQ5130003F400CGN Firmware Version: 8EV10171 PCI Vendor/Subsystem ID: 0x8086 IEEE OUI Identifier: 0x5cd2e4 Controller ID: 0 Number of Namespaces: 1 Namespace 1 Size/Capacity: 400,088,457,216 [400 GB] Namespace 1 Formatted LBA Size: 512 Local Time is: Tue Nov 15 20:37:58 2016 CET Firmware Updates (0x02): 1 Slot Optional Admin Commands (0x0006): Format Frmw_DL Optional NVM Commands (0x0006): Wr_Unc DS_Mngmt Maximum Data Transfer Size: 32 Pages Supported Power States St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat 0 + 25.00W - - 0 0 0 0 0 0 Supported LBA Sizes (NSID 0x1) Id Fmt Data Metadt Rel_Perf 0 + 512 0 2 1 - 512 8 2 2 - 512 16 2 3 - 4096 0 0 4 - 4096 8 0 5 - 4096 64 0 6 - 4096 128 0 === START OF SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff) Critical Warning: 0x00 Temperature: 24 Celsius Available Spare: 100% Available Spare Threshold: 10% Percentage Used: 1% Data Units Read: 18,412,666 [9.42 TB] Data Units Written: 18,429,957 [9.43 TB] Host Read Commands: 224,748,225 Host Write Commands: 233,072,991 Controller Busy Time: 0 Power Cycles: 152 Power On Hours: 8,784 Unsafe Shutdowns: 1 Media and Data Integrity Errors: 0 Error Information Log Entries: 0 Error Information (NVMe Log 0x01, max 64 entries) Num ErrCount SQId CmdId Status PELoc LBA NSID VS 0 2 1 - 0x400c - 0 - - 1 1 1 - 0x400c - 0 - - According to NVME Specs (Page 92-94) the output does contain all "mandatory" information. (so it "should work" regardless of the vendor...) Its probably a PITA, but I guess there are not many common information between SATA/IDE and NVMe Smart Infos... which means the GUI needs some reworking in that regard. I guess "SMART overall-health self-assessment test result", "Critical Warning", "Media and Data Integrity Errors" and "Error Information Log Entries" would be most usefull in terms of health info and things that could raise an alert/notification. Depending on the amount of work it needs, maybe monitoring "Spare Threshold" as an indication of a "soon to fail" drive (like reallocated events/sectors). For science: I changed the "$port" variable in "smartinfo.php" to a hardcoded "nvme0", ignoring any POST values. See attachments for the result... not that bad for the quickest and most dirty solution I could think of unraid-smart-20161115-2020.zip unraid-smart-20161115-2220_with_nvme0.zip Quote Link to comment
dAigo Posted November 15, 2016 Author Share Posted November 15, 2016 I almost forgot: "Print vendor specific NVMe log pages" is on the list auf 6.6 smartmontools. Not sure if the vendors will find a common way, but at least for current Intel NVMe SSDs, there is a nice reference: Intel® Solid-State Drive DC P3700 Series Specifications (Page 26) Quote Link to comment
RobJ Posted November 16, 2016 Share Posted November 16, 2016 I'm interpreting your results a little differently, and I could be wrong, but it looks like the report may have aborted in both cases, more obviously in the first case. From what I read, the symbol /dev/nvme0 is the broadcast address, for all devices associated with nvme0, and /dev/nvme0n1 is the specific device. That results in different report results. More importantly, I don't see the device type listed in the identity info, so I think you need to specify that with the -d option, -d nvme. Try using the command smartctl -a -d nvme /dev/nvme0n1. If you prefer, use -x instead of -a. Quote Link to comment
dAigo Posted November 16, 2016 Author Share Posted November 16, 2016 I'm interpreting your results a little differently, and I could be wrong, but it looks like the report may have aborted in both cases, more obviously in the first case. From what I read, the symbol /dev/nvme0 is the broadcast address, for all devices associated with nvme0, and /dev/nvme0n1 is the specific device. That results in different report results. Thats why I said "don't ask me why" it works I am under the same impression as you are. Maybe its a smartmontools bug (experimental after all) but clearly nvm0 gives correct results in regard of the nvme-logs that are specified in the official specs. Currently the GUI ALWAYS states "Unavailable - disk must be spun up" under "Last SMART Results". But I am writng this on a vm thats running on the cache disk, so its NOT spun down. That also changed with nvm0, but the buttons for smart tests were still greyed out. I am not sure, that nvme devices even works the same as old devices. It was designed for flash memory, so most of the old SMART mechanics are useless, because sectors and read/write operations are handled differently. Not sure a "Self Test" makes sense on a flash drive due to that reason... why running a self test, if the specs makes it mandatory to log every error in a log? As long as there is free "Spare" space on the disk, "bad/worn out" flash cells should be "replaced". More importantly, I don't see the device type listed in the identity info, so I think you need to specify that with the -d option, -d nvme. Try using the command smartctl -a -d nvme /dev/nvme0n1. If you prefer, use -x instead of -a. Same result, nothing changed as far as I can see. root@unRAID:~# smartctl -a -d nvme /dev/nvme0n1 smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Number: INTEL SSDPE2MW400G4 Serial Number: CVCQ5130003F400CGN Firmware Version: 8EV10171 PCI Vendor/Subsystem ID: 0x8086 IEEE OUI Identifier: 0x5cd2e4 Controller ID: 0 Number of Namespaces: 1 Namespace 1 Size/Capacity: 400,088,457,216 [400 GB] Namespace 1 Formatted LBA Size: 512 Local Time is: Wed Nov 16 07:06:07 2016 CET Firmware Updates (0x02): 1 Slot Optional Admin Commands (0x0006): Format Frmw_DL Optional NVM Commands (0x0006): Wr_Unc DS_Mngmt Maximum Data Transfer Size: 32 Pages Supported Power States St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat 0 + 25.00W - - 0 0 0 0 0 0 Supported LBA Sizes (NSID 0x1) Id Fmt Data Metadt Rel_Perf 0 + 512 0 2 1 - 512 8 2 2 - 512 16 2 3 - 4096 0 0 4 - 4096 8 0 5 - 4096 64 0 6 - 4096 128 0 === START OF SMART DATA SECTION === Read NVMe SMART/Health Information failed: NVMe Status 0x02 root@unRAID:~# smartctl -a -d nvme /dev/nvme0 smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Number: INTEL SSDPE2MW400G4 Serial Number: CVCQ5130003F400CGN Firmware Version: 8EV10171 PCI Vendor/Subsystem ID: 0x8086 IEEE OUI Identifier: 0x5cd2e4 Controller ID: 0 Number of Namespaces: 1 Namespace 1 Size/Capacity: 400,088,457,216 [400 GB] Namespace 1 Formatted LBA Size: 512 Local Time is: Wed Nov 16 07:06:11 2016 CET Firmware Updates (0x02): 1 Slot Optional Admin Commands (0x0006): Format Frmw_DL Optional NVM Commands (0x0006): Wr_Unc DS_Mngmt Maximum Data Transfer Size: 32 Pages Supported Power States St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat 0 + 25.00W - - 0 0 0 0 0 0 Supported LBA Sizes (NSID 0x1) Id Fmt Data Metadt Rel_Perf 0 + 512 0 2 1 - 512 8 2 2 - 512 16 2 3 - 4096 0 0 4 - 4096 8 0 5 - 4096 64 0 6 - 4096 128 0 === START OF SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff) Critical Warning: 0x00 Temperature: 23 Celsius Available Spare: 100% Available Spare Threshold: 10% Percentage Used: 1% Data Units Read: 18,416,807 [9.42 TB] Data Units Written: 18,433,008 [9.43 TB] Host Read Commands: 224,801,960 Host Write Commands: 233,126,889 Controller Busy Time: 0 Power Cycles: 152 Power On Hours: 8,794 Unsafe Shutdowns: 1 Media and Data Integrity Errors: 0 Error Information Log Entries: 0 Error Information (NVMe Log 0x01, max 64 entries) Num ErrCount SQId CmdId Status PELoc LBA NSID VS 0 2 1 - 0x400c - 0 - - 1 1 1 - 0x400c - 0 - - Quote Link to comment
RobJ Posted November 16, 2016 Share Posted November 16, 2016 Thank you for trying that! Apparently it auto-recognized the nvme type, and didn't need it specified. Wish some of the other types could do that. I believe we're going to need a newer smartmontools, fully capable of reading all of the SMART info. Trying to use the vendor-specific log pages is a non-starter I think, because it means custom programming for each vendor. And if SMART history is any guide, it will mean custom changes for the different models too, even from the same vendor. The Intel document you linked shows that a true SMART attribute table is available, and that's what all SMART software is designed for. Hopefully a newer smartctl will handle it correctly. Quote Link to comment
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.