48TB server, under 35 watts idle (now under 24 watts)



(Title has been updated from 40TB under 30 watts to 48TB under 35 watts.)

 

(Update: motherboard replacement saved over 10 watts at idle, to under 24.)

 

Greetings fellow unRAIDers.  I've built a number of unRAID servers over the years, but this is the first one that doesn't have the drives in separate enclosures from the mainboard.  Most of those other builds, including my current primary server, use the SGI SE3016 3U 16-bay SAS enclosure.

 

This build puts low idle power consumption as a top priority.  My current primary server (44TB protected) draws a bit over 60 watts at idle.  I realized I'd have to lose the external enclosures to hit my goal of less than one watt (at idle) per terabyte (protected).

 

There were a few challenges involved with this build, which I'll go into later, but first, the build details.

 

unRAID Version: 6.1.8

 

Mainboard / Processor: ASRock Rack C2550D4I

  http://www.newegg.com/Product/Product.aspx?Item=N82E16813157419

 

Memory: 1x 16GB ECC UDIMM (DDR3-1600), brand unknown

  http://www.ebay.com/itm/Single-16GB-ECC-UDIMM-DDR3-1600MHz-for-Xeon-E3-xxxx-V3-Atom-C2000-based-only-/191591469461

 

Chassis: iStarUSA D-214-MATX

  http://www.newegg.com/Product/Product.aspx?Item=N82E16811165572

 

Drive Cages: 2x three-drive cages from Ark IPS-2U235 chassis

  http://www.arktechinc.net/server-case/2u-rackmount/2u-ipc-2u235.html

 

Power Supply: Seasonic SS-400FL2

  http://www.newegg.com/Product/Product.aspx?Item=N82E16817151097

 

Drives: 6x Seagate 8TB Archive v2 (ST8000AS0002)

  http://www.newegg.com/Product/Product.aspx?Item=N82E16822178748

 

Fans: 2x Xigmatek XSF-F8252

  http://www.newegg.com/Product/Product.aspx?Item=N82E16835233083

 

Cables:

  6x StarTech 12-Inch Latching Round SATA Cable (LSATARND12)

    http://www.amazon.com/gp/product/B006K260T2

  2x Monoprice 18inch Molex to 3x SATA Power Cable

    http://www.amazon.com/gp/product/B002HR8XE4

 

Admittedly, the memory was expensive.  A pair of 8GB ECC UDIMMs could be had for more like $100.  I already had the 16GB ECC UDIMM (four actually).

 

All drives were purchased for $175 - $200 each.  I also got a good price on the mainboard.

 

So not including the memory and the drives (and fans/cables):

  $200 - mainboard

  $120 - power supply

    $60 - chassis

--------

  $380 - TOTAL

 

There were three main difficulties with this build.  First, I wanted ECC memory, because I won't be using a cache drive, but will instead achieve a similar effect with "lots of memory" (and bumping up dirty_ratio).  Without that requirement, there are many more mainboard options.
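
Concretely, that just means raising vm.dirty_ratio so more incoming data can accumulate in RAM before being flushed to disk.  A minimal sketch (75 is the value I eventually settled on; on unRAID the same line can go in /boot/config/go to persist across reboots):

  # allow up to 75% of RAM to hold not-yet-written data (stock is 20%)
  sysctl -w vm.dirty_ratio=75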

 

The second difficulty was wanting to use a shallow-depth chassis.  Almost all of my computer hardware is rackmount, but the racks themselves are only 16" - 18" deep.  I have a few 20" - 26" deep chassis installed, and could have easily found a workable 2U chassis within that depth range.  I ended up with one listed at 17" deep (just measured: it's more like 15.5"), and yes, it is on the tight side.

 

The final difficulty involved the Seagate 8TB Archive drives, which lack the center side mounting points.  Most drive cages want to use that mounting point; I had to dig into my spare parts to find cages that use only the two end points.

 

Here are the numbers.

 

Power Consumption:

  idle (all drives spun down): 29.1 watts

  reading (only one drive spun up): 35.0 watts

  writing (only two drives spun up): 46.5 watts

  all drives spun up: 58.2 watts

  peak at boot up: <140 watts

 

I haven't run a Parity Check since switching to the latest power supply, but with the previous one I saw numbers ranging from just under 80 watts at the beginning to just over 70 watts at the end.

 

After a fair bit of Tunables testing, I arrived at disk settings that yielded this Parity Check time and speed:

  Duration: 15 hours, 6 minutes, 38 seconds.

  Average speed: 147.1 MB/s

 

Drive temperatures during a Parity Check get into the low-to-mid-30Cs.

 

I just checked write speeds to the one drive that's only 5/8 full and got pretty consistent speeds around 60MB/s (after the memory filled up).  Probably for a near-empty drive it'd be closer to 80MB/s, and I'd guess about 50MB/s for a near-full drive.

 

Future directions . . . I'll bump up the memory to at least 32GB, then see how well that works as a cache-drive substitute, going all the way to 64GB if needed.  And, I have a couple more 8TB drives; I plan to install a 3-bay hotswap cage in the two ODD bays, then add at least one more data drive, and probably a "warm spare" (leaving the third bay open for "guest" drives).  To avoid needing hdparm to get the spare drive to spin down, I'm going to try just making it the cache drive (but not actually enabling caching); it should then spin down 15 minutes after booting and stay that way.  This server will be on 24x365, only being taken down for maintenance (hence the priority on low idle power consumption).
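
For reference, the hdparm approach I'm hoping to avoid would be something like this (a sketch; /dev/sdX stands in for whatever the spare drive enumerates as):

  hdparm -y /dev/sdX       # put the drive into standby immediately
  hdparm -S 180 /dev/sdX   # or: spin down after 15 idle minutes (180 x 5 seconds)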

 

I'll post some pictures in the next post . . . is that 800-pixel maximum width still in force?  Also, even at 768 pixels wide, my pictures are over 100KB each, so to avoid the 192KB-per-post maximum, I guess I'll be making a new post for each picture.


Feel free to post your power consumption numbers here (along with protected capacities).

 

I'm especially interested to see who has gotten under one watt idle per protected terabyte, and what they did to achieve it.

 

And thanks in advance to any who might offer feedback/criticism of my build.

 

In that regard, right off I'll say that I understand the additional risk that using such large drives incurs (with over 15 hours for a data or parity rebuild).  This is where a dual-parity solution would shine.  I do have multiple backup servers, and will typically rsync to them weekly.  And if/when a rebuild is needed, I'd likely rsync again before firing it off.
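
For the curious, the weekly sync is nothing fancy - a sketch, with hypothetical share paths:

  # mirror a share to a backup server, deleting files removed from the source
  rsync -av --delete /mnt/user/media/ backup1:/mnt/user/media/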

 

EDIT: the wattmeter is now showing 29.0 watts, with the occasional dip to 28.9 watts.

 


Very nice build => not only did you get under 1w/TB idle consumption, but it's got a CPU with enough "oomph" to run v6 quite nicely ... PassMark well over 2000.    [Not a "high end" setup by any means, but certainly respectable.  I suspect you could also stay under 1w/TB using the octa-core version of the Avoton as well, which would bump the PassMark over 3500.]

 


Nice build,  what are your temps like during parity check?

 

I typically see drive temperatures in the low-to-mid-30Cs during a Parity check.  An earlier version of the build used quieter, lower-powered Fractal Design fans, but drive temperatures started to approach 40C during a Parity Check, so I took a step up on the fans.

 

I suspect you could also stay under 1w/TB using the octa-core version of the Avoton as well,...

 

I actually had the eight-core version of this board in the build for a while, having received it by accident.  Unfortunately I had to return it due to random crashes/reboots.  And you're right, the difference was only 2-3 watts at idle (and about $100).

 

I liked the PSU; it's not oversized and has a great efficiency.

 

It's the most efficient "traditional" PSU at those power levels that I've found yet . . . not that I've tried very many; most of my other builds use a picoPSU (the 80-watt version typically), which are very efficient at low power levels.  I tried an M4-ATX (250 watts) with this build and it didn't quite beat the Seasonic (about 2 watts difference).

 

Which reminds me of another data point asked for in the UCD Guidelines: peak power consumption at boot.  Hard to know for sure what this is, due to the numbers flying by so quickly, but I've watched the meter many times and have yet to see it hit 140 watts (edited into initial post).  With 2-3 more drives being added at some point, that ought to climb by 40-60 watts, so I expect it would still be under 200 watts peak.


With "80 Plus" power supply units, overload isn't that dangerous; a 400w PSU can easily handle 500w of load with a lower efficiency. Even if your 12v line peaks a few amperes greater than the specifications, you should be ok.


 

Good to know, thanks.  I wish they made a 200-watt version of the power supply I used for this build.

 

I just added a second 16GB ECC UDIMM, and idle power consumption only bumped up by 2-3 tenths of a watt.  Next I'll turn on cache_dirs and see how much of the 32GB that uses (under 10% without it).
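
A rough way to watch what cache_dirs costs once it's on (the directory-entry cache shows up as reclaimable slab):

  grep -E 'Slab|SReclaimable' /proc/meminfo
  free -m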


Looks like 32GB ECC memory plus a dirty_ratio of 75 will work fine (for me anyway) as a substitute for a cache drive.  I can write 20GB files at full GbE speeds.

 

I just switched to using this server as primary and my previous primary as "primary backup".  Rsyncing more frequently of course until I'm satisfied with the reliability of this one.

 

 


When writing a large file, like 20-30GB, what do the write speeds start and finish at?  I always tried to increase my write speeds but eventually gave in; without a cache drive, writing to the array is as good as it's going to get.  Here are my results from a 31GB write to the array.  I also like your system.  Must be very quiet.

 

I have 64GB of memory and the copy slows down around 23%.  But I do have a couple of VMs running at all times as well.  I know there was extensive speed testing on the Seagate Archive drives.  Are you able to run speedtest.sh?

 

Thanks.

 

[screenshot: testcopy.png]


My write speeds started and stopped right at full GbE speeds (~112 MB/s) on the 18GiB file I used to confirm that my "RAM cache" idea was working (previously it had dropped to ~60 MB/s early into the copy, as you report).

 

Haven't heard of speedtest.sh, and a search didn't reveal anything.

 

I'll get you a screenshot of a 20GiB file write later tonight.  Probably 30GiB would push just past my dirty_ratio.

 

I suspect you haven't increased your dirty_ratio past the stock 20%.  I'm using 75%.


This curve wasn't nearly as flat as the one I got for the 18GiB write, but it should serve the desired purpose.

 

That's a 24GiB write: it gets 60% through at pretty close to maximum GbE speed, then starts to waver.  I'm not really sure what that's about, but it's definitely not due to running out of cache before the end of the write.

 

Here's something of a writeup on the Linux VM caching parameters:

 

https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

 

And now that I look more closely at that information, the wavering I sometimes see could be due to a lower-than-desirable vm.dirty_expire_centisecs setting . . . stock is 30 seconds, whereas I could (possibly) benefit from something more like 300 seconds (while acknowledging the increased risk of data loss such a high setting creates).  The server *is* powered through a decent UPS so that's one big source of concern pretty well eliminated.  Short of something on the mainboard going south, power supply failure would seem to be the next obvious potential problem point (and where redundant power supplies come in handy!).
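
If I do try it, it's a one-liner; a sketch (the setting is in centiseconds, so 300 seconds is 30000, versus the stock 3000):

  # let dirty data sit in RAM for up to 5 minutes before forced writeback (stock is 30s)
  sysctl -w vm.dirty_expire_centisecs=30000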

 

Will be experimenting with that other VM setting next, although the current performance is quite acceptable.

[screenshot: 24GiB-copy-to-Luna-103.png]


No, I haven't messed around with dirty_ratio.  Only so much you can squeeze out of a lemon.  I don't suspect anything would give me much better results than what I have now.  If anything, another 5 seconds of max write speed.


Well, it's up to you of course, but with 64GB of memory, I'm pretty sure that if you increase your dirty_ratio from 20 to even just 50, you'll see a dramatic increase in how much of that 31GB file gets copied before the speed drops to drive speeds (including parity overhead).  I.e., 20% of 64GB is 12.8GB, whereas 50% of 64GB is 32GB.  So the operating system will allow up to 32GB of memory to fill with "yet to be written to disk" data before it slows down the incoming data, versus just 12.8GB.

 

Granted if you have other stuff using memory, there might not be that much available to use as cache.
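
An easy way to see how much room the cache actually has, and how much unwritten data is pending at any moment:

  grep -E 'MemAvailable|^Dirty:|^Writeback:' /proc/meminfo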


I measured the depth of the case and it's more like 15.5" as opposed to the 17" in the specs (corrected in initial post).  Maybe they measured the handles too?

 

I just wrote a 39GiB file to the server and captured the progress.  There is a fair bit of "wavering" starting at 10% into the copy, but the speeds still stay well above disk-write speeds (including parity overhead) until 90% into the copy, where it finally drops to the mid-40MB/s range.

[screenshot: Luna-39GiB-write-175.png]


Very nice.  Thanks for performing a test.  So the average max speed of one of those Archive drives, writing to the array without a cache drive, would be 40-60MB/sec.  Agree?  I'm wondering whether, with a faster parity drive - not the inexpensive Seagate model - you'd see beginning write speeds of 110MB/sec?

 

I used that MD WRITE command and got 111MB/sec for the entire 31GB file copy.

 

[screenshot: testcopy2.png]


For that write, the target drive was nearly full.  So the mid-40MB/s range represents the lowest speeds these drives can be written to (including parity overhead).  I had estimated 50MB/s in my initial post, so I wasn't too far off.  That estimate was based on seeing ~60MB/s when writing to a 5/8 full drive.  I still estimate ~80MB/s for a nearly-empty drive, although I don't have one installed to test.  So just using very round numbers, averaging a low of 40MB/s with a high of 80MB/s will yield 60MB/s; that's my current best-guess "round number" on an average write speed with these drives.

 

In my experience the parity overhead creates *at least* a 2x reduction in write speed.  So to get 110MB/s of "actual" disk writing speed (not counting any caching benefit), you'd need at least 220MB/s native write speed for the drives involved . . . not just the parity drive but the target data drive as well.  I suppose a 7200RPM version of these drives could hit that, at least on the outer cylinders, but I doubt the inner cylinders could be written at that kind of speed.  7200RPM is about 22% faster than 5900RPM, so "all other things being equal", having observed numbers just over 200MB/s during parity checks/syncs, that's around 244MB/s when sped up to 7200RPM.  Again "in my experience", the inner tracks take a 2x hit on speed compared to the outer tracks, so that's 122MB/s; divided again by two for the parity overhead, that's something more like 60MB/s.
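
Compactly, that chain of estimates:

  200MB/s x 7200/5900 ≈ 244MB/s   (outer-track speed scaled from 5900 to 7200RPM)
  244MB/s / 2         ≈ 122MB/s   (inner tracks, ~2x slower than outer)
  122MB/s / 2         ≈  61MB/s   (after the ~2x parity overhead)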

 

Note that I do see ~112MB/s at the start of my writes, and for "sufficiently small" files (limit seems to be around the mid-30GB mark), that speed persists throughout the copy (notwithstanding the wavering that occasionally drops it into the 80-90MB/s range), due to the large amount of memory I've made available for cache (24GiB).

 

I'm interested to hear more about this MD WRITE command . . . I briefly searched for information on it but did not turn anything up.  Do you have a link to where this is described?


 

Thanks for the link; I was able to track down a good description of how it works from there:

 

http://lime-technology.com/forum/index.php?topic=34521.msg375905#msg375905

 

I may consider using Turbo Write Mode for my backup servers, since I don't configure those drives to spin down (since the servers are only on long enough to finish the rsync).
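
If I understand the linked post correctly, the mode is toggled with something like the following (a sketch - worth verifying against that thread before relying on it):

  mdcmd set md_write_method 1   # reconstruct ("turbo") write: all drives spin up, roughly 2x write speed
  mdcmd set md_write_method 0   # back to the stock read-modify-write mode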


In my experience the parity overhead creates *at least* a 2x reduction in write speed.  So to get 110MB/s of "actual" disk writing speed (not counting any caching benefit), you'd need at least 220MB/s native write speed for the drives involved . . .

 

I recently did some tests with the fastest disks I had available, and there seems to be a ceiling around the 75-80MB/s mark.

 

I used Toshiba DT01ACA100s; these disks start at >200MB/s.  I tested with empty disks and with disks approximately 70% occupied; in both cases write speed was ~75MB/s (closer to 80MB/s when they were empty), but except for a few spikes it wouldn't go above that.

 

I tried various md_write_limit values and never got more than that.  Note that if I use SSDs for the array I can get >100MB/s sustained, so it's not a server limit.

[screenshots: empty.png, 70_full.png]


Those are interesting test results, JB.  I see your point: at 70% full, throughput should be more like half of whatever it is at the outer cylinders, yet you're seeing almost identical write speeds from both tests.

 

Another potential explanation is that those 1TB drives you did this test with don't actually have less data (and thus less throughput) on the inner cylinders compared to the outer cylinders.  How were the Parity Check speeds using (only) these drives?  If they were consistent throughout the pass (assuming no other bottlenecks), that helps confirm the above hypothesis.

 

I'll be adding a seventh drive (and three-bay hotswap cage) at some point, then I'll be able to check best-case write speeds for these drives.

 

 

