Pauven

unraid-tunables-tester.sh - A New Utility to Optimize unRAID md_* Tunables

719 posts in this topic

This utility is named:  unraid-tunables-tester.sh

 

The current version is 2.2 and is attached at the bottom of this post.  I will maintain this post with future versions (if there are any).

 

 

VERSION HISTORY:

# V2.2: Added support for md_sync_window values down to 8 bytes,
#       Added an extra-low value special pass to FULLAUTO,
#       Fixed a bug that didn't restore values after testing - by Pauven 08/28/2013
# V2.1: Added support for md_sync_window values down to 128 bytes,
#       Fixed a typo on the FULLAUTO option - by Pauven 08/28/2013
# V2.0: Changed the test method from a "time to process bytes" to a "bytes 
#       processed in time" algorithm, lowering CPU utilization during testing,
#       Updated menus to reflect time based options instead of % options,
#       Revamped FULLAUTO with an optimized 2-hour 2-pass process,
#       Added Best Bang for the Buck sizing recommendation,
#       Added logic to autocancel Parity Checks in progress,
#       Added a check to make sure the array is Started  - by Pauven 08/25/2013
# v1.1: Increased update frequency 1000x to improve result accuracy,
#       Polished the formatting of the output data,
#       Various menu improvements and minor logic tweaks,
#       Added menu options for Manual Start/End Overrides,
#       Updated logic for identifying the best result,
#       Extended the range of the FULLAUTO test to 2944,
#       Added a memory use calculation to the report, 
#       Added option to write new params to disk.cfg - by Pauven 08/23/2013
# v1.0: Initial Version - by Pauven 08/21/2013

 

 

EXECUTIVE SUMMARY:

This is a utility that runs very short partial parity checks with different values for the unRAID Tunable parameters md_sync_window, md_write_limit, and md_num_stripes, and reports on the relative performance for each set of values.  Adjusting these values can improve system performance, particularly Parity Check speed, and this utility helps you find the right values for your system.

 

Users can either manually select the test value ranges and test types, or alternatively choose a Fully Automatic mode that runs an algorithm designed to zero in on the best values for your system.

 

Users don't need to know any command line parameters, as all prompts are provided at runtime with friendly guidance and some safety checks.

 

 

SUMMARY:

Since unRAID servers can be built in a limitless number of configurations, it is impossible for Tom and Lime-Technology to know what tunable parameters are correct for each system.  Different amounts of memory and various hardware components (especially HD controllers) directly affect what values work best for your system.  To play it safe, Lime-Technology delivers unRAID with 'safe' stock values that should work with any system, including servers with only 512MB RAM.

 

But how is a user to know what values to use on their server?

 

This utility addresses that problem by testing the three tunable parameters:

  • md_num_stripes
  • md_write_limit
  • md_sync_window

Each test is performed by automatically setting the three values and running a partial Non-Correcting Parity Check, typically less than 1% of a full Parity Check.  By running just a short section of a Parity Check before stopping it, this utility can test multiple values in relatively quick succession (certainly quicker than running a full Parity Check or doing this process manually). 

 

There are no command line parameters; the entire utility is driven through a user prompt system.

 

Each test is timed down to the millisecond, which is important when running shorter tests, so you can determine which set of values are appropriate for your system.

 

This utility can operate in two modes: Manual and Fully Automatic.

  • In Manual mode, you can choose the md_sync_window starting and ending values, the number of increments to test between those values, and the duration of each test.
     
  • In Fully Automatic mode, you simply start it and walk away.  The utility will make two separate passes through the values, honing in on the values that work the best for your server.  Fully Automatic mode is easy and produces great results, and now only takes 2.1 hours to run.

Regardless of what type of tests you run, the output is saved to a file named TunablesReport.txt, which lives in the same directory where you installed the utility.

 

While this utility tests changes to all three tunable parameters, these changes are not permanent.  If you wish to make the settings permanent, you have to choose your preferred values from the report, and manually enter them on the Settings > Disk Settings menu page in unRAID.

 

Additionally, after the tests are completed, the utility sets the tunable values back to unRAID stock values (for safety, in case you forget about setting them).  A reboot will return you to your previously selected values, as will hitting Apply on the Settings > Disk Settings menu page.

 

In case you're wondering, the formula for assigning the three values is of my own design.  It assigns md_num_stripes as approximately 11% bigger than md_write_limit + md_sync_window, rounded to the nearest testing interval.  md_write_limit is also set to the same value as md_sync_window for all values beyond 768 bytes.  Based upon your test parameters (primarily the Interval setting), the md_num_stripes value will calculate differently.  As far as I am aware my logic works okay, but this may have to be revisited in the future if new understandings are gained on how these three values correlate.  There are no published hard and fast rules for how to set the three values together.
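As a rough sketch only (this is not the utility's actual code), the rule described above could look like the following. The function name, the 1.11 multiplier, the integer rounding, and the assumption that md_write_limit stays at 768 until md_sync_window exceeds it are all guesses based on the description and the sample report.

```shell
# Hypothetical helper: derive the three tunables from one md_sync_window value.
# Usage: calc_tunables SYNC_WINDOW INTERVAL
calc_tunables() {
    local sync_window=$1 interval=$2
    local write_limit raw num_stripes

    # Assumption: md_write_limit stays at 768 until md_sync_window exceeds it.
    if [ "$sync_window" -gt 768 ]; then
        write_limit=$sync_window
    else
        write_limit=768
    fi

    # Roughly 11% more than the sum (integer math, so the fraction is dropped).
    raw=$(( (write_limit + sync_window) * 111 / 100 ))

    # Round to the nearest multiple of the testing interval.
    num_stripes=$(( (raw + interval / 2) / interval * interval ))

    echo "md_num_stripes=$num_stripes md_write_limit=$write_limit md_sync_window=$sync_window"
}

calc_tunables 512 128
calc_tunables 1280 128
```

For md_sync_window=512 this yields md_num_stripes=1408, the same as the first row of the sample report in the next post. Other rows may not match, since the utility's exact multiplier and rounding are not spelled out here.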

 

 

OBLIGATORY WARNINGS:

Yup, here's my CYA prose, but since it is for your benefit, I suggest you read it.

 

Outside of writing the results report file (most likely to your flash drive), this utility does not do any writing to the server.  The Parity Checks are all performed in a read-only, non-correcting fashion.

 

But that doesn't mean something can't go horribly wrong.  For one, simply using this utility may stress your server to the breaking point.  Weak hardware may meet an early demise.  While the utility tries to guide you towards safe selections, there's nothing stopping you from running 2561 Extreme level tests (each of which reads 4% of the array) for a resulting test that is longer than 100 full Parity Checks!  Even in Fully Automatic mode, the server will be reading from the disk for the equivalent of about 1.5 full Parity Checks, and those reads will be focused on the first 5% of your hard drives. 

 

All array drives will be spinning simultaneously (smaller drives won't spin down like a normal Parity Check permits) and heat will build up in your system.  Ensure you have good cooling.

 

Running these tests, especially Fully Automatic, may be harder on your system than a full Parity Check.

 

You have to decide for yourself which tests are appropriate for your server and your comfort level.  If you are unsure, the default values are a pretty safe way to go. 

 

And if you decide after starting a test that you want to abort it, just hit CTRL-C on your keyboard.  If you do this, the Parity Check will most likely still be running, but you can Cancel it through the GUI.

 

Another issue that can crop up is out of memory errors.  The three unRAID tunable values are directly related to memory allocation to the md subsystem.  Some users have reported Kernel OOPS and Out Of Memory conditions when adjusting the unRAID Tunables, though it seems these users are often running many add-ons and plug-ins that compete for memory. 

 

This utility is capable of pushing memory utilization extremely high, especially in Fully Automatic mode, which scans a very large range of assignable values beyond what you may rationally consider assigning.

 

Typically, running out of memory is not a fatal event as long as you are not writing to your array. If you are writing to your array when a memory error occurs, data loss may occur!

 

The best advice is to not use your server at all during the test, and to disable 3rd party scripts, plug-ins, add-ons and yes even GUI/menu replacements - something made easier with unRAID's new Safe Boot feature.

 

One last caution:  If you have less than 4GB of RAM, this utility may not be for you.  That goes doubly if you are running a barebones, lightweight 512MB server, which should probably stay at the default Tunable values. This utility was designed and tested on a server with 4GB, and ran there without any issues, but you may run out of memory faster and easier if you have less memory to start with.

 

 

INSTALLATION:

Installation is simple. 

1. Download the file unraid-tunables-tester.sh.v2_2.txt (current version at the bottom of this post)

2. Rename the file to remove the .v2_2.txt extension - name should be unraid-tunables-tester.sh

3. Copy the file onto your flash drive (I put it in the root of the flash for convenience)

4. Check to see if the file is executable by running ls -l in the install directory:

    -rwxrwxrwx  1 root root    21599 2013-08-22 12:54 unraid-tunables-tester.sh*

5. If you don't see -rwxrwxrwx (for Read Write Execute) use command chmod 777 unraid-tunables-tester.sh to make it executable
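The steps above can be sketched as one small shell function. The function name and the example paths are placeholders, not part of the utility. It uses chmod +x, which only adds the execute bit and should be enough here.

```shell
# Sketch of the install steps: copy under the final name, make executable, verify.
# Usage: install_tester DOWNLOADED_FILE DEST_DIR
install_tester() {
    local src=$1 dest_dir=$2
    local target="$dest_dir/unraid-tunables-tester.sh"

    cp "$src" "$target" || return 1      # copy and rename in one step
    chmod +x "$target"                   # add the execute bit if it is missing
    ls -l "$target"                      # confirm the permissions
}

# Example (paths are assumptions):
# install_tester /boot/unraid-tunables-tester.sh.v2_2.txt /boot
```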

 

 

RUNNING THE UTILITY:

The utility is run from the server's console, and is not accessible from the unRAID GUI.  I like to use TELNET and SCREEN to manage my console connections, but use whatever you like best.

 

You should always run this utility interactively.  It is not designed to be something you put in your go file or in a cron job.

 

To run, simply cd to the folder where you placed the file (e.g. cd /boot) then run the program (e.g. type: unraid-tunables-tester.sh, or ./unraid-tunables-tester.sh for those that like that convention).

 

Edited 08/22/2013 - Added chmod instructions

Edited 08/23/2013 - Updated to version 1.1

Edited 08/26/2013 - Updated to version 2.0

Edited 08/28/2013 - Updated to version 2.1

Edited 08/28/2013 - Updated to version 2.2

 

CONTINUED IN NEXT POST...

unraid-tunables-tester.sh.v2_2.txt

 

FULLY AUTOMATIC MODE:

Fully Automatic mode is easy to select.  At the main screen, enter Y (uppercase only) to accept the warning prompt, then type FULLAUTO or fullauto as the Test Type.  You will then have to enter Y or y on the Fully Automatic Mode warning screen.

 

At this point, the test is running.  This test will take 2.1 hours to run the FULLAUTO routine, unless your server responds well to low values, in which case the test will be extended by 12 minutes to test extra low values (below unRAID stock values).

 

Also, FULLAUTO tests with very large amounts of memory allocated to the md_* tunable parameters.  At the end of the first pass, it has allocated more than 5 times the memory that the stock values allocate.  Considering the stock values were appropriate for servers with only 512MB RAM, these amounts should be safe for servers with 4GB of RAM, but any plug-ins and add-ons you've installed will be competing for that same memory, so beware.

 

Fully Automatic mode makes 2 passes. The first pass tests md_sync_window values ranging from 512 to 2944 with a Byte Increment of 128 and a Test Length of 3 minutes (somewhere between a Normal and a Thorough Test Type).  The fastest speed is recorded and the second pass is centered on the corresponding md_sync_window with a test range of 120 (starting 120 values below the fastest md_sync_window) and a Byte Increment of 8.  The second pass has a Test Length of 4 minutes (a Thorough level test). 
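To make the first pass concrete, here is a minimal sketch that lists its test values and the time they add up to. The helper name and the example "fastest" value of 2688 are made up for illustration. The end point of the second pass is not stated above, so the sketch only shows where it would start.

```shell
# List md_sync_window test values from START to END in steps of INC.
test_values() {
    local v=$1 end=$2 inc=$3
    while [ "$v" -le "$end" ]; do
        echo "$v"
        v=$((v + inc))
    done
}

# Pass 1 as described: 512 to 2944 in steps of 128, 3 minutes per test.
pass1_count=$(test_values 512 2944 128 | wc -l)
echo "Pass 1: $((pass1_count)) tests, about $((pass1_count * 3)) minutes"

# Pass 2 start point for a hypothetical pass-1 winner of 2688.
best=2688
echo "Pass 2 starts at $((best - 120)) and steps by 8"
```

With these numbers the first pass comes to 20 tests and roughly an hour of the total run time.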

 

Interestingly enough, the FULLAUTO mode revealed to me that my server runs best with md_sync_window values around 2668, significantly higher than the stock unRAID value, and also more than double any value I had tested manually (back before I wrote this utility).  My new value is about 4MB/s faster than my old value of 1024.  I never would have discovered this improvement without the utility.  I have yet to run a Parity Check, so I can't say for sure that my times will be reduced, and I don't know about long term server stability, so I will have to report back on that in the future.

 

MANUAL MODE:

Manual Mode isn't so much a mode as just the selection of the various options you are presented with when you don't select FULLAUTO.

 

TEST TYPE:

Your first option (on the same FULLAUTO selection screen) is the Test Type.  The tests are listed in order from quickest [V for Veryfast] to slowest [E for Extreme].  These settings control how long the Parity Check is allowed to run before it is cancelled and restarted with the next set of test values. 

unRAID Tunables Tester v1.0 by Pauven

Please select what type of test you would like to perform:

     (V) Veryfast  - Tests 0.02% of your array, produces inaccurate results
     (F) Fast      - Tests 0.10% of your array, produces rough results
    *(N) Normal    - Tests 0.25% of your array, produces good results
     (T) Thorough  - Tests 1.00% of your array, produces great results
     (E) Extreme   - Tests 4.00% of your array, produces accurate results
     (FULLAUTO)    - 1.5x Length as Full Parity Check! Fantastic results
     (C) Cancel

Enter V F N T E or FULLAUTO to continue or C to cancel:

For your very first test, I would suggest using Veryfast so you can get comfortable with how the test runs. 

 

The downside with these quicker test types is that they are less accurate.  Minor server hiccups can cause the results to skew badly.  Longer tests collect more real data and squelch this noise.  Also, quicker tests simply have a smaller sample from which to extrapolate performance.  The Veryfast test stops the Parity Check at 0.02% complete, which takes only a couple of seconds on my server.  That's not much data to base decisions on, but this comes in handy for performing a quick scan of all test values to see if there is a range you want to hone in on.

 

Conversely longer tests take... longer.  Sometimes painfully so.  The longest test is the Extreme, which allows the Parity Check to get to 4% for each set of test values.  On my server, it takes about 15 minutes per test.  You get very accurate results, but you need to be picky about how many different values you test.

 

BYTE INCREMENTS:

The Byte Increment value directly affects how many individual tests are run.  The Byte Increment is the interval of values that will be tested for md_sync_window.

unRAID Tunables Tester v1.0 by Pauven

Please select what tunable value byte increment you would like to test with.

NOTE: Smaller increments will cause additional test iterations to run.
      For example, an increment of 128 will run 14 tests, while an increment of
      64 will run 27.  Each smaller increment will run double the number of
      tests as the one before it. An increment of 1 will run 1665 tests.
      Increments below 64 are not recommended, but have been made available to
      you in case Curious George is your hero and the phrase 'Curiosity Killed
      The Cat' means nothing to you.

CAUTION: You may only want test with small intervals when running a (F)ast type
         test, otherwise this test may take days...

    *(1) 128 bytes ( 14 Test Iterations)   (5)   8 bytes ( 209 Test Iterations)
     (2)  64 bytes ( 27 Test Iterations)   (6)   4 bytes ( 417 Test Iterations)
     (3)  32 bytes ( 53 Test Iterations)   (7)   2 bytes ( 833 Test Iterations)
     (4)  16 bytes (105 Test Iterations)   (8)   1 bytes (1665 Test Iterations)
     (C) Cancel

Enter 1-8 to continue or C to cancel:

For example, if you select the default Byte Increment of 128, md_sync_window values of 384, 512, 640, etc. will be tested - each new test value is 128 higher than the previous.

 

A Byte Increment of 1 will test values 384, 385, 386, etc. - each new test value is 1 higher than the previous.
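The iteration counts in the menu follow directly from the default range of 384 to 2048. A small sketch (the function name is just for illustration):

```shell
# Number of test iterations for a given range and Byte Increment.
iteration_count() {
    local start=$1 end=$2 inc=$3
    echo $(( (end - start) / inc + 1 ))
}

iteration_count 384 2048 128   # 14
iteration_count 384 2048 64    # 27
iteration_count 384 2048 1     # 1665
```

The same formula gives the count for any Start or End Position Override you pick.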

 

For your very first test, I would suggest using the default 128 so you can get comfortable with how the test runs. 

 

Remember that smaller increments mean more tests, which means longer overall testing time.  It would be unwise to combine an Extreme Test Type with a 1 Byte Increment, as that could take a few weeks to run!
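A quick way to sanity-check a test plan before starting it is to multiply the number of iterations by the time per test. The sketch below assumes about 15 minutes per Extreme test, which is what my server needs; yours will differ, and the function name is only illustrative.

```shell
# Estimate total run time: iterations x minutes per test.
estimate_runtime() {
    local tests=$1 minutes_each=$2
    local total=$((tests * minutes_each))
    echo "$((total / 1440))d $((total % 1440 / 60))h $((total % 60))m"
}

estimate_runtime 1665 15   # Extreme with a 1 Byte Increment: 17d 8h 15m
estimate_runtime 14 15     # Extreme with a 128 Byte Increment: 0d 3h 30m
```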

 

The downside to larger increments is that large ranges of values go untested, and one of those values may be the sweet spot for your server.

 

I like to try to zero in by first running a Fast or Normal Test Type with a medium large increment, like 64.  Looking at the results, I might see a smaller range I want to test further, so I might run a Thorough or Extreme test with a smaller increment, but only over a smaller range.

 

START POSITION OVERRIDE:

By default, the test is designed to start at a md_sync_window value of 384 bytes (the unRAID stock value).  This is fine for quicker Test Types and larger Byte Increments, but once you've run your preliminary tests you might want to zoom in on a particular value range.  For example, my server responded very well to values around 1280, so I might set the Start Position Override to 1152, skipping over all the test values from 384 to 1151.

Would you like to override the STARTING position of this test?

This is helpful if you have run previous tests at faster speeds and larger
byte increments, and you would now like to hone in on a smaller test range.

The default starting position is the unRAID stock md_sync_window of 384 bytes.

    *(N) 384 bytes    (5) 1024 bytes    (10) 1664 bytes    (15) 2304 bytes
     (1) 512 bytes    (6) 1152 bytes    (11) 1792 bytes    (16) 2432 bytes
     (2) 640 bytes    (7) 1280 bytes    (12) 1920 bytes    (17) 2560 bytes
     (3) 768 bytes    (8) 1408 bytes    (13) 2048 bytes    (18) 2688 bytes
     (4) 896 bytes    (9) 1536 bytes    (14) 2176 bytes    (19) 2816 bytes

     (C) Cancel

Enter N or 1-14 to continue, or C to cancel:

 

END POSITION OVERRIDE:

By default, the test is designed to end at a md_sync_window value of 2048 bytes (a somewhat arbitrary value).  This is fine for quicker Test Types and larger Byte Increments, but once you've run your preliminary tests you might want to zoom in on a particular value range.  For example, my server responded very well to values around 1280, so I might set the End Position Override to 1408, skipping all the test values beyond that point.

Would you like to override the ENDING position of this test?

This is helpful if you have run previous tests at faster speeds and larger
byte increments, and you would now like to hone in on a smaller test range.

The default ending position of this test is 2048 bytes.

The value you chose must be greater than or equal to 384 bytes.

    *(N) 2048 bytes   (5) 1024 bytes    (10) 1664 bytes    (16) 2432 bytes
     (1) 512 bytes    (6) 1152 bytes    (11) 1792 bytes    (17) 2560 bytes
     (2) 640 bytes    (7) 1280 bytes    (12) 1920 bytes    (18) 2688 bytes
     (3) 768 bytes    (8) 1408 bytes    (14) 2176 bytes    (19) 2816 bytes
     (4) 896 bytes    (9) 1536 bytes    (15) 2304 bytes    (20) 2944 bytes

     (C) Cancel

Enter N or 1-14 to continue, or C to cancel:

Combined with my Start Position Override, I've now focused my tests on a much smaller range of values from 1152-1408.  I can now increase my Test Type to a longer test, and/or lower my Byte Increment to a smaller interval to hit more test points.

 

One other use of the End Position Override is to test values beyond 2048.  I've provided options all the way up to 2944 (again, a somewhat arbitrary number but we're getting pretty big and silly at that point).  If there's a need for higher values, I'll consider adding them in the future, but for now I think this is a safe limit.

 

 

MONITORING THE TEST RUN:

Some of these tests can take a long time to run, especially in Extreme mode.  I couldn't imagine waiting 15 minutes for a status update, I would go bonkers!  So I designed the GUI to update every second.  As each test progresses, you can see the current StopWatch elapsed time for the test, as well as the current position in the Parity Check (the same position data you would see in the unRAID GUI).  For the previously completed tests, you can see the tested md_sync_window value, the test duration time, and the calculated MB/s.

 

SAMPLE OF MY SCREEN WHILE RUNNING PASS 2 IN A FULLAUTO TEST:

Test 79 With md_sync_window=2728 Completed in 412.109 seconds at 138.9 MB/s
Test 80 With md_sync_window=2736 Completed in 412.201 seconds at 138.8 MB/s
Test 81 With md_sync_window=2744 Completed in 412.200 seconds at 138.8 MB/s
Test 82 With md_sync_window=2752 Completed in 412.192 seconds at 138.8 MB/s
Test 83 With md_sync_window=2760 Completed in 412.134 seconds at 138.9 MB/s
Test 84 With md_sync_window=2768 Completed in 412.128 seconds at 138.9 MB/s
Test 85 With md_sync_window=2776 Completed in 412.204 seconds at 138.8 MB/s
Test 86 With md_sync_window=2784 Completed in 412.136 seconds at 138.9 MB/s
Test 87 With md_sync_window=2792 Completed in 413.145 seconds at 138.5 MB/s
Test 88 With md_sync_window=2800 Completed in 412.209 seconds at 138.8 MB/s
Test Range Entered - Stopwatch: 342.02s - Current Position: 49288492
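For anyone curious how a MB/s figure can be derived from the position and stopwatch values on the status line, here is a minimal sketch. It assumes the position counter is in 1 KiB blocks, which is not confirmed anywhere in this post, and it is not the utility's own calculation.

```shell
# Rough MB/s from a parity check position and elapsed seconds.
# Assumption: the position counter is in 1 KiB blocks.
calc_speed() {
    awk -v pos="$1" -v secs="$2" 'BEGIN { printf "%.1f MB/s\n", pos / 1024 / secs }'
}

calc_speed 49288492 342.02
```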

 

ABORTING A TEST RUN:

If you need to stop a test run for any reason, the easiest way is to simply press the keyboard dynamic duo CTRL-C (which means cancel here, not copy).

 

Keep in mind that this cancels just the utility program, nothing else.  If the utility was currently running a Parity Check, which is very likely, that was not cancelled.  You could cancel the Parity Check the normal way through the GUI, or type /root/mdcmd nocheck at a command prompt.
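If you would rather not cancel the leftover Parity Check by hand, a small wrapper along these lines might do it for you. This is a sketch, not something shipped with the utility: the function names and the MDCMD override variable are made up, and only the /root/mdcmd nocheck command comes from the paragraph above.

```shell
# Cancel any Parity Check that is still running.
# MDCMD can be overridden; /root/mdcmd is the path mentioned above.
cancel_check() {
    "${MDCMD:-/root/mdcmd}" nocheck
}

# Run the tester; if CTRL-C interrupts it, cancel the Parity Check as well.
run_tester() {
    trap 'cancel_check' INT
    ./unraid-tunables-tester.sh
    trap - INT
}
```

Call run_tester from the directory that holds the script. The tunable values still stay at whatever was being tested when you interrupted it.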

 

REVIEWING TEST RESULTS:

After the test run is complete, detailed test results are written to TunablesReport.txt.  Summary test results are presented in the console window, and you are also able to press Y to view a copy of the TunablesReport.txt file right in the console.  The TunablesReport.txt file lives wherever you copied the unraid-tunables-tester.sh utility.  If you need to save any results before running another test, you should rename or move this file, otherwise it will be overwritten by the new test run.
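One way to keep old results is to make a timestamped copy of the report before each new run. A minimal sketch, where the function name and the naming pattern are just suggestions:

```shell
# Keep a timestamped copy of the report so the next run cannot overwrite it.
# Usage: archive_report [DIRECTORY]   (defaults to the current directory)
archive_report() {
    local dir=${1:-.}
    local src="$dir/TunablesReport.txt"
    local dest="$dir/TunablesReport-$(date +%Y%m%d-%H%M%S).txt"

    [ -f "$src" ] || { echo "No report found in $dir" >&2; return 1; }
    cp "$src" "$dest" && echo "$dest"
}

# Example (assuming the utility lives in /boot):
# archive_report /boot
```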

 

Tunables Report from  unRAID Tunables Tester v1.0 by Pauven

NOTE: Use the smallest set of values that produce good results. Larger values
      increase server memory use, and may cause stability issues with unRAID,
      especially if you have any add-ons or plug-ins installed.

Test | num_stripes | write_limit | sync_window |    Time   |    Speed 
--- FULLY AUTOMATIC TEST PASS 1 (Rough - 73 Sample Points @ 0.8% Duration)---
   1  |    1408     |     768     |     512     |  222.345s |  103.0 MB/s 
   2  |    1440     |     768     |     544     |  208.967s |  109.5 MB/s 
   3  |    1472     |     768     |     576     |  195.582s |  117.0 MB/s 
   4  |    1504     |     768     |     608     |  188.421s |  121.5 MB/s 
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
  55  |     2833    |    1275     |    1275     | 1765.231s |  135.0 MB/s 
  56  |     2835    |    1276     |    1276     | 1766.537s |  135.0 MB/s 
  57  |     2837    |    1277     |    1277     | 1764.631s |  135.1 MB/s 

Completed: 7 Hrs 51 Min 36 Sec.

Recommended values for your server came from Test # 57 with a time of 1764.631s:

     Tunable (md_num_stripes): 2837
     Tunable (md_write_limit): 1277
     Tunable (md_sync_window): 1277

In unRAID, go to Settings > Disk Settings to set your chosen parameter values.
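If you want to pull the fastest row out of a saved report yourself, something like the following should work. It assumes the pipe-separated row layout shown in the sample above, and the function name is illustrative. On a tie it keeps the first (smallest) set of values, in line with the note at the top of the report.

```shell
# Print the report row with the highest MB/s.
# Usage: fastest_row REPORT_FILE
fastest_row() {
    awk -F'|' '
        $1 ~ /^ *[0-9]+ *$/ {          # only numbered result rows
            speed = $6 + 0             # leading number of "  103.0 MB/s "
            if (speed > best) { best = speed; row = $0 }
        }
        END { if (row != "") print row }
    ' "$1"
}

# Example:
# fastest_row TunablesReport.txt
```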

 

Edited 08/29/2013 - Updated the FULLAUTO test description to reflect v2.2.


Ok, I'm a bit confused.  Granted I have not run the test and that I skimmed through the posts above, I gather it is only tuning for max throughput of a parity check with no regard to the effect on anything else happening with the disk subsystem.  Is it monitoring the IO waits?  How does it simulate external read/write requests while doing the parity check?  The default config is obviously not set up for maximum speed of parity calculations and it is done that way on purpose so that the disk IO queue is not too deep to create excessive delays in responding to external read/write requests during the parity calc.  By maximizing parity calculation speed you sacrifice usability during the parity process.


Ok, I'm a bit confused. 

Yes.

Granted I have not run the test and that I skimmed through the posts above, I gather it is only tuning for max throughput of a parity check with no regard to the effect on anything else happening with the disk subsystem.  Is it monitoring the IO waits?  How does it simulate external read/write requests while doing the parity check?  The default config is obviously not set up for maximum speed of parity calculations and it is done that way on purpose so that the disk IO queue is not too deep to create excessive delays in responding to external read/write requests during the parity calc.  By maximizing parity calculation speed you sacrifice usability during the parity process.

No.  To all of it.



That certainly clears things up, thank you.

 

http://lime-technology.com/forum/index.php?topic=4625.msg42091#msg42091


Very nicely done Paul =>  I'm going to run it on both of my systems just to see how well I did on setting the tunables.

 


noob question on installation step 4 you say Make sure the file is executable, use chmod to change permissions if necessary how does one do this, may be worth putting in this the instructions?


noob question on installation step 4 you say Make sure the file is executable, use chmod to change permissions if necessary how does one do this, may be worth putting in this the instructions?

 

I knew someone would ask...

 

I had to go look it up myself.  I only chmod once in a blue moon.  Luckily when I copy files onto my flash over the network they seem to already have the correct permissions, so that probably holds true for most unRAID users.

 

I updated the instructions to clarify.



Hehe

I'm a total Linux noob but i have learnt so much playing with unraid and the command line, thanks for the update.


noob question on installation step 4 you say Make sure the file is executable, use chmod to change permissions if necessary how does one do this, may be worth putting in this the instructions?

 

chmod +x filename


you should make it a package of some sort

 

for which package manager? webGui? unmenu? SF?. ...


webGUI.

 

There will be only one as the song goes :)

 

you should make it a package of some sort (people took this serious)

 

Certainly as the vast majority of unraid users dont spend much time at the command line.


webGUI.

 

There will be only one as the song goes :)

 

doubtful... I don't see unmenu's solution going away anytime soon. Then the newcomer boxcar has potential...


are we supposed to run this with all our plugins running or on a clean unraid install, aka safe mode?

 

 


are we supposed to run this with all our plugins running or on a clean unraid install, aka safe mode?

 

Thats a little open-ended.. if the plugin installed isnt doing anything..then it should be fine. If the plugin is going to start generating cpu cycles/writing data.. then yes you prob dont want to be using them as it could skew the results.

 

So, to make sure you limit the outside variables you prob should just run this in safe mode and with the array stopped.

 

Pauven prob can answer best or confirm my statement.

 


Well crap!  The workstation I ran this from blue-screened last night and this didnt finish.  I'll run it again on the console next week. I need Plex up for the next few days.


I think we are putting the cart before the horse with any talk of making this utility a package or plug-in.  I haven't even seen reports of positive results by using the utility yet.

 

are we supposed to run this with all our plugins running or on a clean unraid install, aka safe mode?

 

I advise safe mode, though I've been testing with unMenu, cachedirs, screen, ups, and a few others without issues.  I've also tested some very high values (higher than what I've released in v1.0 of the utility) without issue.

 

The situation that I am concerned about is that you have a plug-in that is writing data to your array (or you are writing data to your array manually) and the server crashes due to an out of memory condition which resulted from the very high memory values being tested.  This could result in some data lost (probably limited to what was being written).

 

This test is accomplishing two goals:  primarily finding the set of values that produces unhindered performance; but secondly helping to identify any limits to how high these values can go (by causing out of memory errors) - each server may be different.  Everyone needs to remember it isn't exactly safe to be writing data in the middle of a test - not to mention that inconsistent background reads/writes may skew the results.

 

So the best advice is to avoid allowing anything to write data to your array during these tests.  After the tests are done and you've selected your new values, I still recommend caution, especially if the new values are significantly higher than unRAID stock values.  I would start off with lots of reads while running a non-correcting parity check - basically try to crash your server by doing a lot of things that read simultaneously.  If that goes well, introduce some writes into the equation.
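A simple way to generate that kind of simultaneous read load is to read every file on several disks at once and throw the data away. The sketch below is only one possible approach: the function name is made up and the /mnt/disk paths in the example are assumed mount points. It only reads, never writes.

```shell
# Read every file under the given directories in parallel, discarding the data.
# Usage: parallel_read_load DIR [DIR...]
parallel_read_load() {
    local dir
    for dir in "$@"; do
        # One background reader per directory.
        find "$dir" -type f -exec cat {} + > /dev/null 2>&1 &
    done
    wait    # block until every reader has finished
    echo "read load finished for $# directories"
}

# Example (assumed unRAID disk mount points), run while a
# non-correcting parity check is in progress:
# parallel_read_load /mnt/disk1 /mnt/disk2 /mnt/disk3
```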

 

Ultimately you are responsible for your server's stability.

 

So, to make sure you limit the outside variables, you probably should just run this in safe mode and with the array stopped.

 

The array has to be started.  The utility is running a whole bunch of Parity Checks, after all.  Can't do that with the array stopped.

 

-Paul

 


Well crap!  The workstation I ran this from blue-screened last night and this didn't finish.  I'll run it again on the console next week. I need Plex up for the next few days.

 

Hopefully you mean your desktop crashed and not your unRAID server, right?  If you lost your connection during the test, that probably means two things:  a Parity Check was still running (feel free to cancel it) and the last set of tested values are still in use.

 

You can still see the accumulated results in the TunablesReport.txt file, as it is written to as the test progresses.  This will also give you a clue as to what set of values was being tested.  If you want to get back to your normal values, there are multiple ways, but the easiest is probably just to restart your server.

 

I highly recommend using screen, especially if you are using a remote connection like Telnet.  After you log onto the server, you run screen, and then you can run one or more console windows through screen.  If you get disconnected for any reason, you telnet back onto the server and run screen -r to reconnect.  Everything will have kept running while you were disconnected.

 

I use screen when doing pre-clears.  I'll telnet to the server, open up several screen console windows, start up multiple pre-clears, then close my telnet connection.  I then monitor everything through unMenu's MyMain status page, which shows pre-clear progress.

 

-Paul

basically try to crash your server by doing a lot of things that read simultaneously.  If that goes well, introduce some writes into the equation.

 

When expanding these values to very high limits, I found regular reads/writes alone would not create an OOM condition.  I would have to do that while running a full "deep" find / -ls >/dev/null down the whole array.  It depends on how many files you have on the filesystem.  cache_dirs will also help reveal a problem if you have a large number of files.


Well crap!  The workstation I ran this from blue-screened last night and this didn't finish.  I'll run it again on the console next week. I need Plex up for the next few days.

 

Hopefully you mean your desktop crashed and not your unRAID server, right?  If you lost your connection during the test, that probably means two things:  a Parity Check was still running (feel free to cancel it) and the last set of tested values are still in use.

 

You can still see the accumulated results in the TunablesReport.txt file, as it is written to as the test progresses.  This will also give you a clue as to what set of values was being tested.  If you want to get back to your normal values, there are multiple ways, but the easiest is probably just to restart your server.

 

I highly recommend using screen, especially if you are using a remote connection like Telnet.  After you log onto the server, you run screen, and then you can run one or more console windows through screen.  If you get disconnected for any reason, you telnet back onto the server and run screen -r to reconnect.  Everything will have kept running while you were disconnected.

 

I use screen when doing pre-clears.  I'll telnet to the server, open up several screen console windows, start up multiple pre-clears, then close my telnet connection.  I then monitor everything through unMenu's MyMain status page, which shows pre-clear progress.

 

-Paul

 

Yes, my Windows 7 workstation crashed. It's never blue-screened before.  Looks like I have some work to do this weekend to see why.

 

I usually run pre-clears with screen, but I figured this would take no more than 18 hours, and much of it would have been overnight.  I went ahead and let the parity check continue to run.  It's almost done and it appears to be running much faster than before.  I will know for sure when it's done and I can calculate the speed.

 

I will run this again, probably on Sunday night.


Can this script check how much memory is free and back out of what it's doing if memory is coming dangerously close to running out?


I have updated the utility to version 1.1 (see the main post) and I highly recommend upgrading to this new version.

 

New Features in v1.1:

  • I found an issue where some results would have  a 1 second variance in the reported time, leading to false measurements.  This was caused by a sleep statement that I had in the code which was sleeping 1 second between samples.  I decreased the sleep time during the test run to 0.001 seconds (1 millisecond) so the accuracy is 1000x higher.  This has dramatically improved the quality of the results.
  • Originally I was choosing the recommended result by which test had the lowest time.  The problem with this approach is that a variance of 1 ms was enough to cause bigger values to be recommended even though there was no real-world benefit.  I changed my logic to compare the MB/s result instead of the elapsed time, choosing the lowest test number that has the highest speed, which should now represent the best possible value.
  • After the test is run, I now report how much memory the recommended values will consume on your server.  This is a complex calculation that takes into account the number of stripes, the highest drive assigned to your array (not the number of drives) and the memory footprint of each stripe.
  • I added the option to go ahead and apply the recommended settings to your server. This is a non-permanent change, as a reboot will go back to your normally configured values.
  • I also added the option to write the recommended settings to your disk.cfg file. This makes the change permanent, so it will apply after reboot.
  • I extended the FULLAUTO test range all the way up to 2944.  I have seen interesting results in the 2600-2800 range, so I figured 'why not!'.
  • I added a (M)ANUAL OVERRIDE option on the Start/End Override screens. I did this because I interrupted a FULLAUTO test in the third pass, and wanted a way to manually restart where it left off.  Also, I allowed manually entered values up to 5888 for anyone that is crazy enough to try it.  2944 is as high as I've tried, so I have no idea if those super high values are okay to play with.
  • I polished the formatting of the data that is produced, both during the test and in the TunablesReport.txt file.
  • I also made a few minor tweaks to the user interface.
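
The "lowest test number that has the highest speed" rule from the list above can be sketched in a few lines of shell.  This is only an illustration, not the code from the utility: the function name pick_best_test and the three-column input format (test number, md_sync_window, MB/s) are assumptions made up for the example, and the input is assumed to be sorted by test number.

```shell
#!/bin/bash
# Hypothetical helper, NOT taken from unraid-tunables-tester.sh.
# Input (stdin): one line per test, sorted by test number:
#   <test_number> <md_sync_window> <speed_in_MB_per_s>
# Output: the first line that reaches the highest speed, so ties go to
# the lowest test number (and therefore the smallest values).
pick_best_test() {
    awk 'NR == 1 || ($3 + 0) > best_speed { best_speed = $3 + 0; best_line = $0 }
         END { if (NR > 0) print best_line }'
}

# Example with made-up numbers: tests 2 and 3 tie on speed.
printf '1 384 95.2\n2 512 101.7\n3 640 101.7\n4 768 99.0\n' | pick_best_test
```

Because the comparison is strictly "greater than", a later test that only equals the best speed never replaces an earlier one, which is what keeps larger values from being recommended without a real gain.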

 

You can download the file from post #1:  http://lime-technology.com/forum/index.php?topic=29009.msg259087#msg259087


Can this script check how much memory is free and back out of what it's doing if memory is coming dangerously close to running out?

 

While I like that idea, I'm not sure it is feasible.

 

When I was originally performing md_* tuning, I intentionally tried to get my server to run out of memory.  I cranked up the tunables to high values, ran a non-correcting parity check, pre-cleared multiple drives at the same time, and read/streamed multiple files from the server all at once.

 

The problem was that even though memory got very low, and the server got very slow, it never actually crashed.  Credit where credit is due, unRAID is pretty solid.

 

While I've certainly seen many reports of people complaining about Kernel OOPS and Out of Memory issues when increasing the md_* tunables, from my observation every person who complained was also running one or more plug-ins like Plex and SF.

 

So the problems with trying to identify that the server is running out of memory are that 1) I've seen what I thought was dangerously low, and it wasn't actually dangerous, and 2) Plug-in developers need to be responsible for their own apps and how they utilize memory.  There are too many apps for me to try and figure out how much memory each one needs to run without crashing.
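
For anyone who wants to experiment anyway, reading the free memory is the easy part; picking the threshold is the hard part, for the reasons above.  Here is a minimal sketch, assuming that MemFree + Buffers + Cached from /proc/meminfo is a reasonable stand-in for "memory that is still available".  The function names and the MIN_FREE_MB value are hypothetical and are not part of the utility.

```shell
#!/bin/bash
# Hypothetical helpers, NOT part of unraid-tunables-tester.sh.

# Reads /proc/meminfo-style text on stdin and prints
# MemFree + Buffers + Cached, converted from kB to whole MB.
free_mem_mb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { kb += $2 }
         END { printf "%d\n", kb / 1024 }'
}

# Arbitrary illustrative threshold; there is no known "safe" number.
MIN_FREE_MB=${MIN_FREE_MB:-256}

# Succeeds (exit status 0) when the meminfo text on stdin shows less
# than MIN_FREE_MB of free memory.
memory_is_low() {
    local free_mb
    free_mb=$(free_mem_mb)
    [ "$free_mb" -lt "$MIN_FREE_MB" ]
}

# Usage: a test loop could run this check between samples.
if [ -r /proc/meminfo ] && memory_is_low < /proc/meminfo; then
    echo "Free memory is below ${MIN_FREE_MB} MB; a test loop could stop here."
fi
```

A check like this only knows how much memory is free right now.  It cannot know how much a plug-in is about to ask for, which is the real problem.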

 

I just added a feature to v1.1 (now available!) that reports how much memory the recommended settings will consume.  The amount of memory consumption really isn't that bad, considering how much memory is available.  Here are some examples:

 

Stock (md_num_stripes=1280) with  3 drives:   15 MB
Stock (md_num_stripes=1280) with  7 drives:   35 MB
Stock (md_num_stripes=1280) with 24 drives:  120 MB

My Server:
Tuned (md_num_stripes=5968) with 24 drives:  560 MB

 

Since my server has 4GB, I don't really see any problem with allocating an extra 440MB (560MB tuned versus 120MB stock) to unRAID for maximizing performance.  Since unRAID is sized out of the box for servers with 512MB of memory, giving it an extra 440MB still leaves a good 3GB of RAM for plug-ins and add-ons.

 

And once unRAID goes 64 bit, we won't even have to think about this anymore.
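
The memory figures in this post are all consistent with roughly 4096 bytes per stripe per drive slot.  That is an inference from the numbers, not the actual calculation in the utility, so treat the sketch below as a rough estimate only.  The function name is hypothetical, and "slots" here means the highest drive slot assigned in the array.

```shell
#!/bin/bash
# Hypothetical helper, NOT the calculation used by unraid-tunables-tester.sh.
# ASSUMPTION: each stripe uses 4096 bytes per drive slot (inferred from the
# example figures: 1280 stripes x 3 drives = 15 MB).
estimate_md_memory_mb() {
    local stripes=$1 slots=$2
    local bytes_per_stripe_per_slot=4096
    # Adding half a megabyte before dividing rounds to the nearest MB.
    echo $(( (stripes * slots * bytes_per_stripe_per_slot + 524288) / 1048576 ))
}

estimate_md_memory_mb 1280 3     # stock values, 3 drives
estimate_md_memory_mb 1280 24    # stock values, 24 drives
estimate_md_memory_mb 5968 24    # tuned values, 24 drives
```

Under that assumption the three calls print 15, 120 and 560, matching the figures listed above.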


Copyright © 2005-2017 Lime Technology, Inc. unRAID® is a registered trademark of Lime Technology, Inc.