brainbone

unraid_notify 2.55 [01-01-2010]: Email notifications for unRAID status


Attached is a modified version of the excellent email notification script from kenshin and Joe L. on this thread

 

NOTE:  If you are using unraid_notify with Unraid 4.5beta-12 or greater, you must update to unraid_notify 2.50 or greater.

 

Features:

 

- Sends optional periodic email notification containing SMART health status, disk temperature,  and the contents of /proc/mdcmd

 

- Sends immediate email notification on error condition (array fault, drive too hot, failed SMART health)

 

- Runs user definable command (like powerdown) if disk temperature crosses a user definable threshold.

 

- Uses SMTP AUTH LOGIN when sending emails.

 

- Supports SMTPS / SSL / Secure SMTP if socat package is installed. (Allows use of Gmail, etc.)

 

- Now contains "mail" command. (see: this thread for details)

 

- No longer spins down disks (lets unRAID handle this instead)
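As a rough illustration of the temperature-threshold feature listed above, the check comes down to comparing a reading against a limit and running a command. The function name `run_if_too_hot` and its arguments are hypothetical and not taken from unraid_notify itself. Treating "reaching the limit" as too hot is also an assumption.

```shell
# Hypothetical sketch, not unraid_notify's own code.
# Usage: run_if_too_hot <temp_celsius> <limit_celsius> <command> [args...]
run_if_too_hot() {
    temp=$1
    limit=$2
    shift 2
    # Assumption: a reading equal to the limit already counts as too hot.
    if [ "$temp" -ge "$limit" ]; then
        "$@"
    fi
}

# Example: print a warning instead of actually calling powerdown.
run_if_too_hot 55 50 echo "disk too hot, would run powerdown now"
```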

 

 

Change Log:

 

# Changes: [2.55] 12-31-2009, (unRAID user: stchas):
# - Updated "mail" command to v0.4, adding ErrorRcptTo as alternate RcptTo;
#   made misc bug fixes to correctly enable socat/SSL logic, fixing
#   the same issue 2.54 fixed in unraid_notify.
#
# Changes: [2.54] 12-28-2009, (unRAID user: brainbone):
# - Corrected issue with SSL/TLS (secure SMTP) introduced in 2.52.
#   If you had trouble with gmail or others using secure smtp (socat),
#   it should be resolved in 2.54.
#
# Changes: [2.53] 12-17-2009, (unRAID user: brainbone):
# - Added /root/.forward support for mail command, allowing
#   mail sent to "root" to be forwarded to a different address
#
# Changes: [2.52] 12-16-2009, (unRAID user: brainbone):
# - bashmail changed to now use CRLF instead of LF for line termination
#   making bashmail/unraid_notify compatible with many more
#   mail servers.
#
# Changes: [2.50] 11-18-2009, (unRAID user: brainbone):
# - Back-ported changes by jbuszkie (2.31 & 2.32)
# - Spin-down function disabled by default (unRAID handles this now)
# - Changed how config file is read (copy to ram disk)
#
# Changes: [2.40] 01-27-2009, (unRAID user: brainbone):
# - Further fix of multiple recipients bug
# - Fixed tab characters not showing up in sent mail
#
# Changes: [2.32] 05-01-2009, (unRAID user: jbuszkie):
# - Added support for cache disks
#
# Changes: [2.31] 03-04-2009, (unRAID user: jbuszkie):
# - Added support to not send status if one or more disks are spun down
# - If set to true, the status report will only be sent if one or more disks
#   are spun up.  If all disks are spun down, no report will be sent.
# - New variable added to cfg - NoReportIfAllSpunDown
#
# Changes: [2.30] 01-21-2009, (unRAID user: brainbone):
# - Changed bashmail to use EHLO instead of HELO to fix compatibility
#   problem with earthlink
# - Fixed generation of "To:" field in header to address compliance with
#   RFC2822
# - Fixed bug when using multiple recipients (more than one would not
#   work)
#
# Changes: [2.20] 10-01-2008, (unRAID user: brainbone):
# - Added support for SSL (secure smtp)
#   Note: SSL requires socat
# - Released under GPL
#
# Changes: [2.11] 09-21-2008, (unRAID user: brainbone):
# - Changed the field unraid_notify uses for getting disk
#   temperature from smartctl to fix incorrect disk temp 
#   being reported on non-seagate drives. 
#   (thanks to Joe L. & olympia)
#
# Changes: [2.10] 09-18-2008, (unRAID user: brainbone):
# - Added "SpinDownTime" in unraid_notify.cfg
# - unraid_notify will now monitor disk activity using vmstat -d
#   and spin down disks after n min. of inactivity.
#   This is needed because the default mechanism unRAID uses for
#   spin down is defeated by the way unraid_notify scans for
#   disk status
#
# Changes: [2.00] 09-18-2008, (unRAID user: brainbone):
# - Removed need for socat or netcat (thanks to WeeboTech)
# - added "bashmail" (a replacement for my "socatmail" in 1.00)
# - Added Smart Health Check (more thanks to WeeboTech)
#
# Changes: [1.00] 09-11-2008, (unRAID user: brainbone):
# - Changed script name to "unraid_notify"
# - TCP communications moved to socat instead of netcat allowing for more
#   robust communication with the mail server.
# - Added SMTP AUTH (AUTH LOGIN)
# - Moved parameters to /boot/config/unraid_notify.cfg, added some
#   additional user configurable parameters (see unraid_notify.cfg)
# - Changed email message header to be more RFC 2822 compliant
#   adding greater compatibility with MUAs
# - Moved SMTP communications to socatmail
# - Changed SMTP envelope communications to be RFC 2822 compliant
#   adding greater compatibility with more strict SMTP servers
# - Added checking and reporting of disk temperature
# - Script now runs more like a daemon (endless loop) instead of
#   running from crontab.  This allows better granularity for scanning
#   operations, and faster response to error conditions.
#   "unraid_notify start" will start this mode.
# - Execute external command based on disk temp. threshold
# - Built slackware package to ease installation and distribution

 

Instructions:

 

unraid_notify 2.55
==================
To use/install unraid_notify:

1) If you will be using a Secure SMTP (SSL), like smtp.gmail.com,
  download the following package:

  http://repository.slacky.eu/slackware-12.1/utilities/socat/1.7.0.0/socat-1.7.0.0-i486-2bj.tgz

2) Place the following attached files on your unraid flash drive:

  unraid_notify-2.55-noarch-unRAID.tgz (the file included in the unraid_notify .zip)
  socat-1.7.0.0-i486-2bj.tgz (if you need Secure SMTP support)

  The recommended location for these files is in /package
  If the directory "package" does not exist in the root of your flash drive, create it.

3) Edit the attached unraid_notify.cfg file and copy to the "config" folder on your flash drive.

4) Add the following to your config/go script:

  installpkg /boot/package/socat-1.7.0.0-i486-2bj.tgz (if you will be using Secure SMTP)
  installpkg /boot/package/unraid_notify-2.55-noarch-unRAID.tgz
  unraid_notify start

  (Note:  Replace "/boot/package/" with the actual path to where you put the files in
   step 2.  If you put them in the root of your flash drive, use "/boot/".  If
   you put them in a folder called "package", use "/boot/package/".)

5) Restart your unraid server, or type "unraid_notify start"
  at your unraid telnet prompt to start unraid_notify.

6) If you are having trouble with unraid_notify, telnet into
  unraid and try "unraid_notify -d" to debug the smtp session.
  Double check your settings in unraid_notify.cfg


mail v0.4 (thanks to "mikep" and "jbuszkie")
============================================
To use "mail" (now included in the unraid_notify package):

Examples:

 cat filename.txt | mail
 echo 'this is a test' | mail
 echo 'this is a test' | mail -s 'ATTENTION'
 echo 'this is a test' | mail -s 'ATTENTION' someone@domain.com


Commands:

 -s --subject : Subject of mail
 --rcpt : recipient
 --help

 

FAQ:

 

Q: When I download socat with internet explorer, it always has a ".gz" extension, not a ".tgz" one.

 

A: Use a different browser, or download socat directly on your unraid server in a telnet session using:

wget -P /boot/package http://repository.slacky.eu/slackware-12.1/utilities/socat/1.7.0.0/socat-1.7.0.0-i486-2bj.tgz

 

Q: How do I configure unraid_notify to only send emails when there is an error condition?

 

A: In unraid_notify.cfg, comment out the "RcptTo = " line (add a "#" in front, so it reads "# RcptTo" ...).  Enter any email addresses that should receive error notifications on the "ErrorRcptTo" line.
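If you would rather not edit the file by hand, something along these lines could preview the change. This is only a sketch: the function name `disable_status_mail` is made up, the "Key = value" layout is assumed from the "RcptTo = " wording above, and the output goes to the screen so nothing on the flash drive is modified.

```shell
# Hypothetical helper: print a config file with an active "RcptTo = ..." line
# commented out. "ErrorRcptTo" is left alone because the match is anchored
# to the start of the line.
disable_status_mail() {
    sed 's/^\([[:space:]]*\)RcptTo\([[:space:]]*=\)/\1# RcptTo\2/' "$1"
}

# Example (review the output before copying it over the real file):
#   disable_status_mail /boot/config/unraid_notify.cfg
```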

 

Q: After setting up unraid_notify, I use "unraid_notify -d" to check why things aren't working, and I get an "AUTH not available".  What can I do?

 

A: Your mail server likely doesn't require a user name and password.  Leave the SMTP user name and password blank in unraid_notify.cfg.

 

Q: How do I get unraid_notify to make Belgian waffles?

 

A: Waffle production is not supported at this time.

 

 

Notes:

 

- Use at your own risk.

 

- Many thanks to Joe L., kenshin and jbuszkie for their contributions, and the code I lifted from them.

 


Cool!!

 

Do you do a smart health check too?

 

What I would like to see is removal of the netcat or socat dependency.

 

Bash is somewhat capable of network I/O internally.

 

See the following scriptlet.

(Although it does not work well against emhttp, so try connecting it somewhere else. I had to.)

 

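#!/bin/bash
# Fetch a page over a raw TCP connection using bash's /dev/tcp redirection.

host=192.168.1.253   # target web server (use 127.0.0.1 for the local machine)
port=80

# Open file descriptor 3 for reading and writing on the TCP socket.
exec 3<> "/dev/tcp/${host}/${port}"

# Send a minimal HTTP/1.0 request, then the blank line that ends the headers.
echo "GET /index.html HTTP/1.0" 1>&3
echo 1>&3

# Print the response line by line until the server closes the connection.
while read -r 0<&3; do echo "$REPLY"; done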

 

See link to see where I got this from

http://shudder.daemonette.org/source/BashNP-Guide.txt

 

and more

http://blogmag.net/blog/read/49/Network_programing_with_bash


brainbone

 

Nice work.  I'll install and try it out later today when I get a moment free.  I like the idea of the config file a lot.

 

If it allows connection to SMTP through a secure port, it will help those with only "secure" mail servers.

 

One question... for SMTP AUTH communications, is the config file populated with cleartext, or base64 encoded text for login and password?

 

Only issue I can think of that might arise from running as a daemon vs. running from cron would be if there was a memory leak in bash itself.  That is pretty unlikely, but you never know, so keep an eye open on the process size in the "ps" command.

 

WeeboTech,

I did not know bash was capable of network I/O too.  Cool...  I can have fun with that.  ;)

 

Joe L.


If it allows connection to SMTP through a secure port, it will help those with only "secure" mail servers.

I believe this would require OpenSSL and stunnel, so at this point, no.  Something I would like to do though.  At the very least, I'm sure a number of gmail users would appreciate it.

 

One question... for SMTP AUTH communications, is the config file populated with cleartext, or base64 encoded text for login and password?

The username/password in the config file is clear text.  (Base64 conversion happens while communicating with the smtp server)
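To make the base64 step concrete, here is a small sketch of what a client sends for AUTH LOGIN. It is not the code from the script; the function name `auth_login_lines` and the sample credentials are made up, and it relies only on the `base64` command.

```shell
# Hypothetical helper: print the three lines a client sends for AUTH LOGIN.
# In a real session each line is sent only after the server's 334 prompt.
auth_login_lines() {
    user=$1
    pass=$2
    printf 'AUTH LOGIN\r\n'
    # tr strips the newline base64 appends (and any line wrapping).
    printf '%s\r\n' "$(printf '%s' "$user" | base64 | tr -d '\n')"
    printf '%s\r\n' "$(printf '%s' "$pass" | base64 | tr -d '\n')"
}

# Example with throwaway credentials; CRs removed for display.
auth_login_lines 'user' 'pass' | tr -d '\r'
# AUTH LOGIN
# dXNlcg==
# cGFzcw==
```

Keep in mind base64 is an encoding, not encryption, so without SSL the password is effectively sent in the clear.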

 

Only issue I can think of that might arise from running as a daemon vs. running from cron would be if there was a memory leak in bash itself.  That is pretty unlikely, but you never know, so keep an eye open on the process size in the "ps" command.

Hopefully this won't be a problem.   I suppose moving to a 1 min cron wouldn't be too difficult if necessary.

 

Do you do a smart health check too?

Only checking temp. at this point, using:

smartctl -d ata -A /dev/x

and grabbing the 4th field from the "Temperature_Celsius" record.

I suppose all I'd need to add is "smartctl -d ata -H /dev/x" and look for FAILED.
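A minimal sketch of that health check, assuming the result line looks like the "self-assessment test result" line smartctl prints. The function name `smart_health_failed` is hypothetical, and it reads smartctl output on stdin so it can be tried without a real drive.

```shell
# Hypothetical helper: reads `smartctl -H` output on stdin and
# succeeds (exit 0) when the health result line reports FAILED.
smart_health_failed() {
    grep -q 'self-assessment test result: FAILED'
}

# Example with a canned result line instead of a real drive:
if printf 'SMART overall-health self-assessment test result: PASSED\n' | smart_health_failed; then
    echo "health check FAILED"
else
    echo "health check ok"
fi

# On a server this would be fed from the real command, e.g.:
#   smartctl -d ata -H /dev/sdd | smart_health_failed && echo "send alert"
```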

 

 

What I would like to see is removal of the netcat or socat dependency.

Ideally, I would too.

 

Bash is somewhat capable of network I/O internally.

 

See the following scriptlet.

(Although it does not work well against emhttp, so try connecting it somewhere else. I had to.)

 

I was having trouble getting more than a single request/response when I initially looked into going this direction.  I'll look over the links you've supplied and see if I can make progress.  Thanks!

 

 


 

Exactly what I needed to get rid of socat!  Thank you!

 

Seeing that today is my 10th wedding anniversary...  I'll have to wait to implement this. (and repeat this to myself every time I begin to think about it)

 

If you get rid of socat, odds are you will need something to do the base64 encoding of SMTP AUTH login/password. 

To tempt you a tiny bit to get coding... attached is a base64 encoder and decoder... written in awk. These "awk" routines might just provide the functions needed.  I did not write them, but they were available without restrictions. 

Now, awk has TCP/UDP networking built in too... apparently a not-too-well-known feature.

So, you have tons of possibilities.

 

Joe L.

PS.  Don't forget a card, and perhaps a gift...  10 years is something to celebrate, even as you think about using exec to open up alternate file descriptors to TCP/IP ports... :-[

 


If you do spend the time to put the network I/O in bash, could you possibly consider separating out the actual mail logic into a separate executable?

 

I.e., create a mailto script whose only job is to send out the mail request.

This way the monitor logic just does monitoring and the mail logic just does mailing.

 

My suggestion would be to follow the syntax of the mail command

 

i.e.

 

mailto -s "subject" user@destinationhost.com

 

anything that follows via stdin is sent out.

 

 

 

 

The reason I suggest this approach is to handle the possibility of using a real mail facility without having to rip out the mail logic in the monitor script.

I do have packages to install exim and the mail command.

Which works very well.

 

 

 

 


If you get rid of socat, odds are you will need something to do the base64 encoding of SMTP AUTH login/password.

 

unRAID seems to already have the base64 command installed, so I've been using that.  socat was basically just taking the place of /dev/tcp/.  With the example provided by WeeboTech, I was able to change my previous SMTP MTA script (I called it "socatmail") over to /dev/tcp with very few changes.
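For anyone curious how several request/response rounds can work over /dev/tcp, here is a rough sketch. It is not the bashmail source; the function names (`smtp_open`, `smtp_send`, `smtp_read_reply`) and the host name "tower" are made up for illustration. The one SMTP detail it leans on is that multi-line replies mark every line but the last with a dash after the code.

```shell
# Open a TCP connection on file descriptor 3 (bash-only /dev/tcp feature).
smtp_open() {
    exec 3<>"/dev/tcp/$1/$2"
}

# Send one command on fd 3; SMTP lines end in CRLF.
smtp_send() {
    printf '%s\r\n' "$1" >&3
}

# Read one complete reply from fd 3 and print its 3-digit code.
# Multi-line replies look like "250-text" for every line except the
# last one, which is "250 text".
smtp_read_reply() {
    local cr line
    cr=$(printf '\r')
    while IFS= read -r line <&3; do
        line=${line%"$cr"}
        case $line in
            [0-9][0-9][0-9]-*) continue ;;
            [0-9][0-9][0-9]*)  printf '%s\n' "$line" | cut -c1-3; return 0 ;;
        esac
    done
    return 1
}

# Example session (needs a reachable server, so left commented out):
#   smtp_open mail.relay.org 25
#   smtp_read_reply            # greeting, expect 220
#   smtp_send 'EHLO tower'
#   smtp_read_reply            # expect 250
#   smtp_send 'QUIT'
#   exec 3>&-
```

Reading until the final reply line before sending the next command is what keeps the exchange in step.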

 

But thank you for those awk scripts!  I'm always dumbfounded by what awk seems to be capable of.

 

PS.  Don't forget a card, and perhaps a gift...  10 years is something to celebrate

 

Absolutely.... but I couldn't resist making the changes to the script ;)

 

If you do spend the time to put the network I/O in bash.

Could you possibly consider separating out the actual mail logic to a separate executable.

 

It's kinda half and half.  The external mail script ("socatmail" in v1.00, now "bashmail" in v2.00) is used the following way:

cat message.txt | bashmail --rcpt target@domain.com --from mailfrom@domain.com [--user smtp-auth-username] [--pass smtp-auth-password] --smtp mail.relay.org [--port 25]

 

The email message (message.txt in this case) must contain a header, etc., and is built by the calling script.
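As a sketch of what "built by the calling script" could look like, the helper below wraps a body in a minimal header. The name `build_message` is hypothetical, the header set is just the usual minimum, `date -R` assumes GNU date, and line endings are left as plain LF on the assumption that bashmail takes care of CRLF when sending.

```shell
# Hypothetical helper: wrap a body read from stdin in a minimal header.
# Usage: build_message <from> <to> <subject>
build_message() {
    printf 'From: %s\n' "$1"
    printf 'To: %s\n' "$2"
    printf 'Subject: %s\n' "$3"
    printf 'Date: %s\n' "$(date -R)"
    printf '\n'          # blank line separates the header from the body
    cat                  # body is passed through unchanged
}

# Example: show the finished message on screen.
echo 'this is a test' | build_message mailfrom@domain.com target@domain.com 'ATTENTION'

# It could then be piped on, using the options shown above:
#   ... | bashmail --rcpt target@domain.com --from mailfrom@domain.com --smtp mail.relay.org
```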

 

See the top post for changes made.

 

One problem I'm having is that periodic use of smartctl seems to keep the drives spun up.  I check if the drives are sleep||standby before using smartctl to avoid spinning the drive up, but using smartctl more  often than the spin down timer appears to stop the drive from spinning down.

 

Thanks, both of you, for all your input!

 


One problem I'm having is that periodic use of smartctl seems to keep the drives spun up.  I check if the drives are sleep||standby before using smartctl to avoid spinning the drive up, but using smartctl more  often than the spin down timer appears to stop the drive from spinning down.

 

This is a common and known situation. Accessing the SMART data prevents drive from spinning down.

 

From what I have found out, the -H (health check) will work in standby without spinning up the drive.

You may have to use the -n standby command as in

 

smartctl -d ata -n standby -H /dev/sdd

 

This will tell you if the drive is in standby.

 

Also, you can do a -H while the drive is in standby and it will not spin up the drive.

 

By using the smartctl -a you are preventing spindown.

You might be able to get away with this by only doing a healthcheck.

I.E. until Tom changes the spin down logic to be controlled from emhttp instead of the drive's spin down timer.

 

I've tested the -H on a drive that is spun down and it did not spin the drive up.

I have not tested if -H prevents the drive from spinning down.

 

So my suggestion is to temporarily disable the temperature test, and just leave the health test intact, then see what happens.

If (when) Tom goes back to programmed spin down via emhttp, then you can get away with the temperature test if the drive is not in standby.

 

 

 

 


This is what I use to check temperature in my "awk" based web-server as pictured in this thread: http://lime-technology.com/forum/index.php?topic=2110.msg19269#msg19269

I invoke hdparm -C first, check if the device is sleeping.  If it is, I do not bother with checking temperature, since as you discovered using smartctl, that just spins it up.

 

Granted, this is "awk" script, but the equivalent shell is really simple for somebody with your background.

 

# Extra parameters after theDisk are awk's way of declaring local variables.
function GetDiskTemperature(theDisk, the_temp, cmd, a, t, is_sleeping) {
    the_temp = "*"
    is_sleeping = "n"

    # Ask for the power state first; this does not touch SMART.
    cmd = "hdparm -C " theDisk " 2>/dev/null"
    while ((cmd | getline a) > 0) {
        if (a ~ "standby") {
            is_sleeping = "y"
        }
    }
    close(cmd)

    # Only query SMART when the drive is not in standby.
    if (is_sleeping == "n") {
        cmd = "smartctl -d ata -A " theDisk " | grep -i temperature"
        while ((cmd | getline a) > 0) {
            delete t
            split(a, t, " ")
            # Field 10 is the raw value column of the attribute line.
            the_temp = t[10] "°C"
            if (t[10] >= yellow_temp && t[10] < orange_temp) {
                the_temp = "<div style=\"background-color:yellow;\">" the_temp "</div>"
            }
            if (t[10] >= orange_temp && t[10] < red_temp) {
                the_temp = "<div style=\"background-color:orange;\">" the_temp "</div>"
            }
            if (t[10] >= red_temp) {
                the_temp = "<div style=\"background-color:red;\">" the_temp "</div>"
            }
        }
        close(cmd)
    }
    return the_temp
}
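A rough shell counterpart of the awk function above might look like the sketch below. The name `disk_temp` is made up, the HTML colouring is left out, and matching "sleeping" in addition to "standby" is an assumption about what hdparm -C can report.

```shell
# Hypothetical shell counterpart of the awk function above.
# Prints "*" for a drive that is not spinning, else the raw temperature.
disk_temp() {
    dev=$1
    # Skip smartctl entirely when the drive is in standby or sleep.
    if hdparm -C "$dev" 2>/dev/null | grep -Eq 'standby|sleeping'; then
        echo '*'
        return 0
    fi
    # Field 10 is the raw value column of the temperature attribute line.
    smartctl -d ata -A "$dev" | awk 'tolower($0) ~ /temperature/ { print $10; exit }'
}

# Example (needs root and a real device, so left commented out):
#   disk_temp /dev/sdd
```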

 

I did not realize base64 was part of unRAID... cool.

 

Joe L.

using smartctl, that just spins it up.

 

Not quite the case all the time, using smartctl prevents it from spinning down.

I've tested with smartctl -a and -H on one of my drives in standby, it did not spin up.

However I do know that frequent smartctl access can prevent the drive from spinning down.

At least that is what I've read and experienced.

 

root@Atlas [1] /mnt/disk1/bittorrent>hdparm -C /dev/sdd 

/dev/sdd:
drive state is:  standby

root@Atlas [1] /mnt/disk1/bittorrent>smartctl -d ata -H /dev/sdd   
smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


root@Atlas [1] /mnt/disk1/bittorrent>smartctl -d ata -a /dev/sdd
smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD10EACS-00ZJB0
Serial Number:    WD-WCASJ0437718
Firmware Version: 01.01B01
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Sep 18 11:53:42 2008 GMT+4
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
...
...

root@Atlas [1] /mnt/disk1/bittorrent>hdparm -C /dev/sdd            

/dev/sdd:
drive state is:  standby

 

I also did a -a and the drive did not spin up.

 

 

I found this today...

Looks interesting

http://www.j-pfennig.de/LinuxImHaus/ngflushd_1_man.html

 

 

 

 


 

However I do know that frequent smartctl access can prevent the drive from spinning down.

At least that is what I've read and experienced.

 

 

All the more reason for unRAID to use the older method. Could explain a lot of spin issues people are having.

 

ngflushd looks very interesting


smartctl -d ata -n standby -H /dev/sdd

 

Unfortunately, with the smartctl installed on unRAID, -n standby returns

"UNRECOGNIZED OPTION: n"

 

smartctl -d ata -H /dev/sdd does appear to spin the drive up.

 

I invoke hdparm -C first, check if the device is sleeping.  If it is, I do not bother with checking temperature, since as you discovered using smartctl, that just spins it up.

 

This is exactly what I do.

 

All the more reason for unRAID to use the older method. Could explain a lot of spin issues people are having.

 

Forgive my ignorance, but what was "the older method"?

 


Current version of sab sets the spin down timer on the drives. The older method manually forced drives down by keeping its own timer and watching disk usage stats.


Ok.

 

I gave unraid_notify 2.10 the ability to monitor and spin down disks using vmstat -d and hdparm -y.  I'm not sure if there will be any problems with doing it this way, but it seems to be working so far.  See top post for details.
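For the curious, the idea can be sketched roughly like this. It is not the code from the package: the function name `idle_disks` and the snapshot file names are made up, and it assumes the usual `vmstat -d` column layout where field 2 is total reads and field 6 is total writes.

```shell
# Hypothetical helper: compare two saved `vmstat -d` snapshots and print the
# disks whose total reads (field 2) and total writes (field 6) are unchanged.
idle_disks() {
    awk 'NR == FNR { if ($1 ~ /^[sh]d[a-z]+$/) seen[$1] = $2 ":" $6; next }
         ($1 in seen) && seen[$1] == $2 ":" $6 { print $1 }' "$1" "$2"
}

# Example (needs vmstat and real disks, so left commented out):
#   vmstat -d > /tmp/io.prev
#   sleep 600
#   vmstat -d > /tmp/io.now
#   for d in $(idle_disks /tmp/io.prev /tmp/io.now); do
#       hdparm -y "/dev/$d"     # -y puts the drive into standby
#   done
```

Header lines in the vmstat output are ignored because only names that look like sdX or hdX are remembered from the first snapshot.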

 

Also note that I made a mistake on the unraid_notify 2.00 package, and placed the wrong version of the unraid_notify in the package. My apologies to those that may have tried it only to end in frustration.

 


Don't forget... 

 

Wife, 10 yr anniversary

 

if [ ! card && ! present && ! dinner ]

then

    ! good

fi

 

Joe L.

(in Feb it will be 35 yrs for us)

 


smartctl -d ata -n standby -H /dev/sdd

 

Unfortunately, with the smartctl installed on unRAID, -n standby returns

"UNRECOGNIZED OPTION: n"

 

smartctl -d ata -H /dev/sdd does appear to spin the drive up.

 

 

Hmm, my version is 5.38

 

smartmontools release 5.38 dated 2008/03/10 at 10:44:07 GMT

 

root@Atlas ~>hdparm -C /dev/sde

 

/dev/sde:

drive state is:  standby

 

root@Atlas ~>smartctl -H /dev/sde

smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

root@Atlas ~>hdparm -C /dev/sde 

 

/dev/sde:

drive state is:  standby

root@Atlas ~>

 

Hmm, my version is 5.38

 

smartmontools release 5.38 dated 2008/03/10 at 10:44:07 GMT

 

I assume you compiled that in your custom environment?  Could you attach it for the rest of us?


Thanks WeeboTech.

 

I downloaded both, and discovered I already had both (never installed and forgotten!), so went searching, and discovered you had already covered the subject in this post, with more detailed instructions 2 posts down by Brian, and both mentioned in the Best of the Forum's Addon section

 

Guess I should have checked Best of the Forums first.


Wow, is this script still able to control the spin-down timer on a per-disk basis?

 

I don't mean different spin-down timers for each disk, but what happens if I use one disk continuously? Will the other disks go to sleep in the meantime?


but what happens if I use one disk continuously? Will the other disks go to sleep in the meantime?

 

The script monitors I/O on a per physical disk basis.  If no I/O happens to the other disks within the timeout you specify, they will be spun down.

 


 

The script monitors I/O on a per physical disk basis.  If no I/O happens to the other disks within the timeout you specify, they will be spun down.

 

 

Great, in the end it seems that all my spin down issues are gone (cross my fingers).

 

But something is not OK with the temperatures. I get mail notifications about disk overheats, and all my disks are above 100 Celsius, which is obviously not the case (moreover, the relation is wrong as well, as my Samsung drive is the coolest):

 

This message is a status update for unRAID Tower

-----------------------------------------------------------------

Server Name: Tower

Status: Parity Disk Overheat! 116°C (DiskId: ata-WDC_WD10EACS-00D6B0)

Status: Disk 1 Overheat! 116°C (DiskId: ata-WDC_WD10EACS-00D6B0)

Status: Disk 2 Overheat! 151°C (DiskId: ata-SAMSUNG_HD501LJ)

Status: Disk 3 Overheat! 109°C (DiskId: ata-WDC_WD4000YS-01MPB1)

Status: Disk 4 Overheat! 107°C (DiskId: ata-WDC_WD4000YS-01MPB1)

Date: Sun Sep 21 13:58:09 GMT 2008

 

Is there anything I could have misconfigured that would cause this?
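A likely explanation, going by the 2.11 change log entry and the earlier remark about grabbing the 4th field: in the smartctl attribute table the 4th field is the normalized VALUE column, while the reading in degrees normally sits in the last (raw) column. The snippet below shows the difference on a made-up attribute line; the numbers are illustrative and not taken from the drives in this report.

```shell
# Illustrative attribute line (not from the drives in this thread).
# Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
line='194 Temperature_Celsius     0x0022   116   103   000    Old_age   Always       -       34'

# Field 4 is the normalized VALUE column; field 10 is RAW_VALUE.
normalized=$(printf '%s\n' "$line" | awk '{ print $4 }')
raw=$(printf '%s\n' "$line" | awk '{ print $10 }')

echo "normalized VALUE: $normalized"   # vendor-scaled score, not degrees
echo "raw value:        $raw"          # the figure usually read as Celsius
```

How the normalized score relates to the real temperature differs between vendors, which would also fit the odd ordering between the drives.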



Copyright © 2005-2017 Lime Technology, Inc. unRAID® is a registered trademark of Lime Technology, Inc.