A framework for notifications a.k.a sending emails, boxcar, notifo, pushover etc



I am not sure these ideas are mutually exclusive. All we are saying is that the process sends alarms into syslog with the relevant criteria and from there the notifier(s) do their job.

 

In this respect any number of monitors can do any number of checks as long as they push the alarm into syslog as a final action.

Link to comment

I am not sure these ideas are mutually exclusive. All we are saying is that the process sends alarms into syslog with the relevant criteria and from there the notifier(s) do their job.

 

In this respect any number of monitors can do any number of checks as long as they push the alarm into syslog as a final action.

 

That works well. No issue there.

 

So, the previous questions about a framework were:

How will you store it (if even)? Where does the data go?  SYSLOG

 

What is it you want to capture?

How will you capture it? (monitor scripts)

How will you warehouse, normalize or utilize the event data?

How will you report it in a configurable manner?

 

Now define the API of the message to trigger the notification event.

I.e., normalize the data.
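To make "define the API of the message" a bit more concrete, here is a minimal sketch of what a normalized alarm record pushed into syslog could look like. The ALARM|source|severity|message layout, the function names and the use of the local0 facility are all assumptions for illustration, not anything agreed in this thread.

```shell
#!/bin/bash
# Hypothetical record layout: ALARM|<source>|<severity>|<message>

# format_alarm SOURCE SEVERITY MESSAGE
# Build one single-line record. Newlines and '|' inside the message are
# replaced with spaces so a reader can always split the record on '|'.
format_alarm() {
    local source=$1 severity=$2 message=$3
    message=${message//$'\n'/ }
    message=${message//\|/ }
    printf 'ALARM|%s|%s|%s\n' "$source" "$severity" "$message"
}

# send_alarm SOURCE SEVERITY MESSAGE
# Push the record into syslog as the monitor's final action.
# SEVERITY must be a syslog level name such as crit, err, warning or info.
send_alarm() {
    local source=$1 severity=$2
    logger -t "$source" -p "local0.$severity" "$(format_alarm "$@")"
}
```

A monitor script would then end with something like `send_alarm smartmon crit "disk1 has pending sectors"`, and any notifier only has to understand that one record format.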

Link to comment

The idea of two simple, configurable questions, sounds good.

 

i.e. "What do you want to monitor?"

 

and  "How do you want to be notified?"

 

The concern I have with "any number of monitors" being active is their potential impact on the system's performance.  For example, I believe SmartMon has to spin up a disk to get a SMART report .. so if there's a monitor that's constantly looking for changes in SMART data, then the drives would never spin down.  Or if the notifications depend on a scraper, this then adds one more requirement for coordination between any monitoring apps and the notification app (which seems more involved than a simple "notification API" that the monitor might call).

 

I DO like the idea of this being nicely integrated into UnRAID, however. While I don't want to see a LOT of notifications, I'd definitely like to have a very simple method of setting up basic e-mail notification for key events. And as WeeboTech noted, some folks would like to see a lot more notifications -- monitoring just about every array "event".

 

Link to comment

The idea of two simple, configurable questions, sounds good.

 

i.e. "What do you want to monitor?"

 

and  "How do you want to be notified?"

 

This is from a user only perspective.

If we are specifying a framework for notifications the other questions come into play.

They have to be answered, otherwise things won't work when new additions are made.

 

The concern I have with "any number of monitors" being active is their potential impact on the system's performance.  For example, I believe SmartMon has to spin up a disk to get a SMART report .. so if there's a monitor that's constantly looking for changes in SMART data, then the drives would never spin down.  Or if the notifications depend on a scraper, this then adds one more requirement for coordination between any monitoring apps and the notification app (which seems more involved than a simple "notification API" that the monitor might call).

 

This boils down to warehousing the data.

If it's done right, all states of a monitor are centralized, be it in syslog with a specific message format, flat files, a database, an in-core memory array, etc.

 

There's no need for umpteen smart monitors.

One program can gather the SMART data, cache it, and let any other monitor review it and then post a message via the API (or framework protocol).

 

That was the point of my other post: the API is a message formatted in a defined way, so that a message reaper/handler/delivery agent knows what to do with it.

 

I DO like the idea of this being nicely integrated into UnRAID, however. While I don't want to see a LOT of notifications, I'd definitely like to have a very simple method of setting up basic e-mail notification for key events. And as WeeboTech noted, some folks would like to see a lot more notifications -- monitoring just about every array "event".

 

As ideas are discussed, the spec gets fleshed out.

 

As far as smartmon spinning up drives, I believe there is a way to say: do not query SMART data if the drive is in standby. Another consideration is that every time you refresh the main screen, I believe Tom gets the data from smartctl. If this data were cached on the RAM drive, then an external monitor could query the cache rather than running the smartctl command itself.

 

I tested the other day and

smartctl -nstandby -a /dev/sd?

worked for me.
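Putting the two ideas together (skip sleeping drives, cache the result on the RAM drive), a rough sketch could look like the following. The function name cache_smart, the SMARTCTL override variable and the DEV.smart file naming are my own assumptions; it relies only on smartctl's documented -n standby option and on its exit status being a bit mask.

```shell
#!/bin/bash
# Sketch only: cache_smart, the SMARTCTL override and the <dev>.smart naming
# are hypothetical.

# cache_smart DEV DIR
# Store the SMART attribute table for /dev/DEV in DIR/DEV.smart.
cache_smart() {
    local dev=$1 dir=$2 tmp rc=0
    tmp=$(mktemp "$dir/.smart.XXXXXX") || return 1
    # -n standby makes smartctl skip a sleeping drive instead of spinning it up
    "${SMARTCTL:-/usr/sbin/smartctl}" -n standby -A "/dev/$dev" > "$tmp" || rc=$?
    # smartctl's exit status is a bit mask; bits 0 and 1 mean a bad command
    # line, a device that could not be opened, or a drive in low-power mode
    if (( rc & 3 )); then
        rm -f "$tmp"              # keep the previously cached copy
        return 1
    fi
    mv "$tmp" "$dir/$dev.smart"   # rename so readers never see a partial file
}
```

Any other monitor can then read the cached file (for example `grep Temperature /var/local/smart/sdc.smart`, path hypothetical) without touching the drive at all.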

Link to comment

I tested the other day and

smartctl -nstandby -a /dev/sd?

worked for me.

 

Good to know.  I always check the SMART status with UnMenu ... and it forces the drives to spin up when you ask for a SMART report.  [No big deal since I only do this about once/month and do it manually -- but if there was a monitor that was constantly checking, it'd definitely be preferable if it did NOT bother when the drives were spun down]

 

Link to comment

One thing we should clear up is where notifications stop and reports start.

 

I consider these two different things:

 

Notifications are alarms (i.e. something bad is happening, go do something). These should be rare events with short, one-line info.

 

Reports, i.e. "here is your usage over the last month". These are periodic and will likely be a page of info.

 

This is important because when we started here we (or at least I) intended to discuss notifications (as in alarms), but on several occasions we have seen mentions of reports.

 

Not all types of notification lend themselves to reports and vice versa.

 

I suggest that reports are limited to email notification types and we leave it at that

 

 

Link to comment

You might not want to go too crazy with the syslog thing.

 

monit may be a simpler monitoring and notification framework than nagios and may suit us.

 

From what I'm seeing I can email a message directly and run an external program.

http://mmonit.com/monit/documentation/monit.html

 

You can monitor a file, a program, a process, or a filesystem, and raise various alerts.

 

It can stop/start/restart apps based on rules.

It has its own HTTP server with its own built-in password authentication.

 

I'm working out how to monitor disks based on the rules. It's entirely feasible to have the scheduling and monitoring all controlled from this daemon.

 

The rules are very easy to work with and can probably be template driven from data in /proc/mdcmd.

 

set eventqueue
     basedir /var/monit
     slots 128

check file syslog path /var/log/syslog
    if does not exist for 1 cycles then exec "/usr/bin/touch /var/log/syslog"
    if size > 90000 then alert
    if size > 90000 then exec "/usr/bin/logger -tmonit[$$] -plocal0.info message from monit"

check file testlog path /var/log/testlog
    if does not exist for 1 cycles then exec "/usr/bin/touch /var/log/testlog"

check program testscript with path "/usr/local/bin/testscript" with timeout 1000 seconds
    if status != 0 then alert

check process emhttp matching "/usr/local/sbin/emhttp"
   group unRAID

check process shfs matching "/usr/local/sbin/shfs"
   group unRAID

check process crond matching "/usr/sbin/crond"
   group system

check process atd matching "/usr/sbin/atd"
   group system

check process inetd with pidfile /var/run/inetd.pid
   group system

check process syslogd with pidfile /var/run/syslogd.pid
   group system

 

 

root@unRAID:/etc/monit.d# cat /usr/local/bin/testscript 
#!/bin/bash
echo "`date +'%b %e %T'` $0[$$]: testing"
exit 99

 

some output

Size failed Service syslog
    Date:        Sat, 23 Nov 2013 16:12:28
    Action:      alert
    Host:        unRAID
    Description: size test failed for /var/log/syslog -- current size is 104168 B
Your faithful employee,
Monit

 

Syslog messages

Nov 23 16:11:43 unRAID monit[21974]: 'testscript' Nov 23 16:10:43 /usr/local/bin/testscript[22390]: testing  
Nov 23 16:12:27 unRAID monit[22398]: 'syslog' size test failed for /var/log/syslog -- current size is 104168 B 
Nov 23 16:12:28 unRAID monit[22398]: 'syslog' exec: /usr/bin/logger 
Nov 23 16:12:28 unRAID monit[22398]: 'syslog' size test failed for /var/log/syslog -- current size is 104168 B 
Nov 23 16:12:28 unRAID monit[$$]: message from monit


Link to comment

It certainly looks powerful.

 

For reference, the syslog thing is trivial to set up in its basic form:

 

mkdir /etc/syslog.pipes
mknod /etc/syslog.pipes/criticalMessages p
chmod 600 /etc/syslog.pipes/criticalMessages
echo "*.crit   |/etc/syslog.pipes/criticalMessages" >> /etc/syslog.conf
/etc/rc.d/rc.syslog restart
(crontab -l; echo "#" ) | crontab -
(crontab -l; echo "# Direct critical messages to a named pipe" ) | crontab -
(crontab -l; echo "0-59/5 * * * * /boot/scripts/syslogPushover.sh < /etc/syslog.pipes/criticalMessages > /dev/null 2>&1" ) | crontab -

 

sends me critical syslog alerts via Pushover.

 

What I like is that it really is very elegant, albeit far more basic than monit, nagios, etc.

Link to comment

What does /boot/scripts/syslogPushover.sh look like?

 

While it looks simple and elegant, I think it can get unwieldy after a while.

More so when check/monitor scripts start to be added.

 

What I like about it is that the complete variable message is pushed off-net to a remote agent.

What I don't like about it, regarding the whole framework of monitoring, is that there will be various scripts written by people.

The scheduling of them with cron could end up being a support chore.

 

The way that cron entry is added is fugly (LOL).

I think it would be more manageable with a HERE document such as

 

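# Append to the existing crontab. The braces group both outputs into one pipe;
# a here-document attached directly to "cat -" would replace its stdin and drop the old entries.
{ crontab -l; cat <<'EOF'; } | crontab -
#
# Direct critical messages to a named pipe
0-59/5 * * * * /boot/scripts/syslogPushover.sh < /etc/syslog.pipes/criticalMessages > /dev/null 2>&1
EOF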

 

The real issue is scheduling support later on.

If we had an unRAID crontab interface, then the point is moot. 

Until then, someone has to script it or add it to go.

 

In comparison, monit has a cool built-in cron-style scheduling feature.

Various tests can be scheduled in cron like syntax and it's all in one file or in multiple files in the include directory.

 

using the include /etc/monit.d/*.conf syntax.

 

people can write monitor plugins with a script and a monit.conf file all in one.

Then do /usr/sbin/monit -c /etc/monit.conf validate and/or /usr/sbin/monit -c /etc/monit.conf reload and be on their way.

 

Here's an interesting snippet that I was playing with a while ago.

I got an email notification, a syslog message and a growl message

 

check file syslog path /var/log/syslog
    if does not exist for 1 cycles then 
        exec "/usr/bin/touch /var/log/syslog"
    if size > 90000 then alert
    if size > 90000 then 
        exec "/usr/bin/logger -tmonit[$$] -plocal0.info message from monit"
    if size > 90000 then 
        exec "/usr/bin/gntp-send -a UNRAID -n ROB -s e6510-pc 'syslog size warning' '/var/log/syslog larger then expected'"

 

 

Here's an interesting possibility for hdd temperature or smartctl attribute monitor.

check program HDD_80 with path "/usr/local/lib/monit/libexec/sdc_temp.sh"
    if status > 45 then alert
    group temperature

 

script /usr/local/lib/monit/libexec/sdc_temp.sh

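#!/bin/sh
# Exit with the drive temperature so monit can test it with "if status > 45".
# awk stops at the first matching line; 0 is used if no temperature was found.
TP=$(/usr/sbin/smartctl -a /dev/sdc | grep ' Temp' | awk '{printf "%d", $10; exit}')
echo "${TP:-0}" # for debug only
exit "${TP:-0}"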
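Since the rules are plain text, the "template driven" idea mentioned earlier could be as simple as printing one stanza per drive. A minimal sketch, assuming a hypothetical disk_temp.sh that takes the device name as its argument (a generalised version of sdc_temp.sh) and a device list supplied by the caller; I have not worked out how to pull that list from /proc/mdcmd, so that part is left out.

```shell
#!/bin/bash
# Sketch only: gen_temp_checks and disk_temp.sh are hypothetical names.

# gen_temp_checks LIMIT DEV...
# Print one monit "check program" stanza per device, alerting above LIMIT.
gen_temp_checks() {
    local limit=$1 dev
    shift
    for dev in "$@"; do
        printf 'check program temp_%s with path "/usr/local/lib/monit/libexec/disk_temp.sh %s"\n' "$dev" "$dev"
        printf '    if status > %d then alert\n' "$limit"
        printf '    group temperature\n\n'
    done
}
```

Something like `gen_temp_checks 45 sdb sdc sdd > /etc/monit.d/temperature.conf` followed by a monit reload would then pick up the new checks.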

Link to comment

pushover notifier

 

#!/bin/bash
# syslogPushover: a script to read stdin and turn each line into an alert
# example usage: syslogPushover < /etc/syslog.pipes/criticalMessages

TMOUT=1                                   # don't wait > 1 second for input

# process each line of input and produce an alert
while read -r line
do
   # skip syslog's "message repeated" summary lines
   if ! printf '%s\n' "${line}" | grep -q "message repeated"
   then
      # send the alert via Pushover; --form-string sends the text literally,
      # even if the line starts with @ or <
      /usr/bin/curl -s -F "token=XXX" -F "user=XXX" --form-string "message=${line}" https://api.pushover.net/1/messages.json
   fi
done

 

I know some of it is a kludge; it was a quick proof of concept only.

 

I am in two minds. Monit looks nicer and is certainly more scalable and powerful, but it's also much more complicated. With so much more code we risk never getting it included. Conversely, a few lines of code manipulating what is already included in unRAID are almost certain to be accepted.

 

Personally I would prefer monit but we need a feel for Limetech's opinion of inclusion since they will have to support it. I suspect that will kill it as an option immediately. Having it as a user addon is not an option since what we are looking at here is stock notification.

Link to comment

I am in two minds. Monit looks nicer and is certainly more scalable and powerful, but it's also much more complicated. With so much more code we risk never getting it included. Conversely, a few lines of code manipulating what is already included in unRAID are almost certain to be accepted.

 

Personally I would prefer monit but we need a feel for Limetech's opinion of inclusion since they will have to support it. I suspect that will kill it as an option immediately. Having it as a user addon is not an option since what we are looking at here is stock notification.

 

To be frank, I never expected Limetech to include monit as a monitoring solution.

 

What I do expect is for some email agent to be included at a minimum and not depend on the community for it.

I also request that curl, libcurl, and PHP compiled with libcurl be included so the community can help themselves.

 

At the very least it provides building blocks for externalized and automated communication.

 

Since these are part of standard Slackware, they can be included without too much effort.

 

Regarding monit and monitoring, that's a bigger piece that goes in the middle.

 

First effort is a few building blocks.

Next effort is connecting them for minimal notification (via syslog works).

After that is a few automated scripts to monitor some resources, and report them via a specific syslog facility.

 

Now how are these scheduled?  Well that's beyond me at the current moment, but I'm in the mindset that emhttp should probably schedule at least some type of smart check thread since it already monitors disks and spins them down.

 

To help ourselves, it could capture the smart logs and drop them into a specific directory on the ram disk.

We could write our own scripts to inspect them (at least for the first go round).

 

I mention having emhttp save the smart logs because it already calls smartctl to capture the temperature in a pipe.

So if it stored the smartctl log somewhere in a normalized 'name' that we could inspect, the work of calling smartctl is done.

 

root@unRAID:/usr/local/sbin# strings emhttp | grep smartctl
/usr/sbin/smartctl -n standby %s -A /dev/%s

 

We would only need to write scripts to inspect the pending sectors or other facilities.

monit allows the ability to spawn a script based on a file change.

 

To be frank, it's not beyond a vendor to include a monitor daemon.  Every NAS I've used before had one.

Some relied on smartd, others had their own with a browser interface to the SMART attributes.

 

Since emhttp uses smartctl to capture the smart values, it should cache them on the ramdisk. They are not large.

From there we can create a plugin that will display them or update webGui to display them.

 

Now back to the syslog pipe script. While it works for you, it's not designed for parallel notifications and/or reliable delivery of them.

Using a fifo on a filesystem to cache/queue messages is perilous.

Without a writer and a reader that are in sync, one or the other process can hang.

 

For the script there is the read timeout.

For syslog, that requires inspection of the source code. What happens if there is no reader on the pipe for an extended period of time?

Let's say someone clobbers the script or has it run only once a day?

 

Ideally I'm in the camp this should be done in the old logcheck way of yesteryear.

 

A program called logtail reads a log file and tracks its inode/offset. On every invocation it reads from the point where it last stopped.

If the inode changes or the file shrinks, the offset is reset. This means logtail only reads the new chunk of the file each time.

All information is still cached in the log.

 

logtail then pipes its output to fgrep.

fgrep can be used to include/exclude messages.

The output of fgrep can now be mailed or fed line by line into another notification tool.
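To illustrate the mechanism, here is a minimal sketch of the logtail idea in bash. It is not the real logtail or my ofgrep; the function name and the state-file format (inode and byte offset on one line) are assumptions, and it relies on GNU stat, tail and head.

```shell
#!/bin/bash
# Sketch only: logtail_sketch and its state-file format are hypothetical.

# logtail_sketch LOGFILE STATEFILE
# Print whatever was appended to LOGFILE since the previous call, remembering
# the position in STATEFILE as "<inode> <byte offset>".
logtail_sketch() {
    local log=$1 state=$2 inode size old_inode=0 offset=0
    inode=$(stat -c %i "$log") || return 1
    size=$(stat -c %s "$log") || return 1
    if [ -r "$state" ]; then
        read -r old_inode offset < "$state"
    fi
    # A new inode (rotated log) or a smaller file (truncated log) means the
    # saved offset is meaningless, so start again from the beginning.
    if [ "$inode" != "$old_inode" ] || [ "$size" -lt "$offset" ]; then
        offset=0
    fi
    # Emit only the bytes between the saved offset and the size sampled above,
    # so lines written while we run are picked up by the next call instead.
    tail -c +"$((offset + 1))" "$log" | head -c "$((size - offset))"
    echo "$inode $size" > "$state"
}
```

Each scraper keeps its own state file, so several can run against the same log, for example `logtail_sketch /var/log/syslog /var/run/mail.offset | fgrep -f /boot/config/include.txt` (both paths hypothetical).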

 

The benefit of this approach is that you can have more than one of these log scrapers running without interfering with one another.

Granted, it will take time to work out the regular expressions, but it's a best-effort way of dealing with syslog messages from multiple processes.

 

As I mentioned previously, I created ofgrep, which is logtail and fgrep all in one program.

 

This puts Limetech's inclusion list at:

 

a mailer

curl/libcurl, php compiled with libcurl

logtail or inclusion of my ofgrep.

 

Create a webGui for:

1. Mailer destination management.

2. Management of the regular expressions.

 

Limetech should capture the smartlogs that it reads, and cache them on the disk in a name normalized fashion. (I'm open on this one).

I prefer using serial numbers so the information can be diffed and stored elsewhere for history.

 

At some point Limetech could/should have smartd started and configured to monitor the disks, letting smartd do the attribute monitoring and emailing (this still requires an email agent).

 

Phew. OK, that's enough of my long-winded thoughts for now.

Link to comment

I think this thread has been worth its weight in gold as we have hammered out quite a few things.

 

However, when I started it I envisioned that we would diverge significantly into the "trap" side of things, and that's the reason I attempted to limit it to notifications only.

 

I think that was folly, since in an ideal world the two should not be separate; however, in the situation we are in I think it has to be this way.

 

So in some respects we have a few comments of "just add email support", and whilst I agree this is the most important notifier type, I personally believe it is the worst (and it is certainly only one of many).

 

I would like to propose we put a line under the larger picture of monit, nagios, syslog etc and concentrate solely on notifiers.

 

 

Specifically, this is limited to all things related to getting the alarm out of the server and to the end user, using as few or as many forms of communication as they want, and is completely agnostic of how the alert got to the point where the notifier took over.

Link to comment

Perhaps create a new poll of notification options and see what people are interested in.

Or at least a multiple-choice list of first, second, third, etc.

 

I bet you will (or won't) be surprised at the volume requesting email notification support.

 

I would like to propose we put a line under the larger picture of monit, nagios, syslog etc and concentrate solely on notifiers.

 

Specifically, this is limited to all things related to getting the alarm out of the server and to the end user, using as few or as many forms of communication as they want, and is completely agnostic of how the alert got to the point where the notifier took over.

 

I'm having a hard time limiting myself to only the candy wrapper and not the candy.

 

When the thread is titled "framework for notifications"  I don't think you can only look at how shit is thrown out the window. You have to have someone shovel it, pick it up, cock the arm, then throw it through the frame. Oh wait, there is no frame, it's a wall!!!!  I mean this humorously.  ;D

 

I still suggest the poll, let's see how many people are interested in other mechanisms.

 

I would add to that list.

 

Curl/LibCurl and PHP Compiled with CURL for custom Socket/API access.

Link to comment

Not everyone spends time on the forums, and some could miss the poll. So you may or may not get what you're looking for from a poll.

 

A prospective buyer will notice that unRAID has no email notification (nor any notification, for that matter), as will non-techies who don't wish to use plugins and sorely wish they had notifications.

 

If memory serves, Tom loves hearing from new and non-tech clients on the ease of things. So for sales and support, that clearly dictates a base email notification system. You guys don't have to like it, but that is a fact. You can appease a small number of advanced users or the masses; if it were your business, I think it goes without saying what should be added ASAP.

 

A poll for what to be monitored and notified on would be of more use so the coding could commence with that in mind.

Tom should start a poll like that in the announcement thread, where the most views are.

 

Not sure what this thread accomplishes, as outside of you two everyone else posted basically "give me email alerts for starters". So if Tom wants to chime in on what exactly he's looking for in this thread, maybe there would be more input from others. Otherwise this is all speculation about "A framework for notifications".

Link to comment

As far as monit goes, I have an existing beta plugin that does much of the work automatically and could be easily built upon to do what I have gleaned from this topic is desired.

 

The functionality of the plugin could easily be worked into core. Basically what I did was write a few scripts to scrape the disk data to monitor space, then another to monitor temps. Since the temperature is really just a returned value, monit's web app doesn't express it in the prettiest fashion.

Link to comment

As far as monit goes, I have an existing beta plugin that does much of the work automatically and could be easily built upon to do what I have gleaned from this topic is desired.

 

The functionality of the plugin could easily be worked into core. Basically what I did was write a few scripts to scrape the disk data to monitor space, then another to monitor temps. Since the temperature is really just a returned value, monit's web app doesn't express it in the prettiest fashion.

 

If you could make it available via a google code page or something else I'll peek at it.

In reality, we don't need specifics in the monit browser.

We need to know when we pass an alarmed value so we can react.

 

I'll start another thread on monit as a potential monitor application in a day or so.

In the meantime I've been working on capturing the external test script output so it can be viewed on the service page.

If the status of an external program comes back as 'failed', the output is captured. But if it comes back as success, it is not captured. To me that needs to change. Internally there are these big structures that are passed around as the objects. There are about 50 files, so I'm trying to learn what's going on. Someone else wrote some cool functions to allow editing of these scripts/files from the browser page. I'm still evaluating. In the meantime, any plugin scripts we can warehouse and include as addons will be of great value.

 

monit can be coded to call any external program (notifier) on an event, so I'm sure it will fit in fine.

nagios can do the same also. It's not as simple as monit, but it's also designed to schedule and centralize monitoring.

Using M/Monit as a commercial upgrade/addon, multiple servers can be managed as one.

 

In any case, it's time for us to move this to another thread. Give me a little time. I'm still compiling options and exploring its use.

Link to comment

As far as monit goes, I have an existing beta plugin that does much of the work automatically and could be easily built upon to do what I have gleaned from this topic is desired.

 

The functionality of the plugin could easily be worked into core. Basically what I did was write a few scripts to scrape the disk data to monitor space, then another to monitor temps. Since the temperature is really just a returned value, monit's web app doesn't express it in the prettiest fashion.

 

it's not on github?

if you need me to test, let me know...

Link to comment

...

When the thread is titled "framework for notifications"  I don't think you can only look at how shit is thrown out the window. You have to have someone shovel it, pick it up, cock the arm, then throw it through the frame. Oh wait, there is no frame, it's a wall!!!!  I mean this humorously.  ;D

...

 

A notification is the thing that is sent only. i.e. you could send 1000 empty emails and you would still say you received 1000 emails.

 

Again, let's not get hung up on the word.

 

As you have said, I think we are all happy to keep talking about the whole topic, but I am going to get a bit more focused and agree that anything that is not likely to be native is OT for this thread and should be split out.

 

If it has to be an addon of any sort then it is a solution to a different problem.

 

Fundamentally the native basic requirements are far simpler than what could be done given more time and effort.

 

 

So the topic of this thread, again, is:

 

 

Assuming some alarms already exist, we need a framework in which we can develop a bunch of notifiers to get these alarms to the end users. This should include sanitisation of the data, and probably also extend to capturing the data types we need to send.

Link to comment

 

I think if I have to limit the view, I've presented the only option I can think of in this scope.

That is one or more log scraper(s).

 

Please itemize what you feel the requirements are.

 

This is what I've seen so far.

 

define and document the inputs

create a proof of concept (POC) input sanitiser

work out how we reliably pass info from the sanitiser to a number of notifiers in parallel

create a POC mail notifier

create a POC pushover or similar technology notifier
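For the "sanitiser feeding a number of notifiers in parallel" items, a rough bash sketch might look like this. The function names, the 500-character cap and the convention that each notifier reads one message on stdin are assumptions for illustration only.

```shell
#!/bin/bash
# Sketch only: sanitise, dispatch and the notifier calling convention
# (one message on stdin) are hypothetical.

# sanitise: read text on stdin, turn tabs into spaces, drop other control
# characters and cap each line at an arbitrary 500 characters.
sanitise() {
    tr '\t' ' ' | tr -d '\000-\010\013-\037\177' | cut -c1-500
}

# dispatch MESSAGE NOTIFIER...
# Hand the same message to every notifier command at once, then wait for all
# of them, so one slow or hung notifier does not delay the others starting.
dispatch() {
    local msg=$1 notifier
    shift
    for notifier in "$@"; do
        printf '%s\n' "$msg" | "$notifier" &
    done
    wait
}
```

A scraper loop could then do `while read -r line; do dispatch "$(printf '%s\n' "$line" | sanitise)" mail_notifier pushover_notifier; done`, where the two notifier names stand for whatever POC scripts get written.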

 

anything that is not likely to be native is OT for this thread and should be split out.

If it has to be an addon of any sort then it is a solution to a different problem.

 

Let's just keep it elegant; most of the unRAID user base just wants an email if a disk or fs error happens.

 

I cannot see emHTTP being a requirement; surely it can trigger an external event just like anything else? It also allows for the possibility of the system alarming if emHTTP goes down, which is probably a good idea.
Link to comment

Monit sends a notification itself ...

 

I do not think that is the way to go. It is unlikely monit will ever be added natively so we should not rely on it as the means to send native notifications.

 

I imagine a situation where any addon, emHTTP, monit, etc. sends notifications via the mechanism we are trying to work out here.

 

This means a user configures notifications once only, and then any tool that needs to can send critical notifications without any further user configuration.

Link to comment
  • 4 weeks later...

Just bought the Pushover app for my note 3 yesterday, and got both Couchpotato and Sickbeard working with it. That night itself it proved useful as I got a notification that CP downloaded something which just wasn't what I wanted. VPN'ed into my home network, and deleted what it downloaded.

 

It would definitely be great if we could get notification for errors/issues with unRAID itself, and also maybe info like "monthly parity check completed at ..." With things like this, it would minimise the need to look at syslogs or the unRAID menu itself.

Link to comment