How to send email notifications on monitor failure

SOLVED
krm
Occasional Contributor

I was surprised to see that there are no facilities available for sending error notifications via email on various failures.

I read the section in the manual regarding Perl-based monitor programs.  I thought I'd try to write a simple monitor script that also sends emails to certain administrators, based on the pool in which the failure was seen.

What is the recommended way to provide outage notifications?  We have over a hundred rules and pools with various owners across our organization.  We need to direct outage notifications to owners of particular services.

Also, when I logged into the SteelApp server, I wasn't able to run perldoc or determine which binary the Monitor Program features are related to.  I'm not quite sure how to run this command:

root@ttsv-stm-01:/opt/zeus/zxtm/bin# perldoc $ZEUSHOME/zxtm/lib/perl/Zeus/ZXTM/Monitor.pm

You need to install the perl-doc package to use this program.

root@ttsv-stm-01:/opt/zeus/zxtm/bin# which perldoc

/usr/bin/perldoc

root@ttsv-stm-01:/opt/zeus/zxtm/bin# which perl

/usr/bin/perl

Also, if I take this route, what is the best method to configure SMTP for email transmission?

3 REPLIES
Contributor

Re: How to send email notifications on monitor failure

There is the capability in the traffic manager for emailing notification of failures to a set of recipients.  This falls under the "Alerting" facility provided by the traffic manager - see Chapter 21 "Event Handling and Alerts" of the User Manual.

If you are using the administrative UI, log in and go to System > Alerting.  By default this shows a single "Event Type" to "Action" entry, for the standard behaviour of logging all errors to the event log.

For creating custom notifications, create a new event type (there are some built-in types that may be useful for other use-cases) - here you can add the set of events you would like to trigger the alerting action.  For your use-case there are a couple of options (from my limited understanding of your deployment):

  1. If each "service owner" has an individual monitor for their service, you can simply trigger an action on the monitorfail event (for each given monitor) - you can find this under Monitors > Warnings.
  2. If monitors are shared among several services with distinct owners, a better option may be to trigger an action on the nodefail event (for each given pool) - you can find this under Pools > General Events > Serious Errors.

From what you've described, it seems like option (2) would be the most appropriate.

Once you have an appropriate event type, return to the top level Alerting page and add a new "Action" by going to "Manage Actions"; here you can set up the details for sending an email to a particular set of "service owners".

Finally, from the top level Alerting page, you can add a new mapping from your custom event type to custom action (note that you can add multiple actions to every event type).

Hopefully I've not misunderstood your use-case and this solves your problem.  As well as the administrative UI, you should be able to set up such configurations programmatically using the SOAP/zcli or REST interfaces.

krm
Occasional Contributor

Re: How to send email notifications on monitor failure

Thanks James.  I read the manual a bit more and was able to create new custom event lists and assign them to dedicated email actions.  It appears to be doing what I want, except I'm now receiving two emails per notification.  This is probably due to the fact that we have traffic managers deployed in two locations and have multi-site cluster management enabled.  We also use the GSLB functionality, so both appliances are likely to be doing health checking, from what I understand.  How should we configure the appliance so we only get a single email for pool server outages?

Contributor

Re: How to send email notifications on monitor failure

I do not believe there is a built-in mechanism for consolidating email alerts.

Each traffic manager operates independently of the others in order to provide high availability; alerts are triggered and reported by each traffic manager as they are detected, without coordination with other traffic managers that may or may not be available.

One option is that you should be able to vary eventing behaviour by MSM location, so that the event is only triggered in a single location.  However, this requires you to have a single traffic manager in each location; otherwise you'll have the same situation you have today.  Also, if the traffic manager in that location fails, the events won't be emitted for your users by the remaining traffic manager(s).  So I would hesitate to recommend this option.


Another approach, depending on your users' email clients: you may be able to simply have all traffic managers use a single (not host-specific) email address for the "from".  It may be that the email clients themselves will collapse (mostly) duplicated emails for the users as they are read.  This is not likely to be a complete solution though.

Other alternatives would be to use a consolidation service via some other means.  Rather than have the traffic managers email your users directly, you could:

  • have an alert send a message to a remote syslog service which, if capable, could aggregate and send appropriate emails.
  • provide a SOAP service that receives reports, aggregates them and then sends emails.
  • use any other service: an arbitrary script can be triggered to execute on an event, meaning you could integrate with other solutions.

These approaches require an external service be present to perform the consolidation of alerts for your users, and I'm afraid I don't know any off-the-top-of-my-head that you could use.
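
To illustrate what such a consolidation step might look like, here is a minimal Python sketch of the de-duplication logic only.  It is not part of the traffic manager, and everything in it is an assumption for illustration: the class name, the (pool, node, event) key and the five-minute window are all hypothetical.  How reports reach it (syslog, SOAP, or a script triggered by an alert action) and how the email is finally sent are left out.

```python
import time
from typing import Dict, Optional, Tuple


class AlertDeduplicator:
    """Suppress repeat reports of the same failure within a time window.

    Hypothetical helper: the key fields (pool, node, event) are an assumed
    way of recognising that two traffic managers saw the same outage.
    """

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        # Maps (pool, node, event) to the time an email was last allowed.
        self._last_sent: Dict[Tuple[str, str, str], float] = {}

    def should_send(self, pool: str, node: str, event: str,
                    now: Optional[float] = None) -> bool:
        """Return True if this report should result in an email."""
        now = time.time() if now is None else now
        key = (pool, node, event)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            # Another traffic manager already reported this recently.
            return False
        self._last_sent[key] = now
        return True


if __name__ == "__main__":
    dedup = AlertDeduplicator(window_seconds=300)
    # Two traffic managers report the same node failure seconds apart.
    for reporter in ("tm-site-a", "tm-site-b"):
        if dedup.should_send("web-pool", "10.0.0.1:80", "nodefail"):
            print(f"{reporter}: send email")
        else:
            print(f"{reporter}: duplicate, suppressed")
```

The first report of a given failure passes through and later reports inside the window are dropped, so it does not matter which traffic manager reports first or whether one of them is down.  The state here lives in memory, so a real service would need to run as a single long-lived process or keep the timestamps somewhere shared.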