Monitoring Bacula Jobs using Nagios

My previous homebrew backup system had a number of drawbacks. One of the biggest was that its daily emails were massive, listing every file that was backed up.

With a lot of machines being backed up, these mails can come to several MB per day, and general human nature means I just didn't pay them enough attention. For instance, to notice that the tar had died halfway through on a given day I would have had to inspect the mail manually, which made the whole thing pretty useless.

Bacula provides good one-page job status emails on a daily basis, but I still tend not to look at them as I get about 20 of them a day. The ideal situation is to have it mail you only on errors, and it does support this. There is one problem with this though: if anything prevents the mail from getting to you, or in fact if the whole Director process dies and no backups get run at all, you just won't know about it.

I've written a per-job monitoring solution that uses Bacula's ability to run a script on the client after a successful backup. The script writes a small status file containing a timestamp, which I pull into Net-SNMP and query over the network using Nagios.
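To make the idea concrete, here is a minimal Python sketch of the two halves: writing the timestamp after a successful job, and judging how stale it is. This is not the script from my wiki. The function names, the status file path and the thresholds are all made up for illustration, and I am assuming the check simply compares the age of the timestamp against warning and critical limits. The Net-SNMP transport is left out entirely, so the check reads the file directly.

```python
import time
from pathlib import Path

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def write_status(path, now=None):
    """Record the time of a successful backup as a Unix timestamp.

    Meant to be called from a script that Bacula runs after a job succeeds
    (hypothetical helper, not part of Bacula itself).
    """
    stamp = int(time.time() if now is None else now)
    Path(path).write_text(f"{stamp}\n")
    return stamp


def check_status(path, warn_age, crit_age, now=None):
    """Return (exit_code, message) based on the age of the timestamp in seconds."""
    now = time.time() if now is None else now
    try:
        stamp = int(Path(path).read_text().strip())
    except (OSError, ValueError) as exc:
        # Missing or corrupt status file: we cannot tell when the job last ran.
        return UNKNOWN, f"UNKNOWN: cannot read {path}: {exc}"

    age = int(now - stamp)
    if age >= crit_age:
        return CRITICAL, f"CRITICAL: last successful backup {age}s ago"
    if age >= warn_age:
        return WARNING, f"WARNING: last successful backup {age}s ago"
    return OK, f"OK: last successful backup {age}s ago"
```

A plugin wrapper would print the message and exit with the returned code. Because the file is only rewritten on success, a failed job and a dead Director look the same to the check: the timestamp just gets older until it crosses a threshold.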

Now if any of my jobs fail, or if the whole backup system collapses, Nagios will notify me via my existing notification systems, email and SMS in my case. I still get the error mails from Bacula but I do not rely on them at all; they are merely there for information, so I can use them to quickly investigate an error once Nagios has alerted me.

I've documented this and put up the short scripts I use to achieve it; you can see this document in my wiki.
