www.devco.net by r.i.pienaar

24 Jan 2010

MCollective 0.4.3 Auditing

I just released version 0.4.3 of mcollective, which brings a new auditing capability to SimpleRPC. Using the auditing system you can log every request to a file on each host, or build a centralized auditing system for all requests on all nodes.

We ship a simple plugin that logs to the local hard drive, but there is also a community plugin that creates a centralized logging system running over MCollective as a transport.

This is the kind of log the centralized logger will produce:

01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block
01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}
01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block
01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}
01/24/10 18:24:20 dev3.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block
01/24/10 18:24:20 dev3.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}

Here we see 3 nodes that got a request to add 114.255.136.120 to their local firewall. The request was sent by UID 500 on the machine ids1.my.net. The request is the same everywhere, so the request id is the same on every node. The log shows the agent, the action and all parameters passed.
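
A log in this shape is easy to post-process. The following is a minimal sketch, not part of MCollective, that groups lines by request id to show which hosts logged a given request. The regular expression is inferred from the sample output above only, and the parse_audit_line and hosts_by_request names are made up for illustration.

```ruby
# Sketch only: the line layout is inferred from the sample log above,
# not from any MCollective specification.
AUDIT_LINE = /\A(\S+ \S+) (\S+)> (\h+): (.*)\z/

# Returns a hash for a recognised line, or nil for anything else.
def parse_audit_line(line)
  m = AUDIT_LINE.match(line.chomp)
  return nil unless m

  { :logged_at => m[1], :host => m[2], :request_id => m[3], :detail => m[4] }
end

# Maps each request id to the sorted list of hosts that logged it.
def hosts_by_request(lines)
  seen = Hash.new { |hash, key| hash[key] = [] }

  lines.each do |line|
    entry = parse_audit_line(line)
    next unless entry

    hosts = seen[entry[:request_id]]
    hosts << entry[:host] unless hosts.include?(entry[:host])
  end

  seen.each_value(&:sort!)
  seen
end

sample = [
  '01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block',
  '01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}',
  '01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block',
]

hosts_by_request(sample).each do |id, hosts|
  puts "#{id}: #{hosts.join(', ')}"
end
```

Lines that do not match the pattern are skipped rather than treated as errors, since a real log may contain other output.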

14 Jan 2010

Better way to query facts

Facter has an annoying bug where it won't always print all facts when called like facter fact: ones that require dynamic lookups and the like just won't print.

This is a long-standing bug that doesn't seem to get any love, so I hacked up a little wrapper that works better.

#!/usr/bin/ruby
 
require 'facter'
require 'puppet'
 
Puppet.parse_config
unless $LOAD_PATH.include?(Puppet[:libdir])
    $LOAD_PATH << Puppet[:libdir]
end
 
facts = Facter.to_hash
 
if ARGV.size > 0
    ARGV.each do |f|
        puts "#{f} => #{facts[f]}" if facts.include?(f)
    end
else
    facts.each_pair do |k,v|
        puts("#{k} => #{v}")
    end
end

It behaves by default as if you ran facter -p but you can supply as many fact names as you want on the command line to print just the ones requested.

$ fctr uptime puppetversion processorcount
uptime => 8 days
puppetversion => 0.25.2
processorcount => 1
13 Jan 2010

MCollective 0.4.2 released

Just a quick blog post for those who follow me here to get notified about new releases of MCollective. I just released version 0.4.2, which brings in big improvements for Debian packages, some tweaks to the command line and a bug fix in SimpleRPC.

Read all about it at the Release Notes

13 Jan 2010

Backing up Google Code projects

Google Code does not provide its own usable export methods for projects, so we need to make do on our own. There seems to be no sane way to back up the tickets, but for SVN, which includes the wiki, you can use svnsync.

Here's a little script to automate this. Just give it a PREFIX of your choice and a list of projects in the PROJECTS variable, cron it, and it will happily keep your repos in sync.

It outputs its actions to STDOUT, so you should add a redirect in the script or redirect its output from cron.

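#!/bin/bash

PROJECTS=( your project list )
PREFIX="/path/to/backups"

[ -d "${PREFIX}" ] || mkdir -p "${PREFIX}"

# Stop here rather than creating repos in the wrong directory
cd "${PREFIX}" || exit 1

for prj in "${PROJECTS[@]}"
do
    if [ ! -d "${PREFIX}/${prj}/conf" ]; then
        svnadmin create "${prj}"

        # svnsync sets revision properties, so this hook has to succeed
        ln -s /bin/true "${PREFIX}/${prj}/hooks/pre-revprop-change"

        # PREFIX is absolute, so file:// plus its leading slash gives file:///
        svnsync init "file://${PREFIX}/${prj}" "http://${prj}.googlecode.com/svn"
    fi

    svnsync sync "file://${PREFIX}/${prj}"
done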

I strongly suggest you back up even your cloud-hosted data.

2 Jan 2010

Do your backups!

I have a QNAP TS-209 NAS device. It's a Linux-based appliance with 2 hot-swap drives.

It has now died by the looks of it. QNAP support has been utterly useless, to say the least, and I have pretty much resolved to replace this unit even if they are able to resurrect it. The problem with the 1xx and 2xx range of QNAP is that it's some weird CPU architecture, and to enable huge files on them they had to patch the ext3 file system.

The end result is that while the devices are advertised as being ext3 they are in fact a patched ext3 and you cannot just mount them in a Linux machine. They have also now stopped selling this series of machine, so should yours ever die you are just plainly out of luck. QNAP have made a live CD available that's similarly patched, so you should have some hope if you are really in trouble.

In my case the device seems to have also totally corrupted the drives when it died, so even in the Live CD scenario both are dead. It seems the SATA interface has gone rather than the disks: the moment I put a disk in, the CPU is kept totally busy dealing with blocking I/O requests. Out of 1000 pings about 20 will get replies, and those will have 30 second response times.

This brings me to several points:

  • Everyone knows this (right?) but RAID is not a form of backup. It's most probable that if one drive in a RAID array gets its data corrupted the others will suffer too. It simply protects you against hardware failure on a single drive.
  • You should make backups regularly. As it turns out I made a backup just 12 days before it died, so every file that was on the NAS is safe.

I've now spent the last 2 or so days duplicating my backups so I am redundantly covered while I look for a replacement. I'd have liked to not buy another QNAP, but unfortunately they do seem to have the best range of products in this space. All the vendors seem to have stopped selling 2 drive units, so I am down to getting a 4 drive QNAP TS-439 now. This will set me back almost GBP900 but will give me 2TB of mirrored space, Apple Time Machine backups for all my Macs and so on. Pricey, but important given that this holds all my photos and music.

In the same week it seems my Apple iMac 24 inch has had similar problems. It isn't booting, and a similar problem seems to have afflicted it: I am not getting the usual I/O errors I saw on other Macs when their drives died, instead there are other I/O timeouts that suggest it's the controller and not the drive. Thankfully I have Apple Care so it's in for a free fixup. I do not have backups of this machine, by design, since I keep all my data on servers on the internet and those are backed up off-site nightly. My desktops tend to be disposable, simply terminals to online data; even browser bookmarks are stored online. The only things lost on this machine would be chat logs and browser history, nothing else. I need to make some kind of plan for chat logs as those tend to be more and more important these days.

So to sum up, even if you have redundant drives in a NAS you must still do your backups. With QNAPs it's easy to even do it off-site, as you can rsync to a remote location or sync to Amazon S3. Of course they also have USB ports so you can copy files to an external drive.

2 Jan 2010

MCollective Release 0.4.x

A few days ago I released Marionette Collective version 0.4.0 and today I released 0.4.1. This release branch introduces a major new feature called Simple RPC.

In prior releases it took quite a bit of Ruby knowledge to write an agent and client. In addition, clients all ended up implementing their own little protocols for data exchange. We've simplified agents and clients and we've created a standard protocol between clients and agents.

Standard protocols between clients and agents means we have a standard one-size-fits-all client program called mc-rpc and it opens the door to writing simple web interfaces that can talk to all compliant agents. We've made a test REST <-> Simple RPC bridge as an example.

Writing a client can now be done without all the earlier setup, command line parsing and so forth, it can now be as simple as:

require 'mcollective'
 
include MCollective::RPC
 
mc = rpcclient("rpctest")
 
printrpc mc.echo(:msg => "Welcome to MCollective Simple RPC")
 
printrpcstats

This simple client has full discovery, full --help output, and takes care of printing results and stats in a uniform way.

This should make it much easier to write more complex agents, like deployers that interact with packages, firewalls and services all in a single simple script.

We've taken a better approach to presenting the output from clients: instead of listing 1000 OKs on success it will now only print what's failing.

Output from above client would look something along these lines:

$ hello.rb
 
 * [ ============================================================> ] 43 / 43
 
Finished processing 43 / 43 hosts in 392.60 ms

As you can see we have a nice progress indicator that will work for 1 or 1000 nodes. You can still see the status of every reply by running the client in verbose mode, which will also add more detailed stats at the end.

Agents are also much easier. Here's an echo agent:

class Rpctest<RPC::Agent
    def echo_action
         validate :msg, :shellsafe
 
         reply.data = request[:msg]
    end
end

You can get full information on this new feature here. We've also created a lot of new wiki docs about ActiveMQ setup for use with MCollective and we've recorded a new introduction video here.

22 Dec 2009

MCollective Simple RPC

MCollective is a framework for writing RPC style tools that talk to a cloud of servers. Until now, doing that has been surprisingly hard for non-Ruby coders, because I was focussing on getting the framework built and feeling my way around the use cases.

I've now spent 2 days working on simplifying the writing of agents and consumers. This code is not released yet, it's just in SVN trunk, but here's a taster.

First, writing an agent should be simple. Here's a simple 'echo' server that takes a message as input and returns it.

class Rpctest<RPC::Agent
    # Basic echo server
    def echo_action(request, reply)
         raise MissingRPCData, "please supply a :msg" unless request.include?(:msg)
 
         reply.data = request[:msg]
    end
end

This creates an echo action, does a quick check that a message was received and sends it back. I want to create a few more validators so you can check easily if the data passed to you is sane and secure if you're doing anything like system() calls with it.
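
To give an idea of the kind of validator meant here, below is a rough sketch in plain Ruby of a "shell safe" check. It is not the MCollective implementation: the shell_safe? helper and the list of rejected characters are assumptions chosen for illustration, and a real validator may well need to be stricter.

```ruby
# Hypothetical helper, not MCollective code: rejects input containing
# characters the shell treats specially, before it gets near system().
SHELL_UNSAFE = /[`$;|&<>\n]/

def shell_safe?(value)
  value.is_a?(String) && value !~ SHELL_UNSAFE
end

puts shell_safe?("hello world")         # plain text is accepted
puts shell_safe?("hello; rm -rf /tmp")  # a command separator is rejected
```

Where possible, passing the command and its arguments to system() as separate strings avoids the shell altogether, which is a sturdier defence than filtering characters.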

Here's the client code that calls the echo server 4 times:

#!/usr/bin/ruby
 
require 'mcollective'
 
include MCollective::RPC
 
rpctest = rpcclient("rpctest")
 
puts "Normal echo output, non verbose, shouldn't produce any output:"
printrpc rpctest.echo(:msg => "hello world")
 
puts "Flattened echo output, think combined 'mailq' usecase:"
printrpc rpctest.echo(:msg => "hello world"), :flatten => true
 
puts "Forced verbose output, if you always want to see every result:"
printrpc rpctest.echo(:msg => "hello world"), :verbose => true
 
puts "Did not specify needed input:"
printrpc rpctest.echo

This client supports full discovery and all the usual stuff, has pretty --help output and everything else you'd expect from the clients I've supplied with the core mcollective. It caches discovery results, so the above code will do one discovery only and reuse it for the other calls to the collective.

When running you'll see a twirling status indicator, something like:

  - [5 / 10]

This will give you a nice non-scrolling indicator of progress and should work well for 100s of machines without spamming you with noise. At the end of the run you'll get the output.
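
An indicator like this is usually drawn by ending each update with a carriage return instead of a newline, so the next update overwrites the previous one. Here is a minimal sketch of the idea. It is not the MCollective code, and progress_line is a made-up helper.

```ruby
SPINNER = ['-', '\\', '|', '/']

# Builds one frame of the indicator; tick selects the spinner character.
def progress_line(done, total, tick)
  "  #{SPINNER[tick % SPINNER.size]} [#{done} / #{total}]"
end

total = 10
(1..total).each do |done|
  # "\r" returns the cursor to column 0 so the next frame overwrites this one.
  print progress_line(done, total, done), "\r"
  $stdout.flush
end
puts
```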

The printrpc helper function tries its best to print output for you in a way that makes sense for large numbers of machines.

  • By default it doesn't print stuff that succeeds, you do get an overall progress indicator though
  • If anything does go wrong, useful information gets printed but only for hosts that had problems
  • If you ran the client with --verbose, or forced it to verbose mode output, you'll get the full result from every server.
  • It supports flags to modify the output, you can flatten the output so hostnames etc aren't shown, just a concat of the data.
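
The display rules in the list above can be sketched in a few lines of plain Ruby. This only illustrates the policy: the result hashes and the lines_to_print helper are invented here and are not how printrpc is implemented.

```ruby
# Illustration of the display policy only; the result hashes and the
# lines_to_print helper are invented and are not MCollective's API.
def lines_to_print(results, options = {})
  if options[:flatten]
    # Just the data, no host names.
    results.map { |r| r[:data].to_s }
  elsif options[:verbose]
    # Every host, whether it succeeded or not.
    results.map { |r| "#{r[:host]}: #{r[:ok] ? 'OK' : r[:error]}" }
  else
    # Default: stay quiet about successes, report only the problems.
    results.reject { |r| r[:ok] }.map { |r| "#{r[:host]}: #{r[:error]}" }
  end
end

results = [
  { :host => "dev1.your.com", :ok => true,  :data => "hello world" },
  { :host => "dev2.your.com", :ok => false, :error => "please supply a :msg" },
]

puts lines_to_print(results)
puts lines_to_print(results, :verbose => true)
```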

The script above gives the following output when run in non-verbose mode:

$ rpctest.rb --with-class /devel/
 
Normal echo output, non verbose, shouldn't produce any output:
 
Forced verbose output, if you always want to see every result:
dev1.your.com                          : OK
    "hello world"
 
dev2.your.com                          : OK
    "hello world"
 
dev3.your.com                          : OK
    "hello world"
 
Flattened echo output, think combined 'mailq' usecase:
hello world
hello world
hello world
 
Did not specify needed input:
dev1.your.com                          : please supply a :msg
dev2.your.com                          : please supply a :msg
dev3.your.com                          : please supply a :msg

Still some work to do; specifically, stats need a rethink in a scenario where you are making many calls, such as in this script.

This will be in mcollective version 0.4, hopefully out early January 2010.

14 Dec 2009

Exim, MCollective and speed

Usually when I describe mcollective to someone they think it's nice and all, but the infrastructure to install is quite a bit, so parallel ssh tools like cap seem a better choice. They like the discovery and stuff, but the overall benefit isn't all that clear to them.

I have a different end-game in mind than just restarting services, and I've made a video to show just how I manage a cluster of Exim servers using mcollective. This video should give you some ideas about the possibilities that the architecture I chose brings to the table and just what it can enable.

While watching the video please note how quick and interactive everything is, then keep in mind the following while you are seeing the dialog driven app:

  • I am logged in via SSH from UK to Germany into a little VM there
  • The mcollective client talks to a Germany based ActiveMQ
  • The 4 mail servers in the 2nd part of the demo are based 2 x US, 1 x UK and 1 x DE
  • I have ActiveMQ instances in each of the above countries clustered together using the technique previously documented here.

Here's the video then, as before I suggest you hit the full screen link and watch it that way to see what's going on.




This is the end game: I want a framework that enables this kind of tool on the Unix CLI, complete with pipes as you'd expect, as well as things like the dialog interface you see here, on the web, in general shell scripts and in nagios checks like with cucumber-nagios, all sharing an API and all talking to a collective of servers as if they were one. I want to make building these apps easy, quick and fun.

14 Dec 2009

Splitting MySQL dumps by table – take 2

A few days ago I posted about splitting mysqldump files using sed and a bit of Ruby to drive it. Turns out that sucked, a lot.

I eventually killed it after 2 days of not finishing. The problem is, obviously, that sed does not seek to the position, it reads the file from the start. So pulling out the last line of a 150GB file requires reading 150GB of data; if you have 120 tables this is a huge problem.
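
To put a rough number on it, here is a back-of-envelope sketch of how much data the sed approach has to read compared with a single pass. It assumes all tables are the same size, which real dumps rarely are, and sed_bytes_read is just an illustrative helper.

```ruby
# Back-of-envelope model, assuming (unrealistically) equally sized tables.
# Extracting table i with sed means reading from byte 0 to the end of table i.
def sed_bytes_read(file_bytes, tables)
  (1..tables).inject(0) { |total, i| total + file_bytes * i / tables }
end

file_bytes = 150 * 10**9   # the 150GB dump mentioned above
tables     = 120

multi_pass  = sed_bytes_read(file_bytes, tables)
single_pass = file_bytes

puts "sed approach reads about #{multi_pass / 10**9} GB"
puts "a single pass reads #{single_pass / 10**9} GB"
```

Under that assumption the sed approach reads about 9075 GB in total, roughly 60 times the size of the dump.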

The code below is a new take on it: I just read the file with Ruby and write out the resulting files in 1 read operation. Start to finish on the same data took less than an hour. When run it gives you nice output like this:

Found a new table: sms_queue_out_status
    writing line: 1954 2001049770 bytes in 91 seconds 21989557 bytes/sec
 
Found a new table: sms_scheduling
    writing line: 725 729256250 bytes in 33 seconds 22098674 bytes/sec

The new code below:

#!/usr/bin/ruby
 
if ARGV.length == 1
    dumpfile = ARGV.shift
else
    puts("Please specify a dumpfile to process")
    exit 1
end
 
STDOUT.sync = true
 
if File.exist?(dumpfile)
    d = File.new(dumpfile, "r")
 
    outfile = false
    table = ""
    linecount = tablecount = starttime = 0
 
    while (line = d.gets)
        if line =~ /^-- Table structure for table .(.+)./
            table = $1
            linecount = 0
            tablecount += 1
 
            if outfile
                outfile.close   # done with the previous table's file
                puts("\n\n")
            end
 
            puts("Found a new table: #{table}")
 
            starttime = Time.now
            outfile = File.new("#{table}.sql", "w")
        end
 
        if table != "" && outfile
            outfile.syswrite line
            linecount += 1
            elapsed = Time.now.to_i - starttime.to_i + 1
            print("    writing line: #{linecount} #{outfile.stat.size} bytes in #{elapsed} seconds #{outfile.stat.size / elapsed} bytes/sec\r")
        end
    end
else
    puts("Can't find dumpfile #{dumpfile}")
    exit 1
end
 
puts
11 Dec 2009

Splitting MySQL dumps by table

I often need to split large mysql dumps into smaller files so I can do selective imports, for example from live to dev where you might not want all the data. Each time I seem to rescript some solution for the problem. So here's my current solution: a simple Ruby script. You give it the path to a mysqldump file and it outputs a series of echo and sed commands to do the work.

UPDATE: Please do not use this code, it's too slow and inefficient, new code can be found here.

Just pipe its output to a file and run it via shell when you're ready to do the splitting. At the end you'll have a file per table in your cwd.

#!/usr/bin/ruby
 
prevtable = ""
prevline = 0
 
if ARGV.length == 1
    dumpfile = ARGV.shift
else
    puts("Please specify a dumpfile to process")
    exit 1
end
 
if File.exist?(dumpfile)
   %x[grep -n "Table structure for table" #{dumpfile}].each_line do |line|
       if line =~ /(\d+):-- Table structure for table .(.+)./
           curline = $1.to_i
           table = $2
 
           unless prevtable == ""
               puts("echo \"\`date\`: Processing #{prevtable} - lines #{prevline - 1} to #{curline - 2}\"")
               puts("sed -n '#{prevline - 1},#{curline - 2}p;#{curline - 2}q' #{dumpfile} > #{prevtable}.sql")
               puts
           end
 
           prevline = curline
           prevtable = table
       end
   end
else
   puts("Can't find dumpfile #{dumpfile}")
   exit 1
end

It's pretty fast: the heavy lifting is all done with grep and sed, Ruby is just there to drive those commands and parse a few lines of output.

Running it produces something like this:

$ split-mysql-dump.rb exim.sql
echo "`date`: Processing domain_sender_whitelist - lines 32 to 47"
sed -n '32,47p;47q' exim.sql > domain_sender_whitelist.sql
 
echo "`date`: Processing domain_valid_users - lines 48 to 64"
sed -n '48,64p;64q' exim.sql > domain_valid_users.sql