Custom deployer using MCollective
One of the goals of the SimpleRPC framework, and of MCollective's overall speed, is to make it possible to build interactive tools that manage your infrastructure as if it were one machine with a single point of entry. I've blogged a bit about this before with how I manage Exim clusters.
I've recently built a deployer for a client that does some very specific things with their FastCGI, packages and monitoring in a way that is safe for developers to use. I've made a sanitized demo of it that you can see below. It's sanitized in that the hostnames are replaced with hashes and some monitoring details removed but you'll get the idea.
As usual it's best to just look at the video on YouTube in its HD mode.
MCollective Agent Introspection
With the new SimpleRPC system in MCollective we have a simple interface to creating agents. The way to call an agent would be:
$ mc-rpc service status service=httpd
This is all fine and well and easy enough, however it requires you to know a lot: you need to know there's a status action and you need to know it expects a service argument. Not great.
I'm busy adding the ability for an agent to register its metadata and interface so that 3rd party tools can dynamically generate useful interfaces.
A sample registration for service agent is:
register_meta(:name        => "SimpleRPC Service Agent",
              :description => "Agent to manage services using the Puppet service provider",
              :author      => "R.I.Pienaar",
              :license     => "GPLv2",
              :version     => 1.1,
              :url         => "http://mcollective-plugins.googlecode.com/",
              :timeout     => 60)

["start", "stop", "restart", "status"].each do |action|
  register_input(:action      => action,
                 :name        => "service",
                 :prompt      => "Service Name",
                 :description => "The service to #{action}",
                 :type        => :string,
                 :validation  => '^[a-zA-Z\-_\d]+$',
                 :maxlength   => 30)
end
This includes all the meta data, versions, timeouts, validation of inputs, prompts and help text for every input argument.
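As a rough illustration of how a 3rd party tool might consume this metadata, the sketch below checks a user-supplied value against the :type, :validation and :maxlength fields of one input. INPUT_SPEC and check_input are hypothetical names made up for this example and are not part of MCollective; only the field names come from the registration above.

```ruby
# Hypothetical stand-in for one registered input. The keys mirror the
# register_input call above, but this hash is not an MCollective structure.
INPUT_SPEC = {
  :name        => "service",
  :prompt      => "Service Name",
  :description => "The service to manage",
  :type        => :string,
  :validation  => '^[a-zA-Z\-_\d]+$',
  :maxlength   => 30
}

# Returns a list of problems with the value; an empty list means it passed.
def check_input(spec, value)
  errors = []

  # Stop early on a type mismatch, the remaining checks assume a String
  if spec[:type] == :string && !value.is_a?(String)
    errors << "#{spec[:name]} should be a string"
    return errors
  end

  if value.length > spec[:maxlength]
    errors << "#{spec[:name]} is longer than #{spec[:maxlength]} characters"
  end

  unless value =~ Regexp.new(spec[:validation])
    errors << "#{spec[:name]} does not match #{spec[:validation]}"
  end

  errors
end

puts check_input(INPUT_SPEC, "httpd").inspect
puts check_input(INPUT_SPEC, "httpd; rm -rf /").inspect
```

A UI generator could run a check like this before a request ever leaves the client, using :prompt and :description to label the field.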
Using this we can now generate dynamic UIs, and do something like JavaDoc generated documentation. I've recorded a little video demonstrating a proof of concept Text UI that uses this data to generate a UI dynamically. This is ripe for integration into tools like Foreman and Puppet Dashboard.
Please watch the video here, best viewed full screen.
MCollective 0.4.3 Auditing
I just released version 0.4.3 of mcollective which brings a new auditing capability to SimpleRPC. Using the auditing system you can log to a file on each host every request or build a centralized auditing system for all requests on all nodes.
We ship a simple plugin that logs to the local hard drive but there is also a community plugin that creates a centralized logging system running over MCollective as a transport.
This is the kind of log the centralized logger will produce:
01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block
01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}
01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block
01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}
01/24/10 18:24:20 dev3.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block
01/24/10 18:24:20 dev3.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}
Here we see 3 nodes that got a request to add 114.255.136.120 to their local firewall. The request was sent by UID 500 on the machine ids1.my.net. The request is of course the same everywhere so the request id is the same on every node. The log shows the agent, the action and all parameters passed.
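Because the request id is shared across nodes, a log like this is easy to post-process. The following is only a sketch of that idea: it assumes the line layout is exactly as in the sample above, and parse_audit_line and hosts_by_request are hypothetical helpers, not something the auditing plugin provides.

```ruby
# Sample lines in the layout shown in the log above (two of the three nodes).
SAMPLE_LINES = [
  '01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block',
  '01/24/10 18:24:20 dev1.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}',
  '01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: 01/24/10 18:24:20 caller=uid=500@ids1.my.net agent=iptables action=block',
  '01/24/10 18:24:20 dev2.my.net> d53a8306f20e9b3a0f7946adccd6eb5e: {:ipaddr=>"114.255.136.120"}'
]

# timestamp, "host>", request id, then the rest of the line
AUDIT_LINE = /\A(\S+ \S+) (\S+)> ([0-9a-f]+): (.*)\z/

# Returns a hash describing one log line, or nil if the line does not match.
def parse_audit_line(line)
  m = AUDIT_LINE.match(line.chomp)
  return nil unless m

  entry = {:logged_at => m[1], :host => m[2], :request_id => m[3]}

  # Request lines carry key=value pairs, parameter lines are kept as raw text
  if m[4] =~ /caller=(\S+) agent=(\S+) action=(\S+)/
    entry.update(:caller => $1, :agent => $2, :action => $3)
  else
    entry[:params] = m[4]
  end

  entry
end

# Groups the hosts that logged each request id.
def hosts_by_request(lines)
  result = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    entry = parse_audit_line(line)
    result[entry[:request_id]] << entry[:host] if entry && entry[:action]
  end
  result
end

puts hosts_by_request(SAMPLE_LINES).inspect
```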
MCollective 0.4.2 released
Just a quick blog post for those who follow me here to get notified about new releases of MCollective. I just released version 0.4.2 which brings in big improvements for Debian packages, some tweaks to command line and a bug fix in SimpleRPC.
Read all about it at the Release Notes
MCollective Release 0.4.x
A few days ago I released Marionette Collective version 0.4.0 and today I released 0.4.1. This release branch introduces a major new feature called Simple RPC.
In prior releases it took quite a bit of ruby knowledge to write an agent and client. In addition clients all ended up implementing their own little protocols for data exchange. We've simplified agents and clients and we've created a standard protocol between clients and agents.
Standard protocols between clients and agents means we have a standard one-size-fits-all client program called mc-rpc and it opens the door to writing simple web interfaces that can talk to all compliant agents. We've made a test REST <-> Simple RPC bridge as an example.
Writing a client can now be done without all the earlier setup, command line parsing and so forth, it can now be as simple as:
require 'mcollective'

include MCollective::RPC

mc = rpcclient("rpctest")

printrpc mc.echo(:msg => "Welcome to MCollective Simple RPC")

printrpcstats
This simple client has full discovery, full --help output, and takes care of printing results and stats in a uniform way.
This should make it much easier to write more complex agents, like deployers that interact with packages, firewalls and services all in a single simple script.
We've taken a better approach in presenting the output from clients now, instead of listing 1000 OKs on success it will now only print what's failing.
Output from above client would look something along these lines:
$ hello.rb
 * [ ============================================================> ] 43 / 43
Finished processing 43 / 43 hosts in 392.60 ms
As you can see we have a nice progress indicator that will work for 1 or 1000 nodes, you can still see status of every reply by just running the client in verbose - which will also add more detailed stats at the end.
Agents are also much easier, here's an echo agent:
class Rpctest < RPC::Agent
  def echo_action
    validate :msg, :shellsafe
    reply.data = request[:msg]
  end
end
You can get full information on this new feature here. We've also created a lot of new wiki docs about ActiveMQ setup for use with MCollective and we've recorded a new introduction video here.
MCollective Simple RPC
MCollective is a framework for writing RPC style tools that talk to a cloud of servers, till now doing that has been surprisingly hard for non ruby coders. The reason for this is that I was focussing on getting the framework built and feeling my way around the use cases.
I've now spent 2 days working on simplifying actually writing agents and consumers. This code is not released yet - just in SVN trunk - but here's a taster.
First writing an agent should be simple, here's a simple 'echo' server that takes a message as input and returns it back.
class Rpctest < RPC::Agent
  # Basic echo server
  def echo_action(request, reply)
    raise MissingRPCData, "please supply a :msg" unless request.include?(:msg)

    reply.data = request[:msg]
  end
end
This creates an echo action, does a quick check that a message was received and sends it back. I want to create a few more validators so you can check easily if the data passed to you is sane and secure if you're doing anything like system() calls with it.
Here's the client code that calls the echo server 4 times:
#!/usr/bin/ruby

require 'mcollective'

include MCollective::RPC

rpctest = rpcclient("rpctest")

puts "Normal echo output, non verbose, shouldn't produce any output:"
printrpc rpctest.echo(:msg => "hello world")

puts "Flattened echo output, think combined 'mailq' usecase:"
printrpc rpctest.echo(:msg => "hello world"), :flatten => true

puts "Forced verbose output, if you always want to see every result"
printrpc rpctest.echo(:msg => "hello world"), :verbose => true

puts "Did not specify needed input:"
printrpc rpctest.echo
This client supports full discovery and all the usual stuff, has pretty --help output and everything else you'd expect in the clients I've supplied with the core mcollective. It caches discovery results so the above code will do one discovery only and reuse it for the other calls to the collective.
When running you'll see a twirling status indicator, something like:
- [5 / 10]
This will give you a nice non scrolling indicator of progress and should work well for 100s of machines without spamming you with noise, at the end of the run you'll get the output.
The printrpc helper function tries its best to print output for you in a way that makes sense on large amounts of machines.
- By default it doesn't print stuff that succeeds, you do get an overall progress indicator though
- If anything does go wrong, useful information gets printed but only for hosts that had problems
- If you ran the client with --verbose, or forced it to verbose mode output you'll get a full bit of info of the result from every server.
- It supports flags to modify the output, you can flatten the output so hostnames etc aren't shown, just a concat of the data.
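The rules in the list above can be sketched as a small filter over a set of replies. This is not how printrpc is implemented; format_replies and the :sender, :ok and :data keys are hypothetical stand-ins chosen for the example, and the real SimpleRPC reply structure may well differ.

```ruby
# Each reply is a hash with the hypothetical keys :sender, :ok and :data.
# Returns the lines that would be printed for the given options.
def format_replies(replies, options = {})
  verbose = options.fetch(:verbose, false)
  flatten = options.fetch(:flatten, false)

  # Flattened output drops hostnames and status, leaving only the data
  return replies.map { |reply| reply[:data].to_s } if flatten

  # Without verbose only the replies that failed are shown
  shown = verbose ? replies : replies.reject { |reply| reply[:ok] }

  shown.map do |reply|
    status = reply[:ok] ? "OK" : "FAILED"
    "#{reply[:sender]} : #{status} #{reply[:data].inspect}"
  end
end

replies = [
  {:sender => "dev1.your.com", :ok => true,  :data => "hello world"},
  {:sender => "dev2.your.com", :ok => false, :data => "please supply a :msg"}
]

puts format_replies(replies)
puts format_replies(replies, :verbose => true)
puts format_replies(replies, :flatten => true)
```

The point of the default is that a run across hundreds of healthy machines prints nothing beyond the progress indicator, so the few failures are not buried.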
The script above gives the following output when run in non-verbose mode:
$ rpctest.rb --with-class /devel/
Normal echo output, non verbose, shouldn't produce any output:
Forced verbose output, if you always want to see every result:
dev1.your.com : OK "hello world"
dev2.your.com : OK "hello world"
dev3.your.com : OK "hello world"
Flattened echo output, think combined 'mailq' usecase:
hello world
hello world
hello world
Did not specify needed input:
dev1.your.com : please supply a :msg
dev2.your.com : please supply a :msg
dev3.your.com : please supply a :msg
Still some work to do, specifically stats needs a rethink in a scenario where you are making many calls such as in this script.
This will be in mcollective version 0.4 hopefully out early January 2010
Exim, MCollective and speed
Usually when I describe mcollective to someone they generally think it's nice and all, but the infrastructure to install is quite a bit, so parallel ssh tools like cap seem a better choice. They like the discovery and stuff but it's not all that clear.
I have a different end-game in mind than just restarting services, and I've made a video to show just how I manage a cluster of Exim servers using mcollective. This video should give you some ideas about the possibilities that the architecture I chose brings to the table and just what it can enable.
While watching the video please note how quick and interactive everything is, then keep in mind the following while you are seeing the dialog driven app:
- I am logged in via SSH from UK to Germany into a little VM there
- The mcollective client talks to a Germany based ActiveMQ
- The 4 mail servers in the 2nd part of the demo are based 2 x US, 1 x UK and 1 x DE
- I have ActiveMQ instances in each of the above countries clustered together using the technique previously documented here.
Here's the video then, as before I suggest you hit the full screen link and watch it that way to see what's going on.
This is the end game, I want a framework to enable this kind of tool on Unix CLI - complete with pipes as you'd expect - things like the dialog interface you see here, on the web, in general shell scripts and in nagios checks like with cucumber-nagios, all sharing an API and all talking to a collective of servers as if they are one. I want to make building these apps easy, quick and fun.
MCollective Release 0.2.0
I am pleased to announce the first actual numbered release of The Marionette Collective, you can grab it from the downloads page.
Till now people wanting to test this had to pull out of SVN directly, I put off doing a release till I had most of the major tick boxes in my mind ticked and till I knew I wouldn't be making any major changes to the various plugins and such. This release is 0.2.x since 0.1.x was the release number I used locally for my own testing.
This being the first release I fully anticipate some problems and weirdness, please send any concerns to the mailing list or ticketing system.
This has been a while coming, I've posted lots on this blog about mcollective, what it is and what it does. If you're just joining, you'll want to watch the video on this post for some background.
I am keen to get feedback from some testers, specifically keen to hear thoughts around these points:
- How do the client tools behave on 100s of nodes? I suspect the output format might be useless if it just scrolls and scrolls, I have some ideas about this but need feedback.
- On large numbers of hosts, or when doing lots of requests soon after each other, do you notice any replies going missing?
- Feedback about the general design principles, especially how you find the plugin system and what else you might want pluggable. I for example want to make it much easier to add new discovery methods.
- Anything else you can think of
I'll be putting in tickets on the issue system for future features / fixes I am adding so you can track there to get a feel for the milestones toward 0.3.x.
Thanks go to the countless people who I spoke to in person, on IRC and on Twitter, thanks for all the retweets and general good wishes. Special thanks also to Chris Read who made the Debian package code and fixed up the RC script to be LSB compliant.
Ruby Plugin Architectures
Most of the applications I write in Ruby are some kind of Framework, ruby-pdns takes plugins, mcollective takes plugins, my nagios notification bot takes plugins etc, yet I have not yet figured out a decent approach to handling plugins.
Google suggests many options, the most suggested one is something along these lines.
class Plugin
  def self.inherited(klass)
    PluginManager << klass.new
  end
end

class FooPlugin < Plugin
end
Where PluginManager is some class or module that stores and later allows retrieval, when the FooPlugin class gets created it will trigger the hook in the base class.
This works ok, almost perfectly, except that at the time of the trigger the FooPlugin class is not 100% complete and your constructor will not be called, quite a pain. From what I can tell it calls the constructor on either Class or Object.
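The timing problem can be reproduced with a small self-contained script. This is just a sketch to show the effect; HookedBase, LateChild and OBSERVED are names invented for the demonstration and have nothing to do with mcollective.

```ruby
# Collects what the hook could see at the moment it fired.
OBSERVED = []

class HookedBase
  def self.inherited(klass)
    super
    # The subclass exists here, but its body has not been evaluated yet
    OBSERVED << {
      :own_methods => klass.instance_methods(false),
      :greeting    => klass.new.greeting
    }
  end

  def greeting
    "from HookedBase"
  end
end

class LateChild < HookedBase
  def greeting
    "from LateChild"
  end
end

puts OBSERVED.first.inspect
puts LateChild.new.greeting
```

Inside the hook the subclass has no methods of its own yet, so an instance created there only sees what it inherits. The same applies to an initialize defined in the subclass, which is why creating the instance has to wait until the class body has been read.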
I ended up tweaking the pattern a bit and now have something that works well, essentially if you pass a String to the PluginManager it will just store that as a class name and later create you an instance of that class, else if it's not a string it will save it as a fully realized class assuming that you know what you did.
I am quite annoyed that including a module does not also include static methods in Ruby, it's quite a big missing feature in my view and there are discussions about changing that behavior. I had hopes of writing something simple where I can just do include Pluggable and this would set up all the various bits, create the inherited hook etc, but it's proven to be a pain and would be littered with nasty evals etc.
The full class is part of mcollective and you can see the source here, but below is the short version:
module PluginManager
  @plugins = {}

  def self.<<(plugin)
    type = plugin[:type]
    klass = plugin[:class]

    raise("Plugin #{type} already loaded") if @plugins.include?(type)

    if klass.is_a?(String)
      @plugins[type] = {:loadtime => Time.now, :class => klass, :instance => nil}
    else
      @plugins[type] = {:loadtime => Time.now, :class => klass.class, :instance => klass}
    end
  end

  def self.[](plugin)
    raise("No plugin #{plugin} defined") unless @plugins.include?(plugin)

    # Create an instance of the class if one hasn't been done before
    if @plugins[plugin][:instance] == nil
      begin
        klass = @plugins[plugin][:class]
        @plugins[plugin][:instance] = eval("#{klass}.new")
      rescue Exception => e
        raise("Could not create instance of plugin #{plugin}: #{e}")
      end
    end

    @plugins[plugin][:instance]
  end
end

class Plugin
  def self.inherited(klass)
    PluginManager << {:type => "facts_plugin", :class => klass.to_s}
  end
end

class FooPlugin < Plugin
end
For mcollective I only ever allow one of a specific type of plugin so the code is a bit specific in that regard.
I think creating the plugin instances late is quite an improvement too, since often you're loading in plugins that you just don't need. Client apps, for example, would probably not need some of the stuff I load in, and creating instances of it is just a waste.
I am not 100% sold on this approach as the right one, I think I'll probably refine it more and would love to hear what other people have done.
This has though removed a whole chunk of grim code from mcollective since I now store all plugins and agents in here and just fetch them as needed. So already this is an improvement to what I had before so I guess it works well and should be easier to refactor for improvements now.
Managing puppetd with mcollective
It's typical during maintenance windows that you would want to disable puppet, do your work, enable again and do a run. Or perhaps you don't run puppet all the time, you just want to kick it off during your maintenance window. Doing this with ssh for loops is slow and annoying, here's a way to target large numbers of machines for these actions using mcollective.
Using mcollective's discovery features and a suitable agent this is really easy, I've written such an agent and made it available on the mcollective-plugins site.
You can see below a sample session with it. In all of the examples below we're constraining it to hosts with the roles::dev_server puppet class using mcollective discovery. Not shown here is that you can get status as well as use the splay options provided by puppet, see the wiki page for details on that.
First we'll make sure it's enabled.
$ mc-puppetd --with-class roles::dev_server enable
Determining the amount of hosts matching filter for 2 seconds .... 1
.
Finished processing 1 / 1 hosts in 9.81 ms
Now we'll disable it
$ mc-puppetd --with-class roles::dev_server disable
Determining the amount of hosts matching filter for 2 seconds .... 1
.
Finished processing 1 / 1 hosts in 3252.13 ms
We'll attempt a runonce, this should fail because we just disabled the agent.
$ mc-puppetd --with-class roles::dev_server runonce -v
Determining the amount of hosts matching filter for 2 seconds .... 1
dev1.your.net status=false Lock file exists
---- puppetd agent stats ----
Nodes: 1 / 1
Start Time: Sun Nov 29 23:02:30 +0000 2009
Discovery Time: 2006.38ms
Agent Time: 47.62ms
Total Time: 2054.00ms
Let's enable it and then try to run again.
$ mc-puppetd --with-class roles::dev_server enable
Determining the amount of hosts matching filter for 2 seconds .... 1
.
Finished processing 1 / 1 hosts in 9.81 ms
$ mc-puppetd --with-class roles::dev_server runonce
Determining the amount of hosts matching filter for 2 seconds .... 1
.
Finished processing 1 / 1 hosts in 2801.82 ms
I think this is a good way to orchestrate this type of maintenance window and I hope someone finds it useful.

