
An update on my Choria project

Some time ago I mentioned that I am working on improving the MCollective Deployment story.

I started a project called Choria that aimed to massively improve the deployment UX and yield a secure and stable MCollective setup for those using Puppet 4.

The aim is to make installation quick and secure. Towards that, it seems a common end-to-end install from scratch, by someone new to the project and using a clustered NATS setup, can take less than an hour. This is a huge improvement.

Further I’ve had really good user feedback, especially around NATS. One user reports 2000 nodes on a single NATS server consuming 300MB RAM and it being very performant, much more so than the previous setup.

It’s been a few months; this is what’s changed:

  • The module now supports every OS AIO Puppet supports, including Windows.
  • Documentation is available on choria.io; installation should take about an hour at most.
  • The PQL language can now be used to do completely custom infrastructure discovery against PuppetDB.
  • Many bugs have been fixed, and many things have been streamlined and made easier to get going, with better defaults.
  • Event Machine is not needed anymore.
  • A number of POC projects have been done to flesh out next steps, things like a very capable playbook system and a revisit to the generic RPC client, these are on GitHub issues.
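
To give a feel for the PQL discovery mentioned in the list above, here is a minimal Python sketch that builds a query URL for PuppetDB's v4 query endpoint. The PuppetDB host name, the port, the example query and the helper name `pql_query_url` are assumptions made up for illustration; this is not how Choria itself issues its queries.

```python
from urllib.parse import urlencode


def pql_query_url(base_url, pql):
    """Return a PuppetDB v4 query URL carrying `pql` as the `query` parameter.

    `base_url` is a hypothetical PuppetDB address; nothing is sent over the
    network here, the function only builds the URL.
    """
    return f"{base_url.rstrip('/')}/pdb/query/v4?{urlencode({'query': pql})}"


# Hypothetical discovery query: certnames of all nodes in the Debian OS family.
query = 'inventory[certname] { facts.os.family = "Debian" }'
url = pql_query_url("https://puppetdb.example.net:8081", query)
print(url)
```

Fetching that URL with a client certificate PuppetDB trusts would return a JSON list of matching nodes, which is the kind of result a discovery plugin can turn into a node list.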

Meanwhile I am still trying to get to a point where I can take over maintenance of MCollective again. At first Puppet Inc was very open to the idea, but I am afraid it’s been 7 months and it’s getting nowhere; calls for cooperation are just being ignored. Unfortunately I think we’re getting pretty close to a fork being the only productive next step.

For now though, I’d say the Choria plugin set is production ready and stable. Anyone using Puppet 4 AIO should consider using these. It’s about the only working way to get MCollective on FOSS Puppet now, due to the state of the other installation options.

Fixing the mcollective deployment story

Getting started with MCollective has always been an adventure: you have to learn a ton of new stuff like middleware. Once you get that going, the docs tend to present you with a vast array of options and choices, including such arcane topics as which security plugin to use, while the security model chosen is entirely unique to mcollective. To get a true feeling for the horror, see the official deployment guide.

This is not really a pleasant experience and probably results in many insecure or half-built deployments out there, and in most people just not bothering. This is of course entirely my fault: too many options with bad defaults chosen are to blame.

I saw the graph of the learning curve of Eve Online and I always immediately think of mcollective 🙂 Hint: mcollective is not the WoW of orchestration tools.

I am in the process of moving my machines to Puppet 4 and the old deployment methods for MCollective just did not work, everything is falling apart under the neglect the project has been experiencing. You can’t even install any plugin packages on Debian as they will nuke your entire Puppet install etc.

So I figured why not take a stab at rethinking this whole thing and see what I can do. Today I’ll present the outcome of that: a new Beta distribution of MCollective tailored to the Puppet 4 AIO packaging that’s very easy to get going securely.

Overview


My main goal with these plugins was that they share as much security infrastructure with Puppet as possible. This means we get an understandable model and do not need to mess around with custom CAs and certs and so forth. Focussing on AIO Puppet means I can have sane defaults that work for everyone out of the box with very limited config. The deployment guide should be a single short page.

For a new user who has never used MCollective and now needs certificates, there should be no need to write a crazy ~/.mcollective file and configure a ton of SSL stuff. They should only need to do:

$ mco choria request_cert

This will make a CSR, submit it to the PuppetCA and wait for it to be signed like Puppet Agent. Once signed they can immediately start using MCollective. No config needed. No certs to distribute. Secure by default. Works with the full AAA stack by default.

Sites may wish to have tighter than default security around what actions can be made, and deploying these policies should be trivial.

Introducing Choria


Choria is a suite of plugins developed specifically with the Puppet AIO user in mind. It rewards using Puppet as designed with defaults and can yield a near zero configuration setup. It combines with a new mcollective module used to configure AIO based MCollective.

The deployment guide for a Choria based MCollective is a single short page. The result is:

  • A Security Plugin that uses the Puppet CA
  • A connector for NATS
  • A discovery cache that queries PuppetDB using the new PQL language
  • An open source Application Orchestrator for the new Puppet Multi Node Application stuff (naming is apparently still hard)
  • Puppet Agent, Package Agent, Service Agent, File Manager Agent all setup and ready to use
  • SSL and TLS used everywhere, any packet that leaves a node is secure. This cannot be turned off
  • A new packager that produces Puppet Modules for your agents etc and supports every OS AIO Puppet does
  • The full Authentication, Authorization and Auditing stack set up out of the box, with default secure settings
  • Deployment scenarios work by default, with extensive support for SRV records and lightweight manual configuration for those with custom needs

It’s easy to configure using the new lookup system and gives you a full, secure, usable, mcollective out of the box with minimal choices to make.

You can read how to deploy it in its deployment guide.

Status


This is really a Beta release at the moment, and I’m looking for testers and feedback. I am particularly interested in feedback on NATS and the basic deployment model. In future I might give the current connectors the same treatment, with chosen defaults etc.

The internals of the security plugin are quite interesting. It proposes a new internal message structure for MCollective which should be much easier to support in other languages and is more formalised. To be clear, these messages always existed; they were just a bit ad hoc.

Additionally it’s the first quality security plugin that has specific support for building a quality web stack compatible MCollective REST server that’s AAA compatible and would even allow centralised RBAC and signature authority.

Introduction to MCollective deck

I’ve not had a good introduction to MCollective slide deck ever, I usually just give demos and talk through it. I was invited to talk in San Francisco about MCollective so made a new deck for this talk.

On the night I gave people the choice of talks between the new Introduction talk and the older Managing Puppet using MCollective and sadly the intro talk lost out.

Last night the excellent people at Workday flew me to Dublin to talk to the local DevOps group there and this group was predominantly Chef users who chose the Introduction talk so I finally had a chance to deliver it. This talk was recorded, hopefully it’ll be up soon and I’ll link to it once available.

This slide deck is a work in progress, it’s clear I need to add some more information about the non-cli orientated uses of MCollective but it’s good to finally have a deck that’s receiving good feedback.

We uploaded the slides back when I was in San Francisco to slideshare and those are the ones you see here.


Managing Puppet Using MCollective

I recently gave a talk titled “Managing Puppet Using MCollective” at the Puppet Camp in Ghent.

The talk introduces a complete rewrite of the MCollective plugin used to manage Puppet. The plugin can be found on our Github repo as usual. Significantly this is one of a new breed of plugin that we ship as native OS packages and practice continuous delivery on.

The packages can be found on apt.puppetlabs.com and yum.puppetlabs.com and are simply called mcollective-puppet-agent and mcollective-puppet-client.

This set of plugins showcases a bunch of recent MCollective features including:

  • Data Plugins
  • Aggregation Functions
  • Custom Validators
  • Configurable enabling and disabling of the Agent
  • Direct Addressing and pluggable discovery to significantly improve the efficiency of the runall method
  • Utility classes shared amongst different types of plugin
  • Extensive testing using rspec and our mcollective specific rspec plugins

It’s a bit of a beast, coming in at a couple thousand lines, but this was mostly because we had to invent a rather sizeable wrapper for Puppet to expose a nice API around Puppet 2.7 and 3.x for things like running them and obtaining their status.

The slides from the talk can be seen below, hopefully a video will be up soon else I’ll turn it into a screencast.

Graphing on the CLI

I’ve recently been thinking about ways to do graphs on the CLI. We’ve written a new Puppet Agent for MCollective that can gather all sorts of interesting data from your server estate and I’d really like to be able to show this data on the CLI. This post isn’t really about MCollective though; the ideas apply to any data.

I already have sparklines in MCollective, here’s the distribution of ping times:

This shows you that most of the nodes responded quickly with a bit of a tail at the end being my machines in the US.

Sparklines are quite nice for a quick overview so I looked at adding some more of this to the UI and came up with this:

Which is quite nice. These are the nodes in my infrastructure stuck into buckets, and the node counts for each bucket are shown. We can immediately tell something is not quite right: the config retrieval time shows a bunch of slow machines and the slowness does not correspond to resource counts etc. On investigation I found these are my dev machines, KVM nodes hosted on HP Micro Servers, so that’s to be expected.
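
The bucketing behind a sparkline like this can be sketched in a few lines. This is a minimal Python illustration and not the MCollective code; the function names, the bucket count and the sample timings are all hypothetical.

```python
SPARK_CHARS = "▁▂▃▄▅▆▇█"


def bucket_counts(values, buckets=20):
    """Count how many values fall into each of `buckets` equal-width ranges."""
    if not values:
        return [0] * buckets
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / buckets or 1.0
    counts = [0] * buckets
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return counts


def sparkline(counts):
    """Render bucket counts as one character per bucket, scaled to the peak."""
    peak = max(counts) if counts else 0
    if peak == 0:
        return SPARK_CHARS[0] * len(counts)
    scale = len(SPARK_CHARS) - 1
    return "".join(SPARK_CHARS[round(c * scale / peak)] for c in counts)


# Made-up config retrieval times in seconds, one per node.
times = [2.1, 2.4, 2.2, 3.0, 2.8, 2.5, 14.9, 15.3, 2.3, 2.6]
print(sparkline(bucket_counts(times, buckets=10)))
```

Scaling to the tallest bucket keeps the shape readable whatever the node count is, at the cost of hiding the absolute numbers, which is why the counts are worth printing alongside.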

I am not particularly happy with these graphs though so am still exploring other options, one other option is GNU Plot.

GNU Plot can target its graphs for different terminals like PNG and also line printers – since the Unix terminal is essentially a line printer we can use this.

Here are 2 graphs of config retrieval time produced by MCollective using the same data source that produced the sparkline above, though obviously from a different time period. Note that the axis titles and graph title are supplied automatically using the MCollective DDL:

$ mco plot resource config_retrieval_time
 
                   Information about Puppet managed resources
  Nodes
    6 ++-*****----+----------+-----------+----------+----------+----------++
      +      *    +          +           +          +          +           +
      |       *                                                            |
    5 ++      *                                                           ++
      |       *                                                            |
      |        *                                                           |
    4 ++       *      *                                                   ++
      |        *      *                                                    |
      |         *    * *                                                   |
    3 ++        *    * *                                                  ++
      |          *  *  *                                                   |
      |           * *   *                                                  |
    2 ++           *    *                         *        *              ++
      |                 *                         **       **              |
      |                  *                       * *      *  *             |
    1 ++                 *               *       *  *     *   **        * ++
      |                  *              * *     *   *     *     **    **   |
      +           +       *  +         * + *    *   +*   *     +     *     +
    0 ++----------+-------*************--+--****----+*****-----+--***-----++
      0           10         20          30         40         50          60
                              Config Retrieval Time

So this is pretty serviceable for showing this data on the console! It wouldn’t scale to many lines but for just visualizing some arbitrary series of numbers it’s quite nice. Here’s the GNU Plot script that made the text graph:

set title "Information about Puppet managed resources"
set terminal dumb 78 24
set key off
set ylabel "Nodes"
set xlabel "Config Retrieval Time"
plot '-' with lines
3 6
6 6
9 3
11 2
14 4
17 0
20 0
22 0
25 0
28 0
30 1
33 0
36 0
38 2
41 0
44 0
46 2
49 1
52 0
54 0
57 1

The magic here comes from the second line, which sets the output terminal to dumb and supplies some dimensions. Very handy, worth exploring some more and adding to your toolset for the CLI. I’ll look at writing a gem or something that supports both these modes.
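
Generating a script like the one above from arbitrary data is straightforward. Here is a minimal Python sketch under a few assumptions: the function names `gnuplot_script` and `render` are hypothetical, the data points are just the first few from the script above, and a lone `e` line is appended because that is how gnuplot marks the end of inline `'-'` data.

```python
import shutil
import subprocess


def gnuplot_script(points, title, xlabel, ylabel, width=78, height=24):
    """Build a gnuplot script that draws `points` as text on a dumb terminal."""
    lines = [
        f'set title "{title}"',
        f"set terminal dumb {width} {height}",
        "set key off",
        f'set ylabel "{ylabel}"',
        f'set xlabel "{xlabel}"',
        "plot '-' with lines",
    ]
    lines += [f"{x} {y}" for x, y in points]
    lines.append("e")  # a lone "e" ends the inline data block
    return "\n".join(lines) + "\n"


def render(script):
    """Pipe the script to gnuplot if it is installed; return its text output or None."""
    if shutil.which("gnuplot") is None:
        return None
    result = subprocess.run(
        ["gnuplot"], input=script, capture_output=True, text=True, check=True
    )
    return result.stdout


points = [(3, 6), (6, 6), (9, 3), (11, 2), (14, 4), (17, 0)]
script = gnuplot_script(
    points,
    "Information about Puppet managed resources",
    "Config Retrieval Time",
    "Nodes",
)
print(script)
```

Calling `render(script)` on a machine with gnuplot installed should return the text graph as a string, ready to print to the terminal.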

There are a few other players in this space, I definitely recall coming across a Python tool to do graphs but cannot find it now, shout out in the comments if you know other approaches and I’ll add them to the post!

Updated: some links to related projects: sparkler, Graphite Spark