NOTE: This is a static archive of an old blog, no interactions like search or categories are current.

This is a post in a series of posts I am doing about MCollective 2.0 and later.

In my previous post I covered a new syntax for composing discovery queries and, right at the end, touched on a data plugin system. Today I’ll cover those plugins in detail and show you how to write and use one.

Usage and Overview

These plugins allow you to query any data available on your nodes. Examples might be stat() information for a file, sysctl settings or Augeas matches. Really anything on your managed nodes that you could interact with from Ruby can be used as discovery data. You can write your own plugins and distribute them, and we ship a few with MCollective.

I’ll jump right in with an example of using these plugins:

$ mco service restart httpd -S "/apache/ and fstat('/etc/rsyslog.conf').md5 = /51b08b8/"

Here we’re using the -S discovery statement so we have full boolean matching. We match machines with the apache class applied and then do a regular expression match over the MD5 string of the /etc/rsyslog.conf file. Any machines with both conditions met are discovered and apache is restarted.
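To make the second half of that filter concrete, here is a rough Ruby sketch of the kind of check it describes: hash a file, then apply a regular expression to the hex digest. The helper name md5_matches? and the temporary file are made up for illustration; this is not the fstat plugin's actual implementation.

```ruby
require "digest/md5"
require "tempfile"

# Hypothetical helper (not part of MCollective): true when the MD5 hex
# digest of the file at +path+ matches +pattern+.
def md5_matches?(path, pattern)
  return false unless File.file?(path)
  !!(Digest::MD5.file(path).hexdigest =~ pattern)
end

# Stand-in for a config file so the example runs anywhere.
file = Tempfile.new("rsyslog.conf")
file.write("*.info /var/log/messages\n")
file.flush

# Build a pattern from the first 7 characters of the real digest, the same
# shape as the /51b08b8/ pattern used in the discovery query.
digest = Digest::MD5.file(file.path).hexdigest
prefix = Regexp.new(Regexp.escape(digest[0, 7]))

puts md5_matches?(file.path, prefix)          # pattern taken from the digest itself
puts md5_matches?(file.path, /\Aimpossible/)  # a hex digest cannot start with "i"
```

Because the match is an unanchored regular expression, a short fragment of the digest is enough, which is why the query above only needs a handful of characters.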

The fstat plugin ships with MCollective 2.1.0 and newer, ready to use. We can have a look at our available plugins:

$ mco plugin doc
Data Queries:
  agent           Meta data about installed MColletive Agents
  augeas_match    Augeas match lookups
  fstat           Retrieve file stat data for a given file
  resource        Information about Puppet managed resources
  sysctl          Retrieve values for a given sysctl

And we can get information about one of these plugins. Let’s look at the agent one:

$ mco plugin doc agent
Meta data about installed MColletive Agents
      Author: R.I.Pienaar <>
     Version: 1.0
     License: ASL 2.0
     Timeout: 1
   Home Page:
              Description: Valid agent name
                   Prompt: Agent Name
                     Type: string
               Validation: (?-mix:^[\w\_]+$)
                   Length: 20
              Description: Agent author
               Display As: Author
              Description: Agent description
               Display As: Description
              Description: Agent license
               Display As: License
              Description: Agent timeout
               Display As: Timeout
              Description: Agent url
               Display As: Url
              Description: Agent version
               Display As: Version

This shows what query this plugin expects and what data it returns, so we can use this to discover all machines with version 1.6 of a specific MCollective agent:

$ mco find -S "agent('puppetd').version = 1.6"

And if you’re curious what exactly a plugin would return you can quickly find out using the rpcutil agent:

% mco rpc rpcutil get_data query=puppetd source=agent                                
         agent: puppetd
        author: R.I.Pienaar
   description: Run puppet agent, get its status, and enable/disable it
       license: Apache License 2.0
       timeout: 20
       version: 1.6

Writing your own plugin

Let’s look at writing a plugin. We’re going to write one that can query a Linux sysctl value and let you discover against that. We’ll want this plugin to activate only on machines where /sbin/sysctl exists.

When we’re done we want to be able to do discovery like:

% mco service restart iptables -S "sysctl('net.ipv4.conf.all.forwarding').value=1"

This restarts iptables on all machines with that specific sysctl enabled. Additionally we’d be able to use this plugin in any of our agents:

action "query" do
   reply[:value] = Data.sysctl(request[:sysctl_name]).value
end

So these plugins really are nicely contained reusable bits of data retrieval logic shareable between discovery, agents and clients.

This is the code for our plugin:

module MCollective; module Data
  class Sysctl_data<Base
    activate_when { File.exist?("/sbin/sysctl") }

    query do |sysctl|
      out = %x{/sbin/sysctl #{sysctl}}

      if $?.exitstatus == 0
        value = out.chomp.split(/\s*=\s*/)[1]  # text after "name = "

        if value
          value = Integer(value) if value =~ /^\d+$/       # whole numbers
          value = Float(value) if value =~ /^\d+\.\d+$/    # decimals
        end

        result[:value] = value
      end
    end
  end
end; end

These plugins have to be called Something_data and they go in the libdir as data/something_data.rb.

On line 3 we use the activate_when helper to ensure we don’t enable this plugin on machines without sysctl. This is the same confinement system you might have seen in Agents.

In the query block we run the sysctl command and do some quick and dirty parsing of the result, ensuring we return Integers and Floats so that numeric comparison works fine on the CLI.
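If you want to poke at the parsing on its own, the following is a minimal standalone sketch of the same idea that runs without MCollective or /sbin/sysctl. The method name parse_sysctl_line and the vm.example_ratio key are made up for illustration. One deliberate difference from the plugin: it passes base 10 to Integer so a value with leading zeros is not read as octal.

```ruby
# Hypothetical standalone version of the parsing step. Takes one line of
# sysctl output ("name = value") and returns the value as an Integer,
# Float or String, or nil when the line has no "=" part.
def parse_sysctl_line(line)
  value = line.chomp.split(/\s*=\s*/)[1]
  return nil if value.nil?

  return Integer(value, 10) if value =~ /\A\d+\z/       # whole numbers
  return Float(value)       if value =~ /\A\d+\.\d+\z/  # simple decimals
  value                                                # anything else stays a String
end

p parse_sysctl_line("net.ipv4.conf.all.forwarding = 1\n")
p parse_sysctl_line("kernel.hostname = node1.example.com\n")
p parse_sysctl_line("vm.example_ratio = 0.5\n")
```

Returning real numbers rather than strings is what lets a discovery filter compare the value numerically instead of as text.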

You’d think we need to do some input validation here to avoid bogus data or shell injection, but below you will see that the DDL defines validation, and MCollective will validate the input for you prior to invoking your code. This validation happens on both the server and the client. DDL files also help us generate the documentation you saw above, native OS packages and in some cases command line completion and web UI generation.

The DDL for this plugin would be:

metadata    :name        => "Sysctl values",
            :description => "Retrieve values for a given sysctl",
            :author      => "R.I.Pienaar <>",
            :license     => "ASL 2.0",
            :version     => "1.0",
            :url         => "",
            :timeout     => 1
dataquery :description => "Sysctl values" do
    input :query,
          :prompt      => "Variable Name",
          :description => "Valid Variable Name",
          :type        => :string,
          :validation  => /\A[\w\-\.]+\z/,
          :maxlength   => 120
    output :value,
           :description => "Kernel Parameter Value",
           :display_as  => "Value"
end

This stuff is pretty normal: anyone who has written an MCollective agent will have seen these, and the input, output and metadata formats are identical. The timeout is quite important. If your plugin is doing something like talking to Augeas, set this timeout to a longer period; when doing discovery, the client will wait an appropriate period of time based on these timeouts.

With the DDL deployed to both the server and the client you can be sure people won’t be sending you nasty shell injection attacks, and if someone accidentally tries to access a non-existent return value they’d get an error before any traffic is sent over the network.
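To see why the :validation and :maxlength settings in the DDL above shut out shell metacharacters, here is a small illustrative check that mirrors them in plain Ruby. MCollective does its own validation; the helper name valid_sysctl_query? and the constants are hypothetical and only there to show which inputs would pass.

```ruby
# Illustrative only: mirrors the :validation and :maxlength values from
# the DDL. Not MCollective's validator.
SYSCTL_NAME = /\A[\w\-\.]+\z/
MAX_LENGTH  = 120

def valid_sysctl_query?(input)
  input.is_a?(String) && input.length <= MAX_LENGTH && !!(input =~ SYSCTL_NAME)
end

p valid_sysctl_query?("net.ipv4.conf.all.forwarding")        # plain sysctl name
p valid_sysctl_query?("net.ipv4.ip_forward; rm -rf /tmp/x")  # contains spaces, ";" and "/"
p valid_sysctl_query?("a" * 121)                             # longer than :maxlength
```

The character class only admits word characters, dashes and dots, and the \A and \z anchors cover the whole string, so nothing a shell would treat specially can reach the %x{} call in the plugin.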

You’re now ready to package up this plugin. We support creating RPMs and Debs of MCollective plugins:

% ls data
sysctl_data.ddl  sysctl_data.rb
% mco plugin package
Created package mcollective-sysctl-values-data
% ls -l
-rw-rw-r-- 1 rip rip 2705 Jun 30 10:05 mcollective-sysctl-values-data-1.0-1.noarch.rpm
% rpm -qip mcollective-sysctl-values-data-1.0-1.noarch.rpm
Name        : mcollective-sysctl-values-data  Relocations: (not relocatable)
Version     : 1.0                               Vendor: Puppet Labs
Release     : 1                             Build Date: Sat 30 Jun 2012 10:05:24 AM BST
Install Date: (not installed)               Build Host:
Group       : System Tools                  Source RPM: mcollective-sysctl-values-data-1.0-1.src.rpm
Size        : 1234                             License: ASL 2.0
Signature   : (none)
Packager    : R.I.Pienaar <>
URL         :
Summary     : Retrieve values for a given sysctl
Description :
Retrieve values for a given sysctl

Install this RPM on all your machines and you’re ready to use your plugin. The version and metadata such as author and license in the RPM come from the DDL file.


This is the second of a trio of new discovery features that massively revamped the capabilities of MCollective discovery.

Discovery used to be limited to CM Classes, Facts and Identities; now the possibilities are endless as far as data residing on the nodes goes. This is only available in the current development series, 2.1.x, but I hope this series will be short and we’ll get these features into the production supported code base soon.

In the next post I’ll cover discovering against arbitrary client side data – this was arbitrary server side data.