by R.I. Pienaar | Sep 14, 2010 | Uncategorized
NOTE: As of version 2.6.1 of Puppet this function is part of the core functionality provided by Puppet Labs.
I wrote a data store for Puppet called extlookup and blogged about it before.  With the release of Puppet 2.6.1 today, extlookup is now fully integrated upstream and the code is owned by Puppet Labs.
Bug reports and so forth should go to their bug tracker, and full documentation for the function now exists in the main project docs.
I’m very happy about this and look forward to YAML and JSON support being added in the near future.
I’ve also just tested Puppet 2.6.1 on a number of my machines; so far there are no show stoppers and the basics all still work with MCollective.  I’ll do some more thorough testing soon.
					 by R.I. Pienaar | Sep 13, 2010 | Uncategorized
I’ll be visiting the US later this month.  I’ll be in San Francisco on and off from the 24th of September to the 20th of October, and the rest of the time I’ll be in Portland.
I’ll be attending Puppet Camp and giving a talk about Marionette Collective for Puppet Users.
It would be excellent to meet a lot of the people I speak with on IRC and Twitter while I’m there.  If you’re a Puppet user in the area you must come to Puppet Camp; I’ve been to the EU one and can totally recommend it.
					 by R.I. Pienaar | Aug 28, 2010 | Uncategorized
Last night I had a bit of a mental dump on Twitter about structured and unstructured data when communicating with a cluster of servers.  Twitter fails at this kind of stuff, so I figured I’d follow up with a blog post.
I started off asking for a list of tools in the cluster admin space and got some great pointers which I am reproducing here:
fabric, cap, func, clusterssh, sshpt, pssh, massh, clustershell, controltier, rash (related), dsh, chef knife ssh, pdsh+dshbak and of course mcollective.  I was also sent a list of ssh related tools which is awesome.
The point I feel needs to be made is that in general these tools just run commands on remote servers.  They are not aware of the command’s output structure, what denotes pass or fail in the context of the command, and so on.  Basically the commands people run were designed ages ago to be looked at by human eyes and then parsed by a human mind.  Yes, they are easy to pipe and grep and chop up, but ultimately they were always designed to be run on one server at a time.
The parallel ssh’ers run these commands in parallel and you tend to get a mash of output.  The output is mixed STDOUT and STDERR, and often output from different machines is multiplexed together so you get a stream of text that is hard to decipher even on 2 machines, not to mention 200 at once.
Take as an example a simple yum command to install a package:
% yum install zsh
Loaded plugins: fastestmirror, priorities, protectbase, security
Loading mirror speeds from cached hostfile
372 packages excluded due to repository priority protections
0 packages excluded due to repository protections
Setting up Install Process
Package zsh-4.2.6-3.el5.i386 already installed and latest version
Nothing to do
 
When run on one machine you pretty much immediately know what’s going on: the package was already there so nothing got done.  Now let’s see cap invoke:
# cap invoke COMMAND="yum -y install zsh"
  * executing `invoke'
  * executing "yum -y install zsh"
    servers: ["web1", "web2", "web3"]
    [web2] executing command
    [web1] executing command
    [web3] executing command
 ** [out :: web2] Loaded plugins: fastestmirror, priorities, protectbase, security
 ** [out :: web2] Loading mirror speeds from cached hostfile
 ** [out :: web3] Loaded plugins: fastestmirror, priorities, protectbase
 ** [out :: web3] Loading mirror speeds from cached hostfile
 ** [out :: web3] 495 packages excluded due to repository priority protections
 ** [out :: web2] 495 packages excluded due to repository priority protections
 ** [out :: web3] 0 packages excluded due to repository protections
 ** [out :: web3] Setting up Install Process
 ** [out :: web2] 0 packages excluded due to repository protections
 ** [out :: web2] Setting up Install Process
 ** [out :: web1] Loaded plugins: fastestmirror, priorities, protectbase
 ** [out :: web3] Package zsh-4.2.6-3.el5.x86_64 already installed and latest version
 ** [out :: web3] Nothing to do
 ** [out :: web1] Loading mirror speeds from cached hostfile
 ** [out :: web1] Install       1 Package(s)
 ** [out :: web2] Package zsh-4.2.6-3.el5.x86_64 already installed and latest version
 ** [out :: web2] Nothing to do
 ** [out :: web1] 548 packages excluded due to repository priority protections
 ** [out :: web1] 0 packages excluded due to repository protections
 ** [out :: web1] Setting up Install Process
 ** [out :: web1] Resolving Dependencies
 ** [out :: web1] --> Running transaction check
 ** [out :: web1] ---> Package zsh.x86_64 0:4.2.6-3.el5 set to be updated
 ** [out :: web1] --> Finished Dependency Resolution
 ** [out :: web1]
 ** [out :: web1] Dependencies Resolved
 ** [out :: web1]
 ** [out :: web1] ================================================================================
 ** [out :: web1] Package      Arch            Version                Repository            Size
 ** [out :: web1] ================================================================================
 ** [out :: web1] Installing:
 ** [out :: web1] zsh          x86_64          4.2.6-3.el5            centos-base          1.7 M
 ** [out :: web1]
 ** [out :: web1] Transaction Summary
 ** [out :: web1] ================================================================================
 ** [out :: web1] Install       1 Package(s)
 ** [out :: web1] Upgrade       0 Package(s)
 ** [out :: web1]
 ** [out :: web1] Total download size: 1.7 M
 ** [out :: web1] Downloading Packages:
 ** [out :: web1] Running rpm_check_debug
 ** [out :: web1] Running Transaction Test
 ** [out :: web1] Finished Transaction Test
 ** [out :: web1] Transaction Test Succeeded
 ** [out :: web1] Running Transaction
 ** [out :: web1] Installing     : zsh                                                      1/1
 ** [out :: web1]
 ** [out :: web1]
 ** [out :: web1] Installed:
 ** [out :: web1] zsh.x86_64 0:4.2.6-3.el5
 ** [out :: web1]
 ** [out :: web1] Complete!
    command finished
zlib(finalizer): the stream was freed prematurely.
zlib(finalizer): the stream was freed prematurely.
zlib(finalizer): the stream was freed prematurely.
 
Most of this stuff scrolled off my screen and at the end all I had was the last bit of output.  I could scroll up and still figure out what was going on: 2 of the 3 already had it installed, one got it.  Now imagine the output of 100 or 500 of these machines all mixed together.  Just parsing this output would be prone to human error and you’re likely to miss that something failed.
So here is my point: your cluster management tool needs to provide an API around everyday commands like package management, process listing etc.  It should return structured data, and you could use that structured data to create tools better suited to large numbers of machines.  Because the output is standardized, it should provide generic tools that just do the right thing out of the box for you.
With the package example above, knowing that all 500 machines spewed out a bunch of stuff while installing isn’t important; you just want to know the result in a nice way.  Here’s what mcollective does:
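To make the problem concrete, here is a rough Ruby sketch that pulls the interleaved cap output back apart per host.  The line format is copied from the output above; the method name and the sample lines are mine and purely illustrative.  Note that even after de-interleaving you only have free text per host, and still nothing that says pass or fail.

```ruby
# Group Capistrano-style " ** [out :: host] text" lines by host.
# The pattern follows the output shown above; group_by_host is a
# hypothetical helper, not part of cap or any other tool.
CAP_LINE = /\A\s*\*\*\s+\[out :: (?<host>[^\]]+)\]\s?(?<text>.*)\z/

def group_by_host(lines)
  grouped = Hash.new { |hash, host| hash[host] = [] }
  lines.each do |line|
    match = CAP_LINE.match(line.chomp)
    next unless match # skip lines that carry no host tag
    grouped[match[:host]] << match[:text]
  end
  grouped
end

sample = [
  " ** [out :: web2] Setting up Install Process",
  " ** [out :: web1] Loaded plugins: fastestmirror, priorities, protectbase",
  " ** [out :: web3] Nothing to do",
  " ** [out :: web2] Nothing to do",
  "    command finished"
]

# Print the last line seen from each host.
group_by_host(sample).each do |host, text|
  puts "#{host}: #{text.last}"
end
```

A human still has to read "Nothing to do" or "Complete!" and decide what it means, which is exactly the part that does not scale.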
$ mc-package install zsh
 * [ ============================================================> ] 3 / 3
web2.my.net                      version = zsh-4.2.6-3.el5
web3.my.net                      version = zsh-4.2.6-3.el5
web1.my.net                      version = zsh-4.2.6-3.el5
---- package agent summary ----
           Nodes: 3 / 3
        Versions: 3 * 4.2.6-3.el5
    Elapsed Time: 16.33 s
 
In the case of a package you just want to know the version after the event and a summary of status.  Just by looking at the stats I know the desired result was achieved; if I had different versions listed I could very quickly identify the problem ones.
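As a rough sketch of why structured replies make that summary trivial, assume each reply is a Ruby hash with :sender and :data keys and that the package agent reports a :version field.  The field name, the helper names and the odd 4.2.5-1.el5 version are made up for illustration; this is not the actual mc-package code.

```ruby
# Count how many nodes report each version.
# Assumes replies look like {:sender => "...", :data => {:version => "..."}}.
def summarize_versions(replies)
  counts = Hash.new(0)
  replies.each { |reply| counts[reply[:data][:version]] += 1 }
  counts
end

# List the senders whose version differs from the most common one.
def odd_ones_out(replies)
  counts = summarize_versions(replies)
  majority, _count = counts.max_by { |_version, count| count }
  replies.reject { |reply| reply[:data][:version] == majority }
         .map { |reply| reply[:sender] }
end

# Sample data; the third version is invented to show a mismatch.
replies = [
  { :sender => "web1.my.net", :data => { :version => "4.2.6-3.el5" } },
  { :sender => "web2.my.net", :data => { :version => "4.2.6-3.el5" } },
  { :sender => "web3.my.net", :data => { :version => "4.2.5-1.el5" } }
]

summarize_versions(replies).each { |version, count| puts "#{count} * #{version}" }
puts "check: #{odd_ones_out(replies).join(', ')}"
```

The same two helpers work unchanged for 3 or 500 replies, which is the whole point of returning data instead of text.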
Here’s another example – NRPE this time:
% mc-rpc nrpe runcommand command=check_disks
 * [ ============================================================> ] 47 / 47
dev1.my.net                      Request Aborted
   CRITICAL
          Exit Code: 2
   Performance Data:  /=4111MB;3706;3924;0;4361 /boot=26MB;83;88;0;98 /dev/shm=0MB;217;230;0;256
             Output: DISK CRITICAL - free space: / 24 MB (0% inode=86%);
Finished processing 47 / 47 hosts in 766.11 ms
 
Notice that I didn’t use an NRPE-specific mc- command here.  I just used the generic rpc caller, and the caller knows that I am only interested in seeing the results of machines that are in WARNING or CRITICAL state.  If you run this on your console you’d see the ‘Request Aborted’ in red and the ‘CRITICAL’ in yellow, immediately pulling your eye to the important information.  Also note how the result shows human friendly field names like ‘Performance Data’.
The formatting, highlighting, knowledge to only show failing resources and human friendly headings all happen automatically.  No programming of client side UI is required; you get this for free simply because mcollective focuses on putting structure around outputs.
Here’s the earlier package install example with the standard rpc caller rather than a specialized package frontend:
% mc-rpc package install package=zsh
Determining the amount of hosts matching filter for 2 seconds .... 47
 * [ ============================================================> ] 47 / 47
Finished processing 47 / 47 hosts in 2346.05 ms
 
Everything worked: all 47 machines have the package installed and your desired action was taken.  So there’s no point in spamming you with pages of junk; who cares to see all the Yum output?  Had an install failed you’d have had a usable error message just for the host that failed.  The output would be equally usable on one or a thousand hosts, with very little margin for human error in knowing the result of your request.
This happens because mcollective has a standard structure for responses.  Each response has an absolute success value that tells you if the request failed or not, and by using this you can get generic CLI, Web and other tools that display large amounts of data from a network of hosts in a way that is appropriate and context aware.
For reference here’s the response as received on the client:
{:sender=>"dev1.my.net",
 :statuscode=>1,
 :statusmsg=>"CRITICAL",
 :data=>
  {:perfdata=>
    " /=4111MB;3706;3924;0;4361 /boot=26MB;83;88;0;98 /dev/shm=0MB;217;230;0;256",
   :output=>"DISK CRITICAL - free space: / 24 MB (0% inode=86%);",
   :exitcode=>2}}
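Given replies shaped like the one above, a generic display layer can be sketched in a few lines of Ruby.  This is my own illustration and not the mc-rpc source: it assumes a :statuscode of 0 means success, and the LABELS table and the second (OK) sample reply are invented for the example.

```ruby
# Hypothetical mapping from data keys to display names. In a real tool
# this would come from a description of the agent, not the client code.
LABELS = {
  :exitcode => "Exit Code",
  :perfdata => "Performance Data",
  :output   => "Output"
}

# Keep only replies that did not succeed (assumes 0 means success).
def failed(replies)
  replies.reject { |reply| reply[:statuscode] == 0 }
end

# Render one reply using human friendly field names where known.
def render(reply, labels = LABELS)
  lines = ["#{reply[:sender]}  #{reply[:statusmsg]}"]
  reply[:data].each do |key, value|
    lines << format("%20s: %s", labels.fetch(key, key.to_s), value)
  end
  lines.join("\n")
end

replies = [
  { :sender => "dev1.my.net", :statuscode => 1, :statusmsg => "CRITICAL",
    :data => { :exitcode => 2,
               :output => "DISK CRITICAL - free space: / 24 MB (0% inode=86%);" } },
  # Invented healthy reply, present only to show that it gets filtered out.
  { :sender => "dev2.my.net", :statuscode => 0, :statusmsg => "OK",
    :data => { :exitcode => 0, :output => "DISK OK" } }
]

failed(replies).each { |reply| puts render(reply) }
```

Nothing in failed or render knows about NRPE, which is why the same client code can serve any agent that sticks to the response structure.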
 
Only by thinking about CLI and admin tasks in this way do I believe we can take the Unix utilities that we call on remote hosts and turn them into something appropriate for large scale parallel use that doesn’t overwhelm the human at the other end with information.  Additionally, since this is a computer friendly API, it makes those tools usable in many other places like code deployers, for example to drive your continuous deployment through robust use of Unix tools via such an API.
There are many other advantages to this approach.  Requests are authorized at a very fine level and requests are audited.  API wrappers are versioned code that can be tested in development, which makes the margin for error much smaller than just running random Unix commands ad hoc.  Finally, whether you’re using the code ad hoc on the CLI as above or in your continuous deployer, you share the same code that you’ve already tested and trust.
					 by R.I. Pienaar | Aug 20, 2010 | Uncategorized
I just released version 0.4.8 of mcollective.  It’s a small maintenance release fixing a few bugs and adding a few features.  I wasn’t planning on another 0.4.x release before the big 1.0.0, but I want to keep 1.0.0 as close as possible to something that’s been out there for a while.
The only major feature it introduces is custom reports of your infrastructure.
It supports two types of scriptlet for building reports.  The first is a little DSL that uses printf style format strings:
inventory do
    format "%s:\t\t%s\t\t%s"
 
    fields { [ identity, facts["serialnumber"], facts["productname"] ] }
end
 
Which does something like this:
$ mc-inventory --script hardware.mc
web1:           KKxxx1H         IBM eServer BladeCenter HS20 -[8832M1X]-
rep1:           KKxxx5Z         IBM eServer BladeCenter HS20 -[8832M1X]-
db4:            KDxxxZY         IBM System x3655 -[794334G]-
man2:           KDxxxR0         eserver xSeries 336 -[88372CY]-
db2:            KDxxxGD         IBM System x3655 -[79855AG]-
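If you are curious how a little DSL like this can work, here is a toy evaluator built on instance_eval.  It is NOT how mc-inventory is implemented; the InventoryReport and NodeContext classes and the shape of the node hashes are my own assumptions, made only to show the technique.

```ruby
# Toy evaluator for the printf-style report DSL (illustrative only).
class InventoryReport
  def initialize(&block)
    instance_eval(&block) # run the DSL block with this object as self
  end

  # DSL keyword: remember the printf-style format string.
  def format(string)
    @format = string
  end

  # DSL keyword: remember the block that returns the values for one node.
  def fields(&block)
    @fields = block
  end

  # Produce one formatted line per node.
  def render(nodes)
    nodes.map do |node|
      values = NodeContext.new(node).instance_eval(&@fields)
      Kernel.format(@format, *values)
    end
  end
end

# Exposes `identity` and `facts` to the fields block for a single node.
class NodeContext
  attr_reader :identity, :facts

  def initialize(node)
    @identity = node[:identity]
    @facts = node[:facts]
  end
end

def inventory(&block)
  InventoryReport.new(&block)
end

report = inventory do
  format "%s:\t\t%s\t\t%s"
  fields { [ identity, facts["serialnumber"], facts["productname"] ] }
end

nodes = [
  { :identity => "web1",
    :facts => { "serialnumber" => "KKxxx1H",
                "productname" => "IBM eServer BladeCenter HS20 -[8832M1X]-" } }
]

puts report.render(nodes)
```

The fields block is evaluated once per node, so identity and facts always refer to the node currently being printed.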
 
The other, perhaps uglier, option is a Perl-like format method.  To use this you need the formatr gem installed, and a report might look like this:
formatted_inventory do
    page_length 20

    page_heading <<TOP

            Node Report @<<<<<<<<<<<<<<<<<<<<<<<<<
                        time

Hostname:         Customer:     Distribution:
-------------------------------------------------------------------------
TOP

    page_body <<BODY

@<<<<<<<<<<<<<<<< @<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
identity,    facts["customer"], facts["lsbdistdescription"]
                                @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                facts["processor0"]
BODY
end
 
And the resulting report is something like this:
$ mc-inventory --script hardware.mc
            Node Report Fri Aug 20 21:49:39 +0100

Hostname:         Customer:     Distribution:
-------------------------------------------------------------------------

web1              rip           CentOS release 5.5 (Final)
                                Intel(R) Xeon(R) CPU           L5420

web2              xxxxxxx       CentOS release 5.5 (Final)
                                Intel(R) Xeon(R) CPU           X3430
 
The report will be paged 20 nodes per page.  The result is very pleasing even if the report format is a bit grim, but it would be much worse to write yet another reporting DSL!
See the full release notes for details on bug fixes and other features.
					 by R.I. Pienaar | Aug 7, 2010 | Code
I often get asked about MCollective and other programming languages.  Thus far we only support Ruby, but my hope is that in time we’ll be able to be more generic.
Initially I had a few requirements for serialization:
- It must retain data types
- Encoding the same data – like a hash – twice should give the same result from the point of view of md5()
That was about it really.  This was while we used a pre-shared key to validate requests and so the result of the encode and decode should be the same on the sender as on the receiver.  With YAML this was never the case so I used Marshal.
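To illustrate the two requirements, here is a minimal Ruby sketch of a pre-shared key check over a Marshal encoded body.  The PSK value, the helper names and the exact way the key is mixed into the digest are assumptions for the example and not the actual security plugin code.

```ruby
require "digest/md5"

# Hypothetical pre-shared key, for illustration only.
PSK = "not-a-real-key"

# Serialize the request; Marshal keeps Ruby types such as symbols intact.
def encode(body)
  Marshal.dump(body)
end

# Sign the encoded bytes with the shared key (assumed scheme: md5 of body + key).
def sign(encoded, psk = PSK)
  Digest::MD5.hexdigest(encoded + psk)
end

# The receiver recomputes the digest over the bytes it received.
def valid?(encoded, signature, psk = PSK)
  sign(encoded, psk) == signature
end

request = { :agent => "package", :action => "install", :ttl => 60 }
wire = encode(request)
signature = sign(wire)

decoded = Marshal.load(wire)
puts decoded == request             # types and values survive the round trip
puts valid?(wire, signature)        # untouched message validates
puts valid?(wire + "x", signature)  # altered message does not
```

The check only works if both ends produce byte-identical encodings of the same data, which is why the second requirement mattered so much while the pre-shared key was the only validation.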
We recently had an SSL-based security plugin contributed that relaxed the 2nd requirement, so we can go back to using YAML.  We could in theory relax the 1st requirement too, but it would inhibit the kind of tools you can build with MCollective quite a bit, so I’d strongly suggest this is a must have.
Today there are very few cross language serializers that let you just deal with arbitrary data; YAML is one that seems to have a lot of language support.  Prior to version 1.0.0 of MCollective the SSL security system only supported Marshal, but we’ll support YAML in addition to Marshal in 1.0.0.
This enabled me to write a Perl client that speaks to your standard Ruby collective (if it runs this new plugin).
You can see the Perl client here.  The Perl code is roughly a mc-find-hosts written in Perl and without command line options for filtering – though you can just adjust the filters in the code.  It’s been years since I wrote any Perl so that’s just the first thing that worked for me.  
The point is that someone should be able to take any language that has the Syck YAML bindings and write a client library to talk with MCollective.  I tried the non-Syck bindings in PHP and they were unusable; I suspect the PHP Syck bindings will work better but I didn’t try them.
As mentioned on the user list, post 1.0.0 I intend to focus on long running and scheduled requests.  I’ll then also work on some kind of interface between MCollective and agents written in other languages, since that is more or less how long running scheduled tasks would work anyway.  This will then use the Ruby code as a transport, hooking clients and agents in different languages together.
I can see that I’ll enable this but I am very unlikely to write the clients myself.  I am therefore keen to speak to community members who want to speak to MCollective from languages like Python and who have some time to work on this.