Test Driven Deployment – mcollective, puppet, cucumber

11/06/2009

With the recent release of mcollective I’ve been able to work a bit on a deployment problem I’ve had at a client. I built the following by combining mcollective, cucumber and the open source mcollective plugins.

Exploring cucumber is of course a result of @auxesis’s brilliant cucumber talk at Devops Days recently.

Note: I’ve updated this since the initial posting to show how I do filtering with mcollective discovery, and to put it all into one scenario.

Feature: Update the production systems
 
    Background:
        Given the load balancer has ip address 192.168.1.1
        And I want to update hosts with class roles::dev_server
        And I want to update hosts with fact country=de 
        And I want to pre-discover how many hosts to update
 
    Scenario: Update the website
        When I block the load balancer
        Then traffic from the load balancer should be blocked
 
        When I update the package mywebapp
        Then the package version for mywebapp should be 4.2.6-3.el5
 
        When I unblock the load balancer
        Then traffic from the load balancer should be unblocked

This works just like any other test driven, scenario based system. If it fails to block the load balancer at the firewall, the deploy will bail out. If it fails to update the package it will bail out as well, and only if both of those worked will it unblock the firewall.
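The bail-out behaviour can be sketched in plain Ruby, the language cucumber step definitions are written in. This is only an illustration under my own assumptions: `run_deploy` and the step hashes are hypothetical names, and the lambdas stand in for the real mcollective calls.

```ruby
# Hypothetical sketch: each step pairs an action (the "When")
# with a check of its outcome (the "Then").
# Runs the steps in order and stops at the first failed check.
# Returns the names of the steps that passed.
def run_deploy(steps)
  passed = []
  steps.each do |step|
    step[:action].call
    break unless step[:check].call
    passed << step[:name]
  end
  passed
end

# In-memory stand-in for the real systems. The package update is a
# no-op here, so its check fails and the unblock step never runs.
state = { blocked: false, version: "old" }
steps = [
  { name: "block",   action: -> { state[:blocked] = true },
    check: -> { state[:blocked] } },
  { name: "update",  action: -> {},
    check: -> { state[:version] == "4.2.6-3.el5" } },
  { name: "unblock", action: -> { state[:blocked] = false },
    check: -> { !state[:blocked] } }
]
p run_deploy(steps)
```

In this run only the block step passes, so the load balancer stays blocked until someone looks at the failed update.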

Thanks to mcollective this runs distributed and in parallel across large numbers of machines. I can also apply filters to update just certain clusters using mcollective’s discovery features.
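To make the filtering concrete, here is a small in-memory model of what the class and fact filters in the Background do. It is not mcollective’s API: the `discover` helper, the node hashes, the host names and the `roles::db_server` class are all made up for the example. The one thing it mirrors is that a node has to match every filter to be selected.

```ruby
# Hypothetical model: a node is a hash with a name, its classes and
# its facts. Only nodes matching both the class and the fact are kept.
def discover(nodes, klass:, fact:)
  key, value = fact.split("=", 2)
  nodes.select do |n|
    n[:classes].include?(klass) && n[:facts][key] == value
  end
end

nodes = [
  { name: "web1", classes: ["roles::dev_server"], facts: { "country" => "de" } },
  { name: "web2", classes: ["roles::dev_server"], facts: { "country" => "uk" } },
  { name: "db1",  classes: ["roles::db_server"],  facts: { "country" => "de" } }
]

targets = discover(nodes, klass: "roles::dev_server", fact: "country=de")
puts targets.map { |n| n[:name] }.inspect  # ["web1"]
```

The size of `targets` is the number the pre-discover step would remember, so later steps know how many hosts should reply.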

The outcome of every step is tested, and cucumber will only show the all clear when everything worked on all machines in a consistent way.
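A rough sketch of what "consistent on all machines" could mean for the package step, assuming the replies arrive as a hash of host name to reported version. The `all_clear?` helper and the host names are hypothetical.

```ruby
# All clear only if every pre-discovered host replied and every
# reply reports the wanted version.
def all_clear?(expected_count, replies, wanted)
  replies.size == expected_count &&
    replies.values.all? { |version| version == wanted }
end

replies = { "web1" => "4.2.6-3.el5", "web2" => "4.2.6-3.el5" }
puts all_clear?(2, replies, "4.2.6-3.el5")  # true
puts all_clear?(3, replies, "4.2.6-3.el5")  # false, one host never replied
```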

This is made possible in part because the mcollective plugins use the Puppet providers under the hood, so package and service actions are completely idempotent and repeatable. I can rerun this script 100 times and it will do the same thing.
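Idempotent here means you describe the state you want and the action only changes something when the current state differs. A minimal sketch of that idea, with a hypothetical `ensure_version` helper and a plain hash standing in for a host (this is not how the Puppet providers are implemented):

```ruby
# Only act when the host is not already in the wanted state.
def ensure_version(host, wanted)
  return :unchanged if host[:version] == wanted
  host[:version] = wanted
  :updated
end

host = { version: "old" }
p ensure_version(host, "4.2.6-3.el5")  # :updated
p ensure_version(host, "4.2.6-3.el5")  # :unchanged
```

However many times it runs after the first, the host ends up in the same state.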

I have other steps that I left out here to keep things simple. In the real world I would restart the webserver after the update, then call NRPE plugins on all the nodes to make sure their load averages are within acceptable ranges before the firewall gets opened to let the load balancer back in.
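The load average gate could look something like this, assuming the NRPE results come back as a hash of host name to plugin exit code. Nagios plugins exit with 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN; the `safe_to_unblock?` helper and the host names are hypothetical.

```ruby
# Only open the firewall when at least one node replied and every
# node's check exited with 0 (OK).
def safe_to_unblock?(exit_codes)
  !exit_codes.empty? && exit_codes.values.all?(&:zero?)
end

puts safe_to_unblock?({ "web1" => 0, "web2" => 0 })  # true
puts safe_to_unblock?({ "web1" => 0, "web2" => 2 })  # false, web2 is CRITICAL
```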

This opens up a whole lot of interesting ideas, kudos to @auxesis and his great talk at devopsdays!