Using ruby mocha outside of unit testing frameworks


I find myself currently writing a lot of orchestration code that manages hardware. This is quite difficult because I like doing little test.rb scripts, or testing things out in irb or pry, to see if APIs are comfortable to use.

The problem with hardware is that in order to properly populate my objects I need to query things like the iDRACs or gather inventories from all my switches to figure out where a piece of hardware is, and this takes a lot of time and requires constant access to my entire lab.

Of course my code has unit tests, so all the objects that represent servers, switches and so forth are already designed to be somewhat comfortable to load stub data into and to be easy to mock. So I ended up using rspec as my test.rb environment of choice.

I figured there had to be a way to use mocha in a non-rspec environment, and it turns out there is and it’s quite easy.

The magic is the require of mocha/api and the include of Mocha::API: the include extends Object and Class with all the stubbing and mocking methods. I’d avoid using expectations and instead use stubs in this scenario.

At this point I’d be dropped into a pry shell loaded up with the service fixture in my working directory, where I can quickly interact with my faked-up hardware environment. I have many such hardware captures for each combination of hardware I support – and the same data is used in my unit tests too.

Below I iterate over all the servers, find the MAC addresses of the primary interfaces in each partition, and then find all the switches they are connected to. Behind the scenes, in real life, this would walk all my switches looking for the port each MAC is connected to and so forth: quite a time-consuming operation that would require dedicating the lab hardware to me. Now I can just snapshot the hardware, load up my models later, and it’s really quick.
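Mocha plumbing aside, the traversal itself is roughly this; a sketch using plain Ruby structs as stand-ins for the stubbed models (all names and addresses here are made up):

```ruby
# Plain-Ruby stand-ins for the stubbed hardware models
Server    = Struct.new(:hostname, :partitions)
Partition = Struct.new(:name, :primary_mac)
Switch    = Struct.new(:name, :macs)

servers = [
  Server.new("web1", [Partition.new("part0", "aa:aa:aa:aa:aa:01")]),
  Server.new("web2", [Partition.new("part0", "aa:aa:aa:aa:aa:02")])
]

switches = [
  Switch.new("sw0", ["aa:aa:aa:aa:aa:01"]),
  Switch.new("sw1", ["aa:aa:aa:aa:aa:02"])
]

# In real life this would walk every switch looking for each MAC;
# here it is just a lookup against the snapshotted data.
servers.each do |server|
  server.partitions.each do |partition|
    connected = switches.select { |sw| sw.macs.include?(partition.primary_mac) }
    puts "#{server.hostname}/#{partition.name} #{partition.primary_mac} -> #{connected.map(&:name).join(',')}"
  end
end
```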

I found this incredibly handy and will be using it pretty much all the time now, so thought it worth sharing :)

Translating Webhooks with AWS API Gateway and Lambda


Webhooks are great and many services now support them, but I found actually doing anything with them a pain: there are no standards for what goes in them, and any 3rd party service you wish to integrate with has to support the particular hooks you are producing.

For instance, I want to use SignalFX for my metrics and events but they have very few integrations. A translator could take an incoming hook, turn it into a SignalFX event and pass it onward.

For a long time I’ve wanted to build a translator but never got around to it because I did not feel like self-hosting it and writing a whole bunch of supporting infrastructure. With the release of AWS API Gateway this has become quite easy and really convenient, as there are no infrastructure or instances to manage.

I’ll show a bit of a walk-through of how I built a translator that sends events to SignalFX. Note I do not do any kind of queueing or retrying on the gateway at present, so it’s lossy and best-effort.

AWS Lambda runs stateless functions on demand. At launch it only supported ingesting AWS’s own events, but the recently launched API Gateway lets you front it with a REST API of your own design, which makes things a lot easier.

For the rest of this post I assume you’re over the basic hurdles of signing up for AWS and are already familiar with the basics, so some stuff will be skipped but it’s not really that complex to get going.

The Code

To get going you need some JS code to handle the translation. Here’s a naive method to convert a GitHub push notification into a SignalFX event:

This will be the meat of the processing, and it includes a bit of code to create a request using the https module, which includes the SignalFX authentication header.
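The gist with the Node code isn’t reproduced here; as a rough sketch, the translation and request building amount to something like the following, written in Ruby for illustration (the ingest URL, payload shape and the X-SF-TOKEN header name are assumptions and may not match SignalFX’s actual API):

```ruby
require "json"
require "net/http"
require "uri"

# Naive translation of a GitHub push payload into a SignalFX-style event
def github_push_to_event(push)
  {
    "eventType"  => "push",
    "dimensions" => {
      "repository" => push["repository"]["full_name"],
      "branch"     => push["ref"].sub("refs/heads/", ""),
      "pusher"     => push["pusher"]["name"]
    }
  }
end

# Build (but do not send) the authenticated POST request
def signalfx_request(event, token)
  uri = URI("https://ingest.signalfx.com/v2/event") # assumed endpoint
  request = Net::HTTP::Post.new(uri)
  request["X-SF-TOKEN"]   = token # assumed SignalFX auth header
  request["Content-Type"] = "application/json"
  request.body = [event].to_json
  request
end

push = {"repository" => {"full_name" => "example/repo"},
        "ref"        => "refs/heads/master",
        "pusher"     => {"name" => "rip"}}
event   = github_push_to_event(push)
request = signalfx_request(event, "my-token")
puts request.body
```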

Note this attaches dimensions to the event being sent; you can think of them as key=value tags for the event. In the SignalFX UI I can then select events by these dimensions.

Any other added dimension can be used too. Events show up as little diamonds on graphs, so if I am graphing a service using these dimensions I can pick out events that relate to the branches and repositories that influence the data.

This is then called from the Lambda entry point.

There’s some stuff not shown here for brevity; it’s all on GitHub. The entry point is handleGitHubPushNotifications, which is the Lambda function that will be run. I can put many different functions in the same file and share one zip file across many Lambda functions; all I have to do is tell Lambda to run handleGitHubPushNotifications or handleOpsGeniePushNotifications etc., so this becomes a library of functions. See the next section for how.

Setting up the Lambda functions

We have to create a Lambda function. For now I’ll use the console, but you can use terraform for this and it helps quite a lot.

As this repo is made up of a few files, your only option is to zip it up. You’ll have to clone it and make your own config.js based on the sample before creating the zip file.

Once you have the zip, create a Lambda function, which I’ll call gitHubToSFX, and choose your zip file as the source. While setting it up you have to supply a handler; this is how Lambda finds your function to call.

In my case I specify index.handleGitHubPushNotifications, which uses the handleGitHubPushNotifications function found in index.js.


Once created you can test it right there if you have a sample GitHub commit message.

The REST End Point

Now we need to create somewhere for GitHub to send the POST request to. Gateway works with resources and methods. A resource is something like /github-hook and a method is POST.

I’ve created the resource and method, and told it to call the Lambda function.

You have to deploy your API: just hit the big Deploy API button and follow the steps. You can create stages like development, staging and production and deploy APIs through such a life cycle. I just went straight to prod.

Once deployed it gives you a URL like https://12344xnb.execute-api.eu-west-1.amazonaws.com/prod and your GitHub hook would be configured to hit https://12344xnb.execute-api.eu-west-1.amazonaws.com/prod/github-hook.


That’s about it, once you’ve configured GitHub you’ll start seeing events flow through.

Both Lambda and API Gateway can write logs to CloudWatch, and from the JS side you can do something like console.log("hello") and it will show up in the CloudWatch logs to help with debugging.

I hope to start gathering a lot of translations like these. I am still learning Node, so I’m not really sure yet how to make packages or classes, but so far this seems really easy to use.

Cost-wise it’s really cheap: $3.50 per million API calls received on the Gateway and $0.09/GB for transfer costs, but given the nature of these events this will be negligible. Lambda is free for the first 1 million requests and you’ll pay some tiny amount for the compute time used. Both are eligible for the free tier too, in case you’re new to AWS.

There are many advantages to this approach:

  • It’s very cheap as there are no instances to run, just the requests
  • Adding webhooks to many services is a clickfest hell. This gives me an API whose underlying logic I can change without updating GitHub etc.
  • Today I use SignalFX but its event feature is pretty limited; I can move all the events elsewhere on the backend without any API changes
  • I can use my own domain and SSL certs
  • As the REST API is pretty trivial I can later move it in-house if I need to, again without changing any 3rd parties – assuming I set up my own domain

I have 2 outstanding issues to address:

  • How to secure it; API Gateway supports headers as tokens, but this is not something webhooks tend to support
  • Monitoring it; I do not want some webhook sender to get into a loop and send hundreds of thousands of requests without anyone noticing

Shiny new things in Puppet 4


Puppet 4 has been out a while, but given the nature of the update – new packaging requiring new modules to manage it, etc. – I’ve been reluctant to upgrade and did not really have the time. Ditto for CentOS 7. But Docker will stop supporting CentOS 6 Soon Now, so this meant I had to look into both a bit closer.

Puppet 4 really is a whole new thing. It maintains backward compatibility, but in terms of actually using its features I think you’d be better off just starting fresh. I am moving the bulk of my services out of CM anyway, so my code base will be tiny and it’s not a big deal for me to throw it all out and start fresh.

I came across a few really interesting things among its many features and wanted to highlight them. This is by no means an exhaustive list, just a whirlwind tour of a few things I picked up on.

The Forge

Not really a Puppet 4 thing per se, more a general ecosystem comment. I have 23 modules in my freshly minted Puppet repo, with 13 of them coming from the forge. To my mind that is a really impressive figure; it makes the new-starter experience so much better.

Things I still do on my own: exim, iptables, motd, pki, roles/profiles of course and users.

In the case of exim I have almost no config; it’s just a package/config/service and all it does is set up a local config that talks to my smart relays. It does use my own CA though, and that’s why I also have my own PKI module to configure the CA and distribute certs and keys. The big one is iptables really, and I just haven’t had the time to consider a replacement – whatever I choose needs to play well with docker, and that’s probably going to be a tall order.

Anyway, big kudos to the forge team, and shout-outs to forge users puppetlabs, jfryman, saz and garethr.

There are still some things to fix – the puppet module tool is pretty grim wrt errors and feedback, and I think there’s work left to do on discoverability of good modules and on finding ways to encourage people to invest time in making better ones – but this is a big change from 2 years ago for sure.

Puppet 4 Type System

Puppet 4 has a data type system. It’s kind of optional, which is weird as these things go, but you can almost think of it as a built-in way to do validate_hash and friends. Almost. The implications of having it are huge though: it means down the line there will be a lot fewer edge cases with things just behaving weirdly.

Data used to go from Hiera to manifests and end up as strings even when the data was Boolean; now Puppet knows about actual Booleans and does not mess them up. Things will become pretty consistent and solid, and it will be easy to write well-behaved code.

For now though it’s the opposite, there are many more edge cases as a result of it.

In particular, functions that previously took a number might have assumed the number was a string with a number in it. Now they are going to get an actual number, and this causes breakage. There are a few of these in stdlib but they are getting fixed. Expect this to catch out many templates and functions, so there will be a settling-in period, but it’s well worth the effort.

Here’s an example:

define users::user(
  Enum["present", "absent"] $ensure       = "present",
  Optional[String] $ssh_authorized_file   = undef,
  Optional[String] $email                 = undef,
  Optional[Integer] $uid                  = undef,
  Optional[Integer] $gid                  = undef,
  Variant[Boolean, String] $sudoer        = false,
  Boolean $setup_shell                    = false,
  Boolean $setup_rbenv                    = false
) {
  # ...
}

If I passed ensure => bob to this I get:

Error: Expected parameter 'ensure' of 'Users::User[rip]' to have type Enum['present', 'absent'], got String

Pretty handy, though the errors can improve a lot – something I know is on the roadmap already.

You can get pretty complex with this, like describing the entire contents of a hash, and Puppet will ensure any hash you receive matches it. Doing this would have been really hard even with all the stuff in the old stdlib:

Struct[{mode            => Enum[read, write, update],
        path            => Optional[String[1]],
        NotUndef[owner] => Optional[String[1]]}]

I suggest you spend a good amount of time with the docs About Values and Data Types, Data Types: Data Type Syntax and Abstract Data Types. There are many interesting types like ones that do Pattern matching etc.

Case statements and Selectors have also become type aware as have normal expressions to test equality etc:

$enable_real = $enable ? {
  Boolean => $enable,
  String  => str2bool($enable),
  Numeric => num2bool($enable),
  default => fail('Illegal value for $enable parameter'),
}

if 5 =~ Integer[1,10] {
  notice("it's a number between 1 and 10")
}

It’s not all wonderful though; I think the syntax choices are pretty poor. I scan parameter lists: a) to discover module features b) to remind myself of the names c) to find things to edit. With the type preceding the variable name, every single use case I have for reading module code has become worse, and I fear I’ll have to resort to lots of indentation to make the variable names stand out from the type definitions. I cannot think of a single case where I will want to know the variable’s data type before knowing its name. So from a readability perspective this is not great at all.

Additionally, I cannot see myself using a Struct like the above in an argument list – to which Henrik says they are looking to add a typedef-like feature to the language, so you can give complex Structs a more convenient name and use that. This will help a lot. Something like this:

type MyData = Struct[{ .... }]

define foo(MyData $bar) {
  # ...
}

That’ll be handy, and Henrik says this is high on the priority list; it’s pretty essential from a usability perspective.

Native data merges

You can merge arrays and hashes easily:

$ puppet apply -e '$a={"a" => "b"}; $b={"c" => "d"}; notice($a+$b)'
Notice: Scope(Class[main]): {a => b, c => d}
$ puppet apply -e 'notice([1,2,3] + [4,5,6])'
Notice: Scope(Class[main]): [1, 2, 3, 4, 5, 6]

And yes, you can now use a ; instead of awkwardly making new lines for quick one-liner tests like this.

Resource Defaults

There’s a new way to do resource defaults. I know this syntax is widely loathed, but I quite like it:

file {
  default:
    ensure => file,
    mode   => '0600',
    owner  => 'root',
    group  => 'root',
  ;

  '/etc/ssh_host_dsa_key':
  ;

  '/etc/ssh_host_dsa_key.pub':
    mode => '0644',
  ;
}

The specific mode on /etc/ssh_host_dsa_key.pub will override the defaults, pretty handy. And it addresses a previous issue with old-style defaults: they would leak all over the scope and make a mess of things. This is confined to just these files.

Accessing resource parameter values

This is something people often ask for. It seems exciting, but I don’t think it will be of much practical use because it’s order-dependent, just like defined().

notify{"hello": message => "world"}
$message = Notify["hello"]["message"]  # would be 'world'

So this fetches another resource parameter value.

You can also fetch class parameters this way, but that seems redundant; there are several ordering caveats, so test your code carefully.


Iteration

This doesn’t really need comment, perhaps only OMFG FINALLY is needed.

["puppet", "facter"].each |$file| {
  file { "/usr/local/bin/${file}":
    ensure => "link",
    target => "/opt/puppetlabs/bin/${file}",
  }
}

More complex things like map and select exist too:

$domains = ["foo.com", "bar.com"]

$domain_definition = $domains.map |$domain| {
  {$domain => {"relay" => "mx.${domain}"}}
}

This yields a new hash made up of all the parts:

{
  "foo.com" => {"relay" => "mx.foo.com"},
  "bar.com" => {"relay" => "mx.bar.com"}
}

. syntax

If you’re from Ruby this might be a bit more bearable: you can use any function in either form, interchangeably it seems:

$x = join(["a", "b"], ",")
$y = ["a", "b"].join(",")

Both result in a,b

Default Ordering

By default it now does manifest ordering. This is a big deal: I’ve had to write no ordering code at all. None. Not a single require or ordering arrow. It just does things top-down by default, while parameters like notify and specific requires still influence it. Such an amazingly massive time saver. Good times when things that were always obviously dumb ideas go away.

It’s clever enough to also do things in the order they are included. So if you had:

class myapp {
  include myapp::install
  include myapp::config
  include myapp::service
}
Ordering will magically be right. Containment is still an issue though.

Facts hash

Ever since the first contributor summit I’ve been campaigning for $facts["foo"], and it went all around with people wanting to invent some new hash-like construct and worse, but finally we now have a facts hash enabled by default.

Unfortunately we are still stuck with $settings::vardir but hopefully some new hash will be created for that.

It’s a reserved word everywhere, so you can safely just do $facts["location"] and not even have to worry about $::facts, though you might still do that in the interest of consistency.

Facter 3

Facter 3 is really fast:

$ time facter
facter  0.08s user 0.03s system 44% cpu 0.248 total

This makes everything better. It also produces structured data, but this is still a bit awkward in Puppet:

$x = $facts["foo"]["bar"]["baz"]

There seems to be no elegant way to handle a missing ‘foo’ or ‘bar’ key; things just fail badly in ways you can’t catch or recover from. On the CLI you can do facter foo.bar.baz, so we’re already careful not to have “.” in a key. We need some function to extract data from hashes like:

$x = $facts.fetch("foo.bar.baz", "default")

It’ll make it a lot easier to deal with.
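Puppet functions are written in Ruby, and the logic such a function would need is just a dig with a default; a sketch (fetch_path is a hypothetical name, not a real Puppet or Facter function):

```ruby
# Fetch "foo.bar.baz" style paths out of nested hashes, returning a
# default when any intermediate key is missing.
def fetch_path(hash, path, default = nil)
  path.split(".").reduce(hash) do |level, key|
    return default unless level.is_a?(Hash) && level.key?(key)
    level[key]
  end
end

facts = {"foo" => {"bar" => {"baz" => 42}}}
fetch_path(facts, "foo.bar.baz")             # => 42
fetch_path(facts, "foo.nope.baz", "default") # => "default"
```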

Hiera 3

Hiera 3 is out, and at first I thought it didn’t handle hashes well, but it does:

:hierarchy:
  - "%{facts.fqdn}"
  - "location_%{facts.location}"
  - "country_%{facts.country}"
  - common

That’s how you’d fetch values out of hashes, and it’s pretty great. Notice I didn’t do ::facts; that’s because facts is reserved, so there will be no scope-layering issues.

Much better parser

You can use functions almost everywhere:

$ puppet apply -e 'notify{hiera("rsyslog::client::server"): }'
Notice: loghost.example.net

There are countless small improvements over things the old parser did badly. Now it’s really nice to use, and things just work the way I expect them to from other languages.

Even horrible stuff like this works:

$x = hiera_hash("something")["foo"]

Which previously needed an intermediate variable.

puppet apply --test

A small thing, but --test in apply now works like it does in agent – colors, verbose, etc. Really handy.

AIO Packaging

Of course by now almost everyone knows we’re getting omnibus-style packaging. I am a big supporter of this direction; the new bundled ruby is fast and easy to get onto older machines.

The execution of this is unspeakably bad though. It’s so half-baked and leaves so much to be desired.

Here’s a snippet from the current concat module:

    if defined('$is_pe') and str2bool("${::is_pe}") { # lint:ignore:only_variable_string
      if $::kernel == 'windows' {
        $command_path = "${::env_windows_installdir}/bin:${::path}"
      } else {
        $command_path = "/opt/puppetlabs/puppet/bin:/opt/puppet/bin:${::path}"
      }
    } elsif $::kernel == 'windows' {
      $command_path = $::path
    } else {
      $command_path = "/opt/puppetlabs/puppet/bin:${::path}"
    }

    # later, on the exec resources:
    path => $command_path

There are no words. Without this abomination it would try to use the system ruby to run the #!/usr/bin/env ruby script. Seriously, if something ships that causes this kind of code to be written by users, you’ve failed. Completely.

Things like the OS not being properly set up with symlinks into /usr/bin: I can kind of understand that as a way to avoid conflicts with an existing Puppet, but the RPM conflicts with puppet < 4.0.0 so it’s not that; it just comes without batteries included and feels unpolished.

The file system choices are completely arbitrary:

# puppet apply --configprint vardir
/opt/puppetlabs/puppet/cache

This is intuitive to exactly no-one who has ever used any unix or windows or any computer.

Again, I totally support the AIO direction, but the UX of this is so poor that, while I’ve been really positive about Puppet 4 up to now, I’d say this makes the entire thing Alpha quality. The team absolutely must go back to the drawing board and consider how this is done from the perspective of usability by people who have likely used Unix before.

Users have decades of experience to build on, and the system as a whole needs to be coherent and complement that experience; it should be a natural and comfortable fit. This and many other layout choices just do not make sense. Sure, the location is arbitrary; it makes no technical difference whether it’s in /opt/puppetlabs/puppet/cache or some other directory.

It DOES though make a massive cognitive difference to users who see the option vardir, draw on their entire career’s experience of what that means, and then cannot for the life of them find the place these files go without investing effort in finding it and remembering it as an odd one out. Even knowing things are in $prefix you still can’t find this dir, because it’s been arbitrarily renamed to cache; instead of using well-known tools like find I now have to completely context switch.

Not only is this a senseless choice, but frankly it’s insulting that this software seems to think it’s so special that I have to remember its crappy paths differently from the hundreds of other programs out there. It’s not; it’s just terrible and makes it a nightmare to use. Sure, put the stuff in /opt/puppetlabs, but don’t just go and make things up and deviate from what we’ve learned over years of supporting Puppet. It’s an insult.

Your users have invested countless hours in learning the software, countless hours in supporting others, and in some cases paid for this knowledge. Arbitrarily changing vardir to mean cache trivialises that investment and puts an unneeded toll on those of us who support others in the community.


There’s a whole lot more to show about Puppet 4; I’ve only been at it for a few nights after work, but overall I am super impressed by the work done on Puppet core. The packaging lets the effort down, and I’d be wary of recommending anyone go to Puppet 4 as a result. It’s a curiosity to investigate in your spare time while, hopefully, things improve on the packaging front to the level of a production-usable system.

Some thoughts on operating containers


I recently blogged about my workflow improvements realised by using docker for some services. Like everyone else, the full story about running containers in production is a bit of an unknown for me. I am running 7 or 8 things in containers at the moment, but I have a lot of outstanding questions.

I could go the route of a private PaaS where you push an image or Dockerfile into it and forget about it, hoping you never have to debug anything or dive deep into finding out why something is not performant, as those tend to be very much closed systems. Some, like deis, are just Docker underneath, but others, like the recently hyped lattice.cf, unpack the Docker container and transform it into something else entirely that is much harder to interact with from a debug perspective.

As a bit of an old-school sysadmin, this fire-and-hope-for-the-best approach leaves me a bit cold: I do not want to lose the ability to carefully observe my running containers using traditional tools if I have to. It’s great to strive for never having to do that, for never touching a running app using anything but your monitoring SaaS, or for always being able to just scale out horizontally. Aim for that goal and you get a much better overall system; but while you’ve not yet reached this nirvana-like state you’re going to want to get at your running apps using strace if you have to.

So, having ruled out just running one of the existing crop of private PaaS offerings locally, I started thinking about what a container really is. I consider them analogous to packages, so we first need to explore what packages are. In its simplest form a package is just a bunch of files packaged up. So what makes it better than a tarball?

  • Metadata like name, version, build time, build host, dependencies, descriptions, licence, signature and urls
  • Built-in logic like pre/post install scripts, but also companion scripts like init system scripts, monitoring logic etc
  • An API to interact with all this – the rpm or apt/deb commands – but, as in the case of Yum, also libraries for interacting with these

All of the above combines to bring the biggest and ultimate benefit of a package: a strong set of companion tools to build, host, deploy, validate, update and inspect those packages. You cannot have the main benefit of packages without mature implementations of the preceding points.

To really put it in perspective: the Puppet or Chef package resources only work because of the combination of the above 3 points. Without them they would fail, which is why the daily attempts on #puppet, for example, to reinvent packaging with an exec running wget and make end up failing and yield the predictable answer of packaging up your software instead.

When I look at the current state of a docker container and the published approaches for building them, I am left wanting when I compare them to a mature package manager wrt the 3 points above. This means that without improvement I am going to end up with an unsatisfactory set of tools and interactions with my running apps.

So to address this I started standardising my builds and creating a framework for building containers the way I like to, thinking about what kind of information I would need to make available to create the tooling I think is needed. I do this using a base image that has a script called container in it that can introspect metadata about the image. Any image downstream from this base image can just add more metadata and hook into the life cycle my container management script provides. It’s not OS-dependent, so I wouldn’t be forcing any group into an OS choice and can still gain a lot of the advantages Docker brings wrt making heterogeneous environments less painful. My build system embeds the metadata into any container it builds as JSON files.


There is a lot going on in this space: Kubernetes has labels and Docker is getting metadata, but these are tools to enable metadata; it is still up to users to decide what to do with it.

The reason you want to be able to really interact with and introspect packages comes down to things like auditing them: where do you have outdated SSL versions, and the like. Likewise I want to know things about my containers and images:

  • Where and when was it built, and why
  • What were its ancestor images
  • How do I start, validate, monitor and update it
  • What git repo is being built, and what hash of that git repo was built
  • What are all the tags this specific container is known by at build time
  • What’s the project name this belongs to
  • The ability to attach arbitrary user-supplied rich metadata

All of that should be visible to the inside and outside of the container, and kept for every ancestor of the container. Given this I can create rich generic management tools: tools that require no configuration to start, update and validate any container, as well as monitor it and extract metrics from it, without any hard-coded logic.

Here’s an example:

% docker exec -ti rbldnsd container --metadata|json_reformat
{
  "validate_method": "/srv/support/bin/validate.sh",
  "start_method": "/srv/support/bin/start.sh",
  "update_method": "/srv/support/bin/update.sh",
  "validate": true,
  "build_cause": "TIMERTRIGGER",
  "build_tag": "jenkins-docker rbldnsd-55",
  "ci": true,
  "image_tag_names": [
    "hub.my.net/ripienaar/rbldnsd"
  ],
  "project": "rbldnsd",
  "build_time": "2015-03-30 06:02:10",
  "build_time_stamp": 1427691730,
  "image_name": "ripienaar/rbldnsd",
  "gitref": "e1b0a445744fec5e584919711cafd8f4cebdee0e"
}
Missing from this are the monitoring and metrics related bits, as those are still a work in progress. But you can see metadata for a lot of the stuff I mentioned. Images I build embed this into the image; this means when I FROM one of my images I get a history that I can examine:

% docker exec -ti rbldnsd container --examine
Container first started at 2015-03-30 05:02:37 +0000 (1427691757)
Container management methods:
   Container supports START method using command /srv/support/bin/start.sh
   Container supports UPDATE method using command /srv/support/bin/update.sh
   Container supports VALIDATE method using command /srv/support/bin/validate.sh
Metadata for image centos_base
            Project Name: centos_base
              Image Name: ripienaar/centos_base
         Image Tag Names: hub.my.net/ripienaar/centos_base
  Build Info:
                  CI Run: true
                Git Hash: fcb5f3c664b293c7a196c9809a33714427804d40
             Build Cause: TIMERTRIGGER
              Build Time: 2015-03-24 03:25:01 (1427167501)
               Build Tag: jenkins-docker centos_base-20
                   START: not set
                  UPDATE: not set
                VALIDATE: not set
Metadata for image rbldnsd
            Project Name: rbldnsd
              Image Name: ripienaar/rbldnsd
         Image Tag Names: hub.my.net/ripienaar/rbldnsd
  Build Info:
                  CI Run: true
                Git Hash: e1b0a445744fec5e584919711cafd8f4cebdee0e
             Build Cause: TIMERTRIGGER
              Build Time: 2015-03-30 06:02:10 (1427691730)
               Build Tag: jenkins-docker rbldnsd-55
                   START: /srv/support/bin/start.sh
                  UPDATE: /srv/support/bin/update.sh
                VALIDATE: /srv/support/bin/validate.sh

This is the same information as above, but also showing the ancestor of this rbldnsd image: the centos_base image. I can see when they were built and why, which hashes of the repositories were used, and how I can interact with these containers. From here I can audit or manage their life cycle quite easily.

I’d like to add a bunch of run-time information, like when it was deployed, why, and to what node, and will leverage the docker metadata when that becomes available, or hack something up with ENV variables.

Solving this problem has been key to getting to grips with the operational concerns I had with Docker, and to feeling I can get back to the level of maturity I had with packages.


You can see from above that the metadata supports specifying START, UPDATE and VALIDATE actions. Future ones might be MONITOR and METRICS.
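The dispatch behind those actions is simple; a sketch of what the container script does with the metadata (the metadata path here is hypothetical, and the key names mirror the --metadata output above, but this is an illustration, not the actual script):

```ruby
require "json"

# Look up the requested life-cycle action in the embedded metadata
# and run its command; unsupported actions fail loudly.
def run_action(action, metadata_file = "/etc/container/metadata.json")
  metadata = JSON.parse(File.read(metadata_file))
  command  = metadata["#{action}_method"]
  raise "Container does not support #{action.upcase}" unless command
  system(command)
end
```

With that, container --update is just run_action("update"), and a future MONITOR action only needs a monitor_method key in the metadata.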

UPDATE requires some explaining. Of course the trend is toward immutable infrastructure, where every change is a rebuild, and this is a pretty good approach. But I host things like a DNS-based RBL and those update all the time; I’d like to do so quicker and with less resource usage than a full rebuild and redeploy, but without ending up in a place where a rebuild loses my changes.

So the typical pattern is to make the data directories for these images git checkouts, using deploy keys on my git server. The build process always takes the latest git, and the update process fetches the latest git and reloads the running config. This is a good middle ground between immutability and rapid change. I rebuild and redeploy all my containers every night, so this covers the few hours in between.

Here’s my DNS server:

% sudo docker exec bind container --update
>> Fetching latest git checkout
From https://git.devco.net/ripienaar/docker_bind
 * branch            master     -> FETCH_HEAD
Already up-to-date.
>> Validating configuration
>> Checking named.conf syntax in master mode
>> Checking named.conf syntax in slave mode
>> Checking zones..
>> Reloading name server
server reload successful

There were no updates but you can see it would fetch the latest, validate it passes inspection and then reload the server if everything is ok. And here is the main part of the script implementing this action:

echo ">> Fetching latest git checkout"
git pull origin master
echo ">> Validating configuration"
container --validate || exit 1 # do not reload if validation fails
echo ">> Reloading name server"
rndc reload

This way I just need to orchestrate these standard container --update execs – webhooks do this in my case.
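As a sketch, orchestrating this across every running container could be as simple as a loop over docker ps – assuming a Docker version whose ps supports --format and that every image ships the standard entry point:

```shell
# Sketch: run the standard update action in every running container.
# Assumes docker ps supports the --format flag and each image ships
# the "container" command; webhooks trigger the equivalent in my setup.
update_all() {
  for c in $(docker ps --format '{{.Names}}'); do
    echo ">> Updating $c"
    docker exec "$c" container --update
  done
}
```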

VALIDATE is interesting too. In this case validate uses the usual named-checkconf and named-checkzone commands to check the incoming config files, but my more recent containers use serverspec and infrataster to validate the full end to end functionality of a running container.
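For the named-checkzone style of validation, a minimal sketch of the zone-checking step might look like this – the zone directory layout and <zone>.db naming are assumptions for illustration:

```shell
# Minimal sketch of a zone-checking step in a validate action.
# Assumes zone files live in one directory and are named <zone>.db;
# a real action would also run named-checkconf in master and slave modes.
check_zones() {
  dir="$1"
  for zonefile in "$dir"/*.db; do
    zone=$(basename "$zonefile" .db)
    echo ">> Checking zone $zone"
    named-checkzone "$zone" "$zonefile" >/dev/null || return 1
  done
}
```

Returning non-zero on the first broken zone is what lets the update action above bail out before reloading the server.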

% sudo docker exec -ti rbldnsd container --validate
Finished in 6.86 seconds (files took 0.39762 seconds to load)
29 examples, 0 failures

My dev process revolves around this much like TDD would: my build process runs these steps at the end of every build against a running instance of the container, and my deploy process runs them after deploying anything. Operationally, if anything is not working right my first port of call is just this command; it often gets me right down to the part that went wrong – if I have good tests, that is. Otherwise this is feedback into the dev cycle leading to improved tests. I mentioned I rebuild and redeploy the entire infrastructure daily – it’s exactly the investment in these tests that means I can do so while getting a good night’s sleep.

Monitoring will likewise be extended around standardised introspectible commands so that a single method can extract status and metric information out of any container built this way.


I’m pretty happy with where this got me; I found it much easier to build tooling around containers given rich metadata and standardised interaction models. I kind of hoped this was what I would get from Docker itself, but it’s either too early or what it provides is too low level – understandable, as from its perspective it would want to avoid being too prescriptive or supporting limited sets of data on limited operating systems. I think, though, that as a team who wants to build and deploy a solid infrastructure on Docker you need to invest in something along these lines.

Thus my containers no longer contain just their files and dependencies – more and more, their operational life cycle is part of the container. Containers can be asked for their health, they can update themselves and will eventually emit detailed reusable metrics and statuses. The API to do all of this is standardised, so I can run these containers anywhere with confidence gained from having these introspective abilities and metadata available everywhere. Like the huge benefit I got from an improved workflow, I find this embedded operational life cycle equally valuable – and something I found hard to achieve in my old traditional CM based approach.

I think PaaS systems need to get a bit more of this kind of thing into their pipelines; I’d like to be able to ask my PaaS to run my validate steps regularly or on demand. Or to have standardised monitoring status and metrics output so that the likes of Datadog can deliver agents that provide in-depth application monitoring without configuration, just by sitting in a container next to a set of these containers. Today the state of the art for PaaS health checks seems to be to just hit the exposed port, but real life management of services is much more intricate than that. If they had that I could adopt one of those and spare myself a lot of pain.

For now though this is what my systems will do and hopefully some of the ideas become generally accepted.

Collecting links to free services for developers


A while ago Brandon Burton tweeted the following:

And I said someone should make a list. I looked around and could not find one, so I decided to make one myself.

I am gathering links in a flat Markdown file at the moment but once I get an idea for the categories and kinds of links I’ll look at setting up a site with better UX than this big readme. If you have design chops for a site like this and want to help, get in touch.

I’ve already gathered quite a few and had some good links sent as PRs; if you have any links, or if you work for a company you think might fit the bill, please send them my way.

I am looking for links to services that are free, especially for Open Source developers. Beyond that, services that provide a developer account with some free resources – like a monitoring service that allows 5 free devices. I’d probably favour links useful to infrastructure coders rather than, say, mobile app developers, as there are huge lists of those around already.

I am not after services in Private Beta or Free During Beta or the like; the only ones of those I’d accept are ones that specifically state that beta accounts will become free dev accounts or similar in future.
