Simple Puppet Module Structure Redux

Back in September 2009 I wrote a blog post titled “Simple Puppet Module Structure” which introduced a simple approach to writing Puppet Modules. This post has been hugely popular in the community – but much has changed in Puppet since then so it is time for an updated version of that post.

As before I will show a simple module for a common scenario. Rather than considering this module a blueprint for every module out there you should instead study its design and use it as a starting point when writing your own modules. You can build on it and adapt it but the basic approach should translate well to more complex modules.

I should note that while I work for Puppet Labs I do not know if this reflects any kind of standard approach suggested by Puppet Labs – this is what I do when managing my own machines and no more.

The most important deliverables

When writing a module I keep a few things in mind – these all center on downstream users of my module and on future-me trying to figure out what is going on:

A module should have a single entry point where someone reviewing it can get an overview of its behavior
Modules that have configuration should be configurable in a single way and single place
Modules should be made up of several single-responsibility classes. As far as possible these classes should be private details hidden from the user
For the common use cases, users should not need to know individual resource names
For the most common use case, users should not need to provide any parameters, defaults should be used
Modules I write should have a consistent design and behaviour

The module layout I will present below is designed so that someone who is curious about the behaviour of the module only has to look in init.pp to see:

All the parameters and their defaults used to configure the behaviour of the module
Overview of the internal structure of the module by way of descriptive class names
Relationships and notifications that exist inside the module and what classes they can notify

This design will never remove the need for documenting your modules but a clear design will guide your users in discovering the internals of your module and how they interact with it.

More important than what a module does is how accessible it is to you and others: how easy it is to understand, debug and extend.

Thinking about your module

For this post I will write a very simple module to manage NTP – it really is very simple, you should check the Forge for more complete ones.

To go from nowhere to having NTP on your machine you would have to do:

Install the packages and any dependencies
Write out appropriate configuration files with some environment specific values
Start the service or services you need once the configuration files are written. Restart them if the config files change later.

There is a clear implied dependency chain here and this basic pattern applies to most pieces of software.

These 3 points translate to distinct groups of actions and, sticking with the above principle of single-responsibility classes, I will create a class for each group.

To keep things clear and obvious I will call these classes install, config and service. The names don’t matter as long as they are descriptive – but you really should pick something and stick with it in all your modules.

Writing the module

I’ll show the 3 classes that do the heavy lifting here and discuss parts of them afterwards:

class ntp::install {
   # Assumes the package is called 'ntp', as it is on most distributions
   package{'ntp':
      ensure => $ntp::version
   }
}
 
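class ntp::config {
   # Local copy so the templates can use it as a local variable
   $ntpservers = $ntp::ntpservers

   # Resource defaults for every file managed in this class
   File{
      owner => 'root',
      group => 'root',
      mode  => '0644',
   }

   file{'/etc/ntp.conf':
      content => template('ntp/ntp.conf.erb');

   '/etc/ntp/step-tickers':
      content => template('ntp/step-tickers.erb');
   }
}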
 
class ntp::service {
   $ensure = $ntp::start ? {true => running, default => stopped}
 
   service{"ntp":
      ensure  => $ensure,
      enable  => $ntp::enable,
   }
}

Here I have 3 classes that serve a single purpose each and do not have any details like relationships, ordering or notifications in them. They roughly just do the one thing they are supposed to do.

Take a look at each class and you will see they use variables like $ntp::version, $ntp::ntpservers etc. These are variables from the main ntp class, so let’s take a quick look at that class:

# == Class: ntp
#
# A basic module to manage NTP
#
# === Parameters
# [*version*]
#   The package version to install
#
# [*ntpservers*]
#   An array of NTP servers to use on this node
#
# [*enable*]
#   Should the service be enabled during boot time?
#
# [*start*]
#   Should the service be started by Puppet
class ntp(
   $version = "present",
   $ntpservers = ["1.pool.ntp.org", "2.pool.ntp.org"],
   $enable = true,
   $start = true
) {
   class{'ntp::install': } ->
   class{'ntp::config': } ~>
   class{'ntp::service': } ->
   Class["ntp"]
}

This is the main entry point into the module that was mentioned earlier. All the variables the module uses are documented in a single place, the basic design and parts of the module are clear, and you can see both that the service class can be notified and the relationships between the parts.

I use the new chaining features to inject the dependencies and relationships here, which surfaces these important interactions between the various classes in the main entry class where users can easily see them.

All this information is immediately available in the obvious place without looking at any additional files or being bogged down with implementation details.

The last link in the chain, Class["ntp"], requires some extra explanation – it ensures that all the NTP member classes are applied before this main NTP class, so that when someone says require => Class["ntp"] elsewhere they can be sure the associated tasks are completed. This is a lightweight version of the Anchor Pattern.
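
As a point of comparison, here is a minimal sketch of the same idea using the contain function. This is not what the module above does; it assumes Puppet 3.4 or later, where contain is available, and the myapp class and package are hypothetical names used only for illustration.

```puppet
# Sketch only: assumes Puppet 3.4+ where the contain function exists.
class ntp(
   $version    = 'present',
   $ntpservers = ['1.pool.ntp.org', '2.pool.ntp.org'],
   $enable     = true,
   $start      = true
) {
   # contain declares each class and makes it part of Class['ntp'],
   # so nothing has to be chained back to the main class
   contain ntp::install
   contain ntp::config
   contain ntp::service

   # The ordering and the notification stay visible in the entry point
   Class['ntp::install'] -> Class['ntp::config'] ~> Class['ntp::service']
}

# Hypothetical consumer: everything in ntp is applied before this package
class myapp {
   include ntp

   package{'myapp':
      ensure  => present,
      require => Class['ntp'],
   }
}
```

Either way the goal is the same: a require on Class["ntp"] should wait for the install, config and service classes.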

Using the module

Let’s look at how you might use this module from knowing nothing.

Ideally simply including the main entry point on a node should be enough:

include ntp

This does what you’d generally expect – installs, configures and starts the NTP service.

After looking at the init.pp you can now supply some new values for some of the parameters to tune it for your needs:

class{"ntp": ntpservers => ["ntp1.example.com", "ntp2.example.com"]}

Or you can use the new data bindings in Puppet 3 and override these variables from Hiera by supplying data for keys like ntp::ntpservers.
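
A rough sketch of what that could look like follows. The Hiera data is shown as comments, and both the file name common.yaml and the server names are assumptions made for illustration; your hierarchy will decide where the data really lives.

```puppet
# Hypothetical Hiera data, for example in a common.yaml data file:
#
#   ntp::ntpservers:
#     - ntp1.example.com
#     - ntp2.example.com
#   ntp::enable: true
#
# With data like that in place, a plain include is enough. Puppet 3 looks
# up each class parameter as <class name>::<parameter name> in Hiera and
# falls back to the defaults in init.pp for any key that is not found.
include ntp
```

The manifest stays identical on every node; only the data differs.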

Finally, if for some related reason you need to restart the service, you know from looking at the ntp class that you can notify the ntp::service class to achieve that.
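
As a small sketch of such a notification, assume you manage an extra options file outside the module. The path and content below are illustrative assumptions and not something the module above manages.

```puppet
include ntp

# Hypothetical extra file managed outside the ntp module
file{'/etc/sysconfig/ntpd':
   owner   => 'root',
   group   => 'root',
   mode    => '0644',
   content => "OPTIONS=\"-g\"\n",
   # Notify the class, not Service["ntp"]: no resource name from inside
   # the module leaks into this code
   notify  => Class['ntp::service'],
}
```

If the file changes, every resource in ntp::service receives a refresh, which for this module means the service restarts.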

Using classes for relationships

There’s a huge thing to note here in the main ntp class. I specify all relationships and notifies on the classes and not the resources themselves.

As a matter of personal style I only mention resources by name inside the class that contains them – if I ever have to access a resource from outside the class that contains it, I access the class instead.

I would not write:

class ntp::service {
   service{"ntp": require => File["/etc/ntp.conf"]}
}

There are many issues with this approach that mostly come down to maintenance headaches. Here I require the ntp config file, but what if a service has more than one file? Do you then list all the files? Do you later edit every class that references them when another file gets managed?

These issues quickly multiply in a large code base. By always acting on class names and by creating many small single purpose classes as here I effectively contain these by grouping names and not individual resource names. This way any future refactoring of individual classes would not have an impact on other classes.

So the above snippet would rather be something like this:

class ntp::service {
   service{"ntp": require => Class["ntp::config"]}
}

Here I require the containing class and not the resource. This has the effect of requiring all resources inside that class. It also isolates changes to that class and avoids a situation where users have to worry about the internal implementation details of the other class. Along the same lines you can also notify a class – and all resources inside that class get notified.

I only include other classes at the top ntp level and never have include statements in my classes like ntp::config and so forth – this means when I require the class ntp::config or notify ntp::service I get just what I want and no more.

If you create big complex classes you run the risk of having refreshonly execs that relate to configuration or installation sitting in the same class as the services. This could have disastrous consequences if you notify the wrong thing or if a user does not study your code before using it.
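
To make the risk concrete, here is a sketch of the kind of class I mean. The class name, the exec and its command are all hypothetical and exist only to show the failure mode.

```puppet
# Anti-pattern sketch: install, a one-off exec and the service in one class
class ntp_monolith {
   package{'ntp':
      ensure => present,
   }

   # Only meant to run when something explicitly refreshes it
   exec{'ntp-initial-sync':
      command     => '/usr/sbin/ntpdate -u 1.pool.ntp.org',
      refreshonly => true,
      require     => Package['ntp'],
   }

   service{'ntp':
      ensure  => running,
      require => Package['ntp'],
   }
}

include ntp_monolith

# The intent here is just to restart the service when the file changes...
file{'/etc/ntp.conf':
   source => 'puppet:///modules/ntp_monolith/ntp.conf',
   # ...but a class notification refreshes every resource in the class,
   # so the refreshonly exec runs as well
   notify => Class['ntp_monolith'],
}
```

With the service in its own small class, the same notify could only ever reach the service.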

A consistent style of small single purpose classes named descriptively avoids these and other problems.

What we learned and further links

There is a lot to learn here and much of it is about soft issues like the value of consistency and clarity of design and thinking about your users – and your future self.

On the technical side you should learn about the effects of relationships and notifications based on containing classes rather than on individual resource names.

And we came across a number of recently added Puppet features:

Parameterized classes
Chaining Arrows
Data Bindings as introduced in Puppet 3

Parameterized Classes are used to provide multiple convenient methods for supplying data to your module – defaults in the module, specifically in code, using Hiera and (not shown here) an ENC.

Chaining Arrows are used in the main class to inject the dependencies and notifications in a way that is visible without having to study each individual class.

These are important new additions to Puppet. Some new features like Parameterized Classes are not quite ready for prime time imho, but in Puppet 3, when combined with the data bindings, a lot of the pain points have been removed.

Finally there are a number of useful things I did not mention here. Specifically you should study the Puppet Style Guide and use the Puppet Lint tool to validate that your modules comply. You should consider writing tests for your modules using rspec-puppet and finally share them on the Puppet Forge.

And perhaps most importantly – do not reinvent the wheel, check the Forge first.