Back in September 2009 I wrote a blog post titled “Simple Puppet Module Structure” which introduced a simple approach to writing Puppet Modules. This post has been hugely popular in the community – but much has changed in Puppet since then so it is time for an updated version of that post.
As before I will show a simple module for a common scenario. Rather than considering this module a blueprint for every module out there you should instead study its design and use it as a starting point when writing your own modules. You can build on it and adapt it but the basic approach should translate well to more complex modules.
I should note that while I work for Puppet Labs I do not know if this reflects any kind of standard approach suggested by Puppet Labs – this is what I do when managing my own machines and no more.
The most important deliverables
When writing a module I have a few things I keep in mind – these are all centered around downstream users of my module and future-me trying to figure out what is going on:
- A module should have a single entry point where someone reviewing it can get an overview of its behavior
- Modules that have configuration should be configurable in a single way and single place
- Modules should be made up of several single-responsibility classes. As far as possible these classes should be private details hidden from the user
- For the common use cases, users should not need to know individual resource names
- For the most common use case, users should not need to provide any parameters, defaults should be used
- Modules I write should have a consistent design and behaviour
The module layout I will present below is designed so that someone who is curious about the behaviour of the module only has to look in the init.pp to see:
- All the parameters and their defaults used to configure the behaviour of the module
- Overview of the internal structure of the module by way of descriptive class names
- Relationships and notifications that exist inside the module and what classes they can notify
This design will never remove the need for documenting your modules but a clear design will guide your users in discovering the internals of your module and how they interact with it.
More important than what a module does is how accessible it is to you and others: how easy it is to understand, debug and extend.
Thinking about your module
For this post I will write a very simple module to manage NTP – it really is very simple, you should check the Forge for more complete ones.
To go from nowhere to having NTP on your machine you would have to do:
- Install the packages and any dependencies
- Write out appropriate configuration files with some environment specific values
- Start the service or services you need once the configuration files are written. Restart them if the config files change later.
There is a clear implied dependency chain here and this basic pattern applies to most pieces of software.
These 3 points basically translate to distinct groups of actions and, sticking with the above principle of single-function classes, I will create a class for each group.
To keep things clear and obvious I will call these classes install, config and service. The names don’t matter as long as they are descriptive – but you really should pick something and stick with it in all your modules.
Writing the module
I’ll show the 3 classes that do the heavy lifting here and discuss parts of them afterwards:
class ntp::install {
  package{'ntpd':
    ensure => $ntp::version
  }
}

class ntp::config {
  $ntpservers = $ntp::ntpservers

  File{
    owner => root,
    group => root,
    mode  => 644,
  }

  file{'/etc/ntp.conf':
      content => template('ntp/ntp.conf.erb');
    '/etc/ntp/step-tickers':
      content => template('ntp/step-tickers.erb');
  }
}

class ntp::service {
  $ensure = $ntp::start ? {true => running, default => stopped}

  service{"ntp":
    ensure => $ensure,
    enable => $ntp::enable,
  }
}
Here I have 3 classes that serve a single purpose each and do not have any details like relationships, ordering or notifications in them. They roughly just do the one thing they are supposed to do.
Take a look at each class and you will see they use variables like $ntp::version, $ntp::ntpservers etc. These are variables from the main ntp class, so let’s take a quick look at that class:
# == Class: ntp
#
# A basic module to manage NTP
#
# === Parameters
# [*version*]
#   The package version to install
#
# [*ntpservers*]
#   An array of NTP servers to use on this node
#
# [*enable*]
#   Should the service be enabled during boot time?
#
# [*start*]
#   Should the service be started by Puppet
class ntp(
  $version = "present",
  $ntpservers = ["1.pool.ntp.org", "2.pool.ntp.org"],
  $enable = true,
  $start = true
) {
  class{'ntp::install': } ->
  class{'ntp::config': } ~>
  class{'ntp::service': } ->
  Class["ntp"]
}
This is the main entry point into the module that was mentioned earlier. All the variables the module uses are documented in a single place, the basic design and parts of the module are clear, and you can see both that the service class can be notified and how the parts relate to each other.
I use the new chaining features to inject the dependencies and relationships here which surfaces these important interactions between the various classes back up to the main entry class for users to see easily.
All this information is immediately available in the obvious place without looking at any additional files or being bogged down with implementation details.
Line 26, the final -> Class["ntp"] in the chain, requires some extra explanation. It ensures that all the NTP member classes are applied before the main ntp class, so that when someone says require => Class["ntp"] elsewhere they can be sure the associated tasks have completed. This is a lightweight version of the Anchor Pattern.
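The chaining arrows in the main class are shorthand for the ordering and notification metaparameters. As a rough sketch of the same relationships written out longhand (class parameters are omitted for brevity, and this is an illustration rather than a replacement for the version above):

```puppet
class ntp {
  # '->' becomes require/before, '~>' becomes notify
  class { 'ntp::install': }

  class { 'ntp::config':
    require => Class['ntp::install'],  # install -> config
    notify  => Class['ntp::service'],  # config  ~> service
  }

  class { 'ntp::service':
    before => Class['ntp'],            # service -> Class["ntp"]
  }
}
```

The arrow form keeps the whole chain readable in one place, which is why I prefer it in the entry class.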
Using the module
Let’s look at how you might use this module from knowing nothing.
Ideally simply including the main entry point on a node should be enough:
include ntp
This does what you’d generally expect – installs, configures and starts the NTP service.
After looking at the init.pp you can now supply some new values for some of the parameters to tune it for your needs:
class{"ntp":
  ntpservers => ["ntp1.example.com", "ntp2.example.com"]
}
Or you can use the new data bindings in Puppet 3 and supply new data in Hiera to override these variables by supplying data for the keys like ntp::ntpservers.
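To make the data binding behaviour concrete, here is a minimal sketch of roughly the lookup Puppet 3 performs for you. It uses an explicit hiera() call as the parameter default, which is an assumption made purely for illustration; with data bindings you do not need to write it, and the class body here is a placeholder:

```puppet
# Sketch only: an explicit lookup that mirrors what data bindings do automatically
class ntp(
  $ntpservers = hiera('ntp::ntpservers', ['1.pool.ntp.org', '2.pool.ntp.org'])
) {
  # With Puppet 3 data bindings the hiera() call above is unnecessary:
  # the parameter is looked up under the key ntp::ntpservers before the
  # default in the class definition is used.
  notice("NTP servers: ${ntpservers}")
}
```

A value passed explicitly in a class{"ntp": ...} declaration still takes precedence over the Hiera data, and the default in the class is only used when neither is supplied.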
Finally, if for some related reason you need to restart the service, you know from looking at the ntp class that you can notify the ntp::service class to achieve that.
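As a small sketch of notifying the service class from outside the module: the file resource below is a hypothetical example of something unrelated to the ntp module that should trigger an NTP restart when it changes.

```puppet
include ntp

# Hypothetical resource outside the ntp module: when the timezone link
# changes, every resource in ntp::service is refreshed.
file { '/etc/localtime':
  ensure => link,
  target => '/usr/share/zoneinfo/UTC',
  notify => Class['ntp::service'],
}
```

Note that the notification names only the class, so it keeps working if the service resource inside ntp::service is later renamed.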
Using classes for relationships
There’s a huge thing to note here in the main ntp class. I specify all relationships and notifies on the classes and not the resources themselves.
As personal style I only mention resources by name inside a class that contains that resource – if I ever have to access a resource outside of the class that it is contained in I access the class.
I would not write:
class ntp::service {
  service{"ntp":
    require => File["/etc/ntp.conf"]
  }
}
There are many issues with this approach that mostly come down to maintenance headaches. Here I require the ntp config file but what if a service has more than one file? Do you then list all the files? Do you later edit every class that references these when another file gets managed?
These issues quickly multiply in a large code base. By always acting on class names and by creating many small single purpose classes as here I effectively contain these by grouping names and not individual resource names. This way any future refactoring of individual classes would not have an impact on other classes.
So the above snippet would rather be something like this:
class ntp::service {
  service{"ntp":
    require => Class["ntp::config"]
  }
}
Here I require the containing class and not the resource, which has the effect of requiring all resources inside that class. It also isolates changes to that class, so users never have to worry about the internal implementation details of the other class. Along the same lines you can also notify a class – and all resources inside that class get notified.
I only include other classes at the top ntp level and never have include statements in my classes like ntp::config and so forth – this means when I require the class ntp::config or notify ntp::service I get just what I want and no more.
If you create big complex classes you run the risk of having refreshonly execs that relate to configuration or installation associated with services in the same class, which would have disastrous consequences if you notify the wrong thing or if a user does not study your code before using it.
A consistent style of small, single-purpose, descriptively named classes avoids these and other problems.
What we learned and further links
There is a lot to learn here and much of it is about soft issues like the value of consistency and clarity of design and thinking about your users – and your future self.
On the technical side you should learn about the effects of relationships and notifications based on containing classes and not by naming resources by name.
And we came across a number of recently added Puppet features:
- Parameterized classes
- Chaining Arrows
- Data Bindings as introduced in Puppet 3
Parameterized Classes are used to provide multiple convenient methods for supplying data to your module – defaults in the module, specifically in code, using Hiera and (not shown here) an ENC.
Chaining Arrows are used in the main class to inject the dependencies and notifications in a way that is visible without having to study each individual class.
These are important new additions to Puppet. Some new features like Parameterised classes are not quite ready for prime time imho but in Puppet 3 when combined with the data bindings a lot of the pain points have been removed.
Finally there are a number of useful things I did not mention here. Specifically you should study the Puppet Style Guide and use the Puppet Lint tool to validate that your modules comply. You should consider writing tests for your modules using rspec-puppet and finally share them on the Puppet Forge.
And perhaps most importantly – do not reinvent the wheel, check the Forge first.
Greatest module structure post ever!
Just a question?
Where do you put external class requirements? In the same resource chain? For example:
{
class{'ksh': } ->
class{'ntp::install': } ->
class{'ntp::config': } ~>
class{'ntp::service': } ->
Class["ntp"]
}
Thanks in advance
@juaningan that is a surprisingly difficult question 🙂
If you do this then your users will not be able to modify any parameters to the ksh class in the future; it also means they can't change to zsh later or whatever. Worse, if they did include ksh or use the new syntax before using the ntp module, the ntp module will cause the catalog compile to fail.
So on one hand I want to do:
But then where to include the ksh class? In a role class, common class? somewhere else? It's difficult to say - the answer is going to depend on your environment and needs 🙁
Personally I try to avoid doing it in a class like this - it seems to break the single responsibility for me - I tend to include the ksh class in a role or common class but that also feels wrong.
So I guess the answer is "it depends" and "I don't know" 🙁
You could create some kind of dependency injection system I guess:
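A minimal sketch of one way such injection could look. The $shell_class parameter is a hypothetical name introduced for illustration, and the class it names must already be declared elsewhere in the catalog (for example in a role class), otherwise the relationship cannot be resolved:

```puppet
# Sketch only: $shell_class is an illustrative parameter, not part of the
# module shown in the post.
class ntp(
  $version     = 'present',
  $shell_class = 'ksh'
) {
  # Order whichever class provides the shell before the install step.
  # The ntp module does not declare that class itself.
  Class[$shell_class] ->
  class { 'ntp::install': } ->
  class { 'ntp::config': } ~>
  class { 'ntp::service': } ->
  Class['ntp']
}
```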
Now users have the choice of supplying any class providing the shell feature and our NTP module will adjust itself.
In future I hope Puppet gets a system where a module can declare that it satisfies the shell dependency and other modules can say they require any module(s) that has setup the shell.
Ok, I think I prefer the common or role class way, but my fear is the catalog compilation order.
I’ll explain what I mean: I’m talking about an old-fashioned corporate application that installs through a ksh script (an exec resource). So I need to ensure that class ksh is applied before class{'foo::install': }…
Till your post I used to do:
class foo::install {
require ksh
.
.
.
}
Thanks again for your enriching posts.
@juaningan today require will still work as before – and your proposed initial code will do the right thing too.
Sometimes – when like in your case you just have some legacy stuff that just works in a dumb way – you have little choice but to strongly couple modules. That is fine, you should then think about how to make that coupling as clear as possible.
If you feel your co-workers are most likely to look at the ntp module when they expect to find the relationship, then put it there. If you think it’s elsewhere then put it there.
Might I suggest a follow-on post concerning anchors? Seems like you would run into that issue pretty quick if you’re trying to encapsulate the internal functionality of your module.
@lavaman I’ve never found the need for the full pattern in any of my manifests and have only really once needed the lightweight one shown here
no, rip, please do not insist on this module pattern. I hope we can discuss it better in person but here are a few points:
– you multiply the objects that puppet has to manage, serialize and deserialize for an imho limited advantage: for a simple package/service/config files module you introduce useless subclasses just to better manage dependencies and have a cleaner module structure… there are quicker and more effective ways to do that
– if you happen to need to override resources contained in two of these different subclasses you must add two custom subclasses that inherit your two classes… it’ll become hell
– you miss the most important feature that every module should have: true reusability. that means the option to adapt the module to different situations without changing it. here i don’t see a critical parameter for that: the option to provide a custom template file to manage configurations with true freedom and custom content.
sorry for the direct tone, the ipad keyboard doesn’t help verbosity… no harshness intended, but i really consider this pattern not effective, and a bad example for module suggestions, even if written also on pro puppet. just my 2 cents,
alessandro
@al thanks for the comments but I think you missed the main point that this is the most basic starting point to build from.
It shows how to add parameters and where to add them. If you wanted to override templates, for example, then add more parameters – this was, as indicated, not meant to be an exhaustive full-featured module but a simple enough one to show the main concepts without over-engineering
Towards your points:
– If multiple objects like this are a problem, then that’s a problem to fix in Puppet – but they’re not. They provide a clear advantage for code manageability and clarity
– I have in all my time using Puppet had to override exactly 2 resources; you can avoid overrides and inheritance, which are weak points in Puppet, by using the system in a different way
– The module as it stands here is quite limited but building on this will yield a reusable module, one that’s configurable by data and not needless inheritance and code
I think we’re going to have to agree to disagree. I’ve built big multi tenant PAAS’s using this pattern and it’s widely used in the community, easy to grasp and solves a high % of problems without requiring a lot of in-depth technical knowledge that borders on complex programming. This pattern resonates well with many in our community and strikes a good balance between being pragmatic in solving a problem and being extensible.
It is absolutely not the case that each module has to be truly reusable. It’s a fact that a very high percentage of modules never get shared or reused outside of the environment they get developed in. Those modules are single use and single target audience in a closed environment where they have to solve one problem in a pragmatic way and have zero chance of ever being shared. This describes the majority of Puppet modules. This pattern will help those who wish to just get on with their job to do so by investing 20% of time to achieve their goals and not wasting 80% of their time making things needlessly complex to achieve reusability and flexibility goals that simply do not exist in their world or requirements.
It’s a matter of target audience: your approach has its audience and this has a different one.
Learn to live with the fact that not everyone shares your goals. The tone of your comment does nothing to help make your point and advance your ideas as being credible; the opposite in fact is what is being achieved.
Thanks for the reply, rip, it clarifies some points (more arguments can be added to make the module more reusable) and it insists on some patterns I don’t agree with.
That’s life, I suppose, and as you suggested, I have to cope with that.
On some points I agree with you, though, so there’s hope in this world:
– Module’s data (and behaviour) should be managed via its parameters
– Modules can be made for different targets:
— the ones you make for yourself and have to work for your site,
— the ones you want to be reusable for you or others, changing and adapting them to different circumstances,
— the ones you want to be really reusable, in a way that you can get the module and do with it whatever you need without changing it: just by passing parameters.
All my modules work is geared towards this third approach (which incidentally I suppose it probably should be at the basis on every module PUBLISHED on the Forge, even if reality is quite different)
– The more you avoid class inheritance to override hardcoded resources parameters, the better, but you can do this only by changing the module itself or adding parameters that affect the module’s behaviour.
A note on credibility, as your last words have left some bitterness in my mouth, even if I can understand that my nightly rushed and somewhat direct tone could deserve it.
I’ve been releasing Puppet modules for 5 years, adapting and changing them during the years without biases, either because the language evolved or because someone has given me ideas on how to do things in a better way (at the beginning I was not convinced at all by parametrized classes, now all my ones are parametrized). I started to talk about modules’ reusability and standards before the Forge was released, and started to use a pattern like params_lookup before data bindings were revealed. If GitHub metrics can be considered as a reference for adoption or inspiration, my modules are among the most followed ones and, most of all, I can say that I have various Puppet implementations among customers entirely based on the upstream version of my modules, with only a site module that contains local logic and data.
That’s my own, very personal, proof of true module reusability: exactly the same modules used, as is, in totally different environments.
But, yes, mileages may vary, and complete reusability is not everything and it brings complexities and overheads that people might not want.
Still, if you manage to convince me that having separate classes for service/packages/files brings more pros than cons, I do not have problems in changing my ideas.
At the moment that doesn’t appear to be likely, but maybe a belgian beer may help.
This seems like a nice design. But what do you do with the fairly common case where a module has a dual functionality of managing both a server and a client?
Do you split them into two different modules? (can result in some duplication of templates that could be shared otherwise)
Do you have a class “foo” and “foo::server”? Or “foo::client” & “foo::server” to be more clear?
@dalen thanks for the comment.
I tend to then make 2 entry point classes, let’s say bind::master and bind::slave, or in other cases I keep the most common one as the main class like mcollective but then have a mcollective::client class.
In these classes I reuse as much as possible; for example I might set a variable in bind::master that informs bind::config about which behaviours to activate
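A rough sketch of what two entry points sharing one config class might look like. Everything here is hypothetical: the bind::install, bind::config and bind::service classes, the role parameter and the template names are illustrative, and passing the role as a class parameter is just one way to carry that variable across.

```puppet
# Hypothetical shared class: the role selects which template is rendered
class bind::config($role = 'master') {
  file { '/etc/named.conf':
    content => template("bind/named.${role}.conf.erb"),
  }
}

# Entry point for a master; bind::slave would mirror this with role => 'slave'
class bind::master {
  class { 'bind::install': } ->
  class { 'bind::config': role => 'master' } ~>
  class { 'bind::service': } ->
  Class['bind::master']
}
```

Each entry point stays a readable overview of its own chain, while the install, config and service classes are written once.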
Very interesting post, I especially like the lightweight anchor pattern. I have a question, though:
The way the subclasses are structured/ordered here, NTP installation, configuration and service would all be “completed” before the NTP class itself would “complete”:
class ntp {
class{'ntp::install': } ->
class{'ntp::config': } ~>
class{'ntp::service': } ->
Class["ntp"]
}
But what if I want to use an NTP package provided by a different (APT or Yum) repository? The following would not work as expected/intended, given the structure above:
class role::basic {
include 'repo::ntp_custom'
include 'ntp'
Class['repo::ntp_custom'] -> Class['ntp']
}
node 'somenode' {
include 'role::basic'
}
The installation, configuration and service subclasses of the NTP module would most probably complete before the “repo::ntp_custom” class is applied, since the ordering only specifies that the NTP subclasses and the repo class need to complete before the “ntp” class itself. It doesn’t state that the repo class should come before the NTP subclasses, specifically the “ntp::install” class. So the default, OS-provided NTP package would be used instead of the one provided by the custom repository.
How would one counter this? I suppose the “anchor” resource from stdlib is made for this situation, but can it be done also using the lightweight pattern you presented? Would the following work without any adverse or strange side effects?
class ntp {
Class["ntp"] ->
class{'ntp::install': } ->
class{'ntp::config': } ~>
class{'ntp::service': }
}
If you do that you lose the other half of the benefit so it’s not really a good solution – for this kind of thing the anchor is sadly the right thing to do generally.
That said, often with repos I don’t think there is any reason why the repo has to be done before the entire NTP class. Instead the repo should be done before packages which you can achieve with:
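A common idiom that fits this description is chaining resource collectors. This sketch assumes the repositories are managed with the yumrepo type; for APT you would substitute whatever resource type manages your sources:

```puppet
# Apply every yumrepo resource in the catalog before every package resource.
# Collectors also realize virtual resources of the collected types.
Yumrepo <| |> -> Package <| |>
```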
This assumes you don’t have virtual packages. I prefer adding just the right minimum dependencies in cases like this but really it depends on the individual case.
If this really was just an NTP related repo – as in your example, perhaps restricted using pins – I’d just put it before the ntp::install class and not the whole ntp class. The reason the classes and ordering are visible in the init.pp in this pattern is so you can easily discover that and use the information like that.
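Applied to the role class from the question, ordering the repo before just the install step could be sketched as follows (the class names are the ones used in the question above):

```puppet
class role::basic {
  include 'repo::ntp_custom'
  include 'ntp'

  # Only the package installation depends on the custom repository
  Class['repo::ntp_custom'] -> Class['ntp::install']
}
```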
I see, thank you for the insight. I guess the only problem I have with ordering the repo class directly before the “ntp::install” class is that it requires me to know that the NTP module has an “ntp::install” class. This is of course fairly standard but it nonetheless exposes an implementation detail. But then again, Puppet classes are not really comparable with classes as we know them from OOP.
I am torn between using the real anchor pattern for this matter and just making sure the repo class is applied before the install class. For anyone wondering, using the anchor resource could look like this:
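A minimal sketch of that anchor-based layout, assuming the anchor type from puppetlabs-stdlib is installed. The anchor titles ntp::begin and ntp::end are illustrative names, and the class parameters from the post are omitted for brevity:

```puppet
class ntp {
  # The anchors are resources contained in class ntp, so anything ordered
  # relative to Class['ntp'] is also ordered relative to the whole chain.
  anchor { 'ntp::begin': } ->
  class { 'ntp::install': } ->
  class { 'ntp::config': } ~>
  class { 'ntp::service': } ->
  anchor { 'ntp::end': }
}
```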
This would allow doing the following:
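With the member classes held between anchors, the role class from the original question should behave as intended. A sketch reusing the names from that question:

```puppet
class role::basic {
  include 'repo::ntp_custom'
  include 'ntp'

  # The repo is applied before everything between the anchors in class ntp,
  # without the caller needing to know about ntp::install.
  Class['repo::ntp_custom'] -> Class['ntp']
}
```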
Why do you say parametrized classes are not ready for prime time?
That’s a topic for another day – it’s a huge topic, several long threads on the list about it so would rather not go into that here 🙂
Omitted from this useful article was an example of how to wire up a notification outside the classes implementing the heavy lifting. This can be done by replacing
class{'ntp::config': } ~> class{'ntp::service': }
with
class{'ntp::config': notify => Class['ntp::service']} ~> class{'ntp::service': }
The ~> in my code is a notify, so your 2 lines are identical
Looking at the code:
You’ll get the output which shows the notify happened:
I totally missed the twiddle. Perhaps it really is time for glasses…
Shouldn’t $enable and $start ntp class variables be dependent on $version? Eg
$enable = $version? { absent => false, default => true },
$start = $version? { absent => false, default => true },
Without this, when you set $ntp::version to absent, you will run into an error about ensuring that the service is running and/or enabled in runlevels while it has actually just been removed. Or maybe it’s not a good practice to create modules that way, and if you want to be sure that the service is removed (with packages/configurations/etc), you should create a different module for this?
yeah sounds right – the module here in-line in the blog isn’t about being absolutely correct and functional and more about communicating concept, lots is missing
Thanks for this. As a newbie to Puppet, I need examples and docs like this so I can try and get off the ground from a frivolous Module to a meaningful module.
D.