Better Puppet Modules Using Hiera Data

12/08/2013

When writing Puppet modules there tends to be a ton of configuration data, generally things like different paths for different operating systems. Today the general pattern to manage this data is a class module::params with a bunch of logic in it.

Here’s a simplistic example below – for an example of the full horror of this pattern see the puppetlabs-ntp module.

# ntp/manifests/init.pp
class ntp (
     $config = $ntp::params::config,
     $keys_file = $ntp::params::keys_file
   ) inherits ntp::params {
 
   file{$config:
      ....
   }
}

# ntp/manifests/params.pp
class ntp::params {
   case $::osfamily {
      'AIX': {
         $config = "/etc/ntp.conf"
         $keys_file = '/etc/ntp.keys'
      }
 
      'Debian': {
         $config = "/etc/ntp.conf"
         $keys_file = '/etc/ntp/keys'
      }
 
      'RedHat': {
         $config = "/etc/ntp.conf"
         $keys_file = '/etc/ntp/keys'
      }
 
      default: {
         fail("The ${module_name} module is not supported on an ${::osfamily} based system.")
      }
   }
}

This is the exact reason Hiera exists: to remove this kind of spaghetti code and move it into data. Instinctively now, whenever anyone sees code like this, they think they should refactor it and move the data into Hiera.

But there’s a problem. This works for your own modules in your own repos: you’d just use the Puppet 3 automatic parameter bindings and override the values in the ntp class. Not ideal, but many people do it. If, however, you want to write a module for the Forge, there’s a hitch: the module author has no idea what kind of hierarchy exists where the module is used, or whether the site even uses Hiera, and today the module author can’t ship data with his module. So the only sensible thing to do is to embed a bunch of data in your code, the exact thing Hiera is supposed to avoid.
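
For your own modules, such an override could be as small as the data below. This is only a sketch: the file name common.yaml is an assumption about the site hierarchy, and the keys simply mirror the two parameters of the ntp class above, which the automatic parameter bindings look up as ntp::config and ntp::keys_file.

# common.yaml in the site's own Hiera data directory (file name is an assumption)
---
ntp::config: /etc/ntp.conf
ntp::keys_file: /etc/ntp/keys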

I proposed a solution to this problem that would allow module authors to embed data in their modules as well as control the hierarchy that would be used when accessing this data. Unfortunately, a year on, we’re still nowhere, and the community and the Forge are suffering as a result.

The proposed solution would be an always-on Hiera backend that as a last resort would look for data inside the module. Critically, the module author controls the hierarchy when it gets to the point of accessing data in the module. Consider the ntp::params class above: it is a code version of a Hiera hierarchy keyed on the $::osfamily fact. But if we just allowed the module to supply data inside the module, the module author would have to hope that everyone has this tier in their hierarchy, which is not realistic. My proposal therefore adds a module-specific hierarchy and data that get consulted after the site hierarchy.
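
To see why that tier cannot be relied on, here is what a site-level hiera.yaml might look like. The path, backend and tier names are illustrative assumptions, not taken from any particular site; the point is only that nothing in it is keyed on $::osfamily.

# /etc/puppet/hiera.yaml (hypothetical site configuration)
---
:backends:
  - yaml
:hierarchy:
  - "%{::clientcert}"
  - common
:yaml:
  :datadir: /etc/puppet/hieradata

Data that a module keyed on the OS family would never be consulted by a hierarchy like this one, which is why the module needs to bring its own.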

So let’s look at how to rework this module around this proposed solution:

# ntp/manifests/init.pp
class ntp ($config, $keys_file) {
   validate_absolute_path($config)
   validate_absolute_path($keys_file)
 
   file{$config:
      ....
   }
}

Next you configure Hiera to consult a hierarchy on the $::osfamily fact. Note the new data directory that goes inside the module:

# ntp/data/hiera.yaml
---
:hierarchy:
  - "%{::osfamily}"

And finally we create some data files. Here’s just the one for RedHat:

# ntp/data/RedHat.yaml
---
ntp::config: /etc/ntp.conf
ntp::keys_file: /etc/ntp/keys
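
The other operating systems from the ntp::params class shown earlier would presumably get sibling files with the same two keys. As a sketch, reusing the Debian values from that class:

# ntp/data/Debian.yaml
---
ntp::config: /etc/ntp.conf
ntp::keys_file: /etc/ntp/keys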

Users of the module could add a new OS without contributing back to the module or forking it, simply by providing similar data in the site-specific hierarchy, leaving the downloaded module 100% untouched!
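
As a rough illustration, a site that wanted to run the module on an OS family it ships no data for might add something like the following to its own Hiera data. This assumes the site hierarchy has a tier that matches those machines (an osfamily tier here); the file location is hypothetical and the paths are placeholders, not verified values for any real OS.

# <site datadir>/FreeBSD.yaml (hypothetical location; placeholder paths)
---
ntp::config: /etc/ntp.conf
ntp::keys_file: /etc/ntp.keys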

This is a very simple view of what this pattern allows; time will tell what the community makes of it. There are many advantages to this over the ntp::params pattern:

This helps the contributor to a public module:

  • Adding a new OS is easy: just drop in a new YAML file. This can be done with confidence because it will not break existing code; the file will only be read on machines of the new OS. No complex case statements or 100s of braces to get right.
  • On a busy module, when adding a new OS they do not have to worry about complex merge problems, working hard at rebasing or any git esoterica. They’re just adding a file.
  • Syntactically it’s very easy: it’s just a YAML file. No complex case statements etc.
  • The contributor does not have to worry about breaking other operating systems he could not test on, like AIX here. The change is contained to machines running the new OS.
  • In large environments this helps with change control as it’s just data, with no logic changes.

This helps the maintainer of a module:

  • Module maintenance is easier when it comes to adding new operating systems, as each one is a simple single file
  • Easier contribution reviews
  • Fewer merge commits, less git magic needed, cleaner commit history
  • The code is a lot easier to read and maintain. Fewer tests and validations are needed.

This helps the user of a module:

  • Well-written modules now properly support supplying all data from Hiera
  • He has a single place to look for the overridable data
  • When using a module that does not support his OS he can deploy it into his site and just provide data instead of forking it

Today I am releasing my proposed code as a standalone module. It provides all the advantages above including the fact that it’s always on without any additional configuration needed.

It works exactly as above by adding a data directory with a hiera.yaml inside it. The only configuration being considered in this hiera.yaml is the hierarchy.

This module is new and does some horrible things to get itself activated automatically without any configuration. I’ve only tested it on Puppet 3.2.x, but I think it will work in 3.x as is. I’d love to get feedback on this from users.

If you want to write a Forge module that uses this feature, simply add a dependency on the ripienaar/module_data module; as soon as someone installs this dependency along with your module, the backend gets activated. Similarly, if you just want to use this feature in your own modules, just run puppet module install ripienaar/module_data.

Note, though, that if you do, your module will only work on Puppet 3 or newer.

It’s unfortunate that my Pull Request is now over a year old, has not been merged, and that no real progress is being made. I hope that if enough users adopt this solution we can force progress rather than sit by and watch nothing happen. Please send me your feedback and use this widely.