Puppet 4 data lookup strategies

02/03/2016

I recently wrote about the new Data in Modules support in Puppet 4. There’s another new feature that goes hand in hand with it and finally rids us of functions like hiera_hash().

Up to now we’ve had to do something ugly like this to handle merged class parameters:

class users($local = hiera_hash("users::local", {})) {
 ...
}

This is functional but quite ugly, and it ties your module to Hiera. These days that’s a reasonably safe assumption, but with the ability to specify different data sources per environment it will not always hold. For example, there’s a new kid on the block called Jerakia that lives in this world, so having Hiera-specific calls in modules is going to be a limiting strategy.

A much safer abstraction is to rely on the automatic parameter lookup feature, but it had no way of knowing that this item should be a hash merge, so the functions were used as above.

Worse, things like merge strategies were set globally. A module could not say that one key should be deep merged and others just shallow merged, and if a module required a specific behaviour it had no control over this.

A solution for this problem landed in recent Puppet 4 via a special merged hash called lookup_options. This is documented quite lightly in the official docs so I thought I’d put up an example here.

lookup() function


To understand how this works you first have to understand the lookup() function, which is documented here. It is basically the replacement for the various hiera() functions and has a matching puppet lookup CLI tool.

If you wanted to do a hiera_hash() lookup that does the old deeper hash merge you’d do something like:

$local = lookup("users::local", Hash, {"strategy" => "deep", "merge_hash_arrays" => true})

This would merge just this key, rather than, say, setting the merge strategy to deeper globally in Hiera, and it’s something the module author can control. The Hash above describes the data type the result should match. It supports all the various complex composite type definitions, so you can describe the desired result data in real detail, almost like a schema.
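As a rough sketch of the other forms the merge argument can take: the lookup() documentation lists first, unique, hash and deep as strategies, and the argument can be either the bare strategy name or a hash of options. The keys users::default_shell and users::groups below are made up for illustration, and the comments reflect my reading of the docs rather than exhaustively tested behaviour.

```puppet
# Hypothetical keys, for illustration only.

# "first": no merging, the highest priority match wins. This is the
# default and corresponds to the old hiera() call. The fourth argument
# is a default value used when the key is not found.
$shell = lookup("users::default_shell", String, "first", "/bin/bash")

# "unique": all matches combined into one array with duplicates
# removed, roughly what hiera_array() used to do.
$groups = lookup("users::groups", Array[String], "unique")

# "hash": shallow hash merge. Top level keys from all matches are
# combined; on a conflict the higher priority source wins outright.
$local_shallow = lookup("users::local", Hash, "hash")

# "deep": like "hash", but nested hashes are merged recursively.
$local_deep = lookup("users::local", Hash, "deep")

# The same deep lookup written in the single options hash form.
$local_opts = lookup({
  "name"          => "users::local",
  "value_type"    => Hash,
  "merge"         => { "strategy" => "deep", "merge_hash_arrays" => true },
  "default_value" => {},
})
```

Each result goes into its own variable because Puppet variables cannot be reassigned within a scope.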

There is much more to the lookup function and its CLI. They’re both pretty awesome and you can now see where data comes from. I guess there’s a follow-up blog post about that coming.

lookup_options hiera key


We saw above how to instruct the lookup() function to do a hiera_hash(), but wouldn’t it be great if we could somehow tell Puppet that a specific key should always be merged in this way? That way a simple lookup("users::local") would do the merge and, crucially, so would the automatic parameter lookups, even across backends and data providers.

We just want:

class users(Hash $local = {}) {
 ...
}

For this to make sense the users module must be able to indicate this in the data layer. And since we now have data in modules there’s an obvious place to put this.

If you set up the users module here to use the hiera data service for data in modules as per my previous blog post you can now specify the merge strategy in your data:

# users/data/common.yaml
lookup_options:
  users::local:
    merge:
      strategy: deep
      merge_hash_arrays: true

Note how this matches exactly the following lookup():

$local = lookup("users::local", Hash, {"strategy" => "deep", "merge_hash_arrays" => true})

The data type validation is done on the class parameters, where it also validates explicitly passed data, while the strategies for processing the data live at the module data level.

The way this works is that Puppet does a lookup of lookup_options across the data sources and merges the results, so you could set this at site level as well. There is, however, a check to ensure a module can only set options for its own keys, so it cannot change the behaviour of other modules.

At this point a simple lookup("users::local") will do the merge and therefore so will this code:

class users(Hash $local = {}) {
 ...
}

No more hiera_hash() here. The old hiera() function is not aware of this, since it’s a lookup() feature, but with this in place we’ll hopefully never see hiera*() functions being used in Puppet 4 modules.
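To make the effect a bit more concrete, here is a minimal sketch of how such a class might consume the merged hash. The data in the comments (bob, alice and their attributes) is entirely hypothetical, and the expected result is what I’d expect from a deep merge as documented, not output captured from a real run.

```puppet
# Hypothetical data, for illustration only.
#
# Environment data (higher priority):
#   users::local:
#     bob:
#       shell: /bin/zsh
#
# Module data in users/data/common.yaml (lower priority):
#   users::local:
#     bob:
#       uid: 1001
#     alice:
#       uid: 1002
#
# With the deep strategy set through lookup_options, $local should
# arrive holding bob => {shell => /bin/zsh, uid => 1001} and
# alice => {uid => 1002}. With a shallow "hash" merge bob would only
# have the shell, because the higher priority entry replaces the
# lower one wholesale.
class users(Hash $local = {}) {
  $local.each |String $user, Hash $attrs| {
    # The splat (*) passes every key in $attrs as a resource attribute.
    user { $user:
      ensure => present,
      *      => $attrs,
    }
  }
}
```

The class itself contains no merge logic at all. It just declares a typed parameter and leaves the strategy to the data layer.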

This is a huge win. It really shows what can be done with the Data in Modules features, and it’s something that’s been impossible before. It brings the automatic parameter lookup feature a huge way forward, and for me the combination is one of the most compelling features of Puppet 4.

I am not sure who proposed this behaviour. The history is a bit muddled, but if someone can tweet me links to mailing list threads or something I’ll link them here for those who want to discover the background and reasoning that went into it. UPDATE: Henrik informs me that Rob Nelson was the driving force on this. It’s something they wanted to do for a while, but without Rob sticking with it and working with the devs it would not have been done.

Wishlist

The lookup function and the options are a great move forward, however I find the UX of the various lookup options and merge strategies quite bad. It’s really hard for me to go from reading the documentation to knowing what a certain option will do with my data. In fact I still have no idea what some of these do. The only way to discover it seems to be spending time playing with it, which I haven’t had. It would be great for new users to get some more clarity there.

Some doc updates that provide a translation from old Hiera terms to new strategies would be great and maybe some examples of what these actually do.