My recent post about using Hiera data in modules has had a great level of discussion already, several thousand blog views, comments, tweets and private messages on IRC. Thanks for the support and encouragement – it’s clear this is a very important topic.
I want to expand on yesterday’s post by giving some background on the motivations that led me to write this feature, and on why having it as a forge module is highly undesirable but the only current option.
At the heart of this discussion is the params.pp pattern and the general problems with it. To recap, the basic idea is to embed all your default data in a params.pp file, typically in huge case statements, and then reference this data as the defaults for your class parameters. Some examples of this are the puppetlabs-ntp module, the Beginners Guide to Modules and the example I had in the previous post that I’ll reproduce below:
    # ntp/manifests/init.pp
    class ntp (
      # allow for overrides using resource syntax or data bindings
      $config    = $ntp::params::config,
      $keys_file = $ntp::params::keys_file
    ) inherits ntp::params {

      # validate values supplied
      validate_absolute_path($config)
      validate_absolute_path($keys_file)

      # optionally derive new data from supplied data

      # use data
      file{$config:
        ....
      }
    }

    # ntp/manifests/params.pp
    class ntp::params {
      # set OS specific values
      case $::osfamily {
        'AIX': {
          $config    = '/etc/ntp.conf'
          $keys_file = '/etc/ntp.keys'
        }

        'Debian': {
          $config    = '/etc/ntp.conf'
          $keys_file = '/etc/ntp/keys'
        }

        'RedHat': {
          $config    = '/etc/ntp.conf'
          $keys_file = '/etc/ntp/keys'
        }

        default: {
          fail("The ${module_name} module is not supported on an ${::osfamily} based system.")
        }
      }
    }
As Puppet stands today, this is pretty much the best we can hope for. It achieves a lot of useful things:
- The data that provides OS support is contained and separate
- You can override it using resource style syntax or Puppet 3 data bindings
- The data is validated no matter how it was supplied
- New data can be derived by combining supplied or default data
You can now stick this module on the forge and users can use it. It supports many operating systems and works on pretty much any Puppet going back quite a way. These are all good things.
The list above also demonstrates the main purpose for having data in a module – different OS/environment support, allowing users to supply their own data, validation and to transmogrify the data. The params.pp pattern achieves all of this.
So what’s the problem then?
The problem is: the data is in the code. In the pre-extlookup and pre-Hiera days we put our site data in case statements or inheritance trees or node data or any number of different solutions. These all solved the basic problem – our site got configured and our boxes got built – just like the params.pp pattern solves the basic problem. But we wanted more: we wanted our data separate from our code. Not only did it seem natural, because almost every other known programming language supports and embraces this, but as Puppet users we wanted a number of things:
- Less logic, syntax, punctuation and “programming” and more just files that look a whole lot like configuration
- Better layering than inheritance and other tools at our disposal allowed. We want to structure our configuration like we do our DCs and environments and other components – these form a natural series of layered hierarchies.
- We do not want to change code when we want to use it, we want to configure that code to behave according to our site needs. In a CM world data is configuration.
- If we’re in an environment that does not let us open source our work or contribute to open source repositories, we do not want to be forced to fork and modify open source code just to use it in our environments. We want to configure the code. Compliance needs should not force us to solve every problem in house.
- We want to plug into existing data sources like LDAP or be able to create self service portals for our users to supply this configuration data. But we do not want to change our manifests to achieve this.
- We do not want to be experts at using source control systems. We use them, we love them and agree they are needed. But like everything less is more. Simple is better. A small simple workflow we can manage at 2am is better than a complex one.
- We want systems we can reason about. A system that takes configuration in the form of data trumps one that needs programming to change its behaviour
- Above all we want a system that’s designed with our use cases in mind. Our User Experience needs are different from programmers’. Our data needs are different and hugely complex. Our CM system must both guide in its design and be compatible with our existing approaches. We do not want to have to write our own external node data sources simply because our language does not provide solid solutions to this common problem.
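To make the layering idea in the list above a little more concrete, here is a minimal sketch in Python of a first-match lookup through a hierarchy of data layers. It is only an illustration of the concept and not Hiera’s actual implementation: the layer names, the `lookup` function, the node name and the node-level override path are all hypothetical, while the OS family values are taken from the params.pp example earlier.

```python
# Illustrative sketch only: not Hiera's real code. Layer names, the node
# name and the node-level override path are hypothetical.

# Each layer maps keys to values; in a real setup these would be data files.
DATA = {
    "node/web01.example.com": {"ntp::keys_file": "/srv/ntp/keys"},
    "osfamily/AIX": {"ntp::keys_file": "/etc/ntp.keys"},
    "osfamily/Debian": {"ntp::keys_file": "/etc/ntp/keys"},
    "osfamily/RedHat": {"ntp::keys_file": "/etc/ntp/keys"},
    "common": {"ntp::config": "/etc/ntp.conf"},
}

# Most specific layer first; {placeholders} are filled in from node facts.
HIERARCHY = ["node/{fqdn}", "osfamily/{osfamily}", "common"]


def lookup(key, facts, default=None):
    """Return the value from the first layer in the hierarchy that defines key."""
    for template in HIERARCHY:
        layer = DATA.get(template.format(**facts), {})
        if key in layer:
            return layer[key]
    return default


facts = {"fqdn": "web01.example.com", "osfamily": "Debian"}
config = lookup("ntp::config", facts)        # falls through to the common layer
keys_file = lookup("ntp::keys_file", facts)  # node layer wins over osfamily
```

Changing behaviour for one node, one OS family or the whole site here means editing a data entry in the right layer; the lookup logic itself never changes, which is the property the list above is asking for.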
I created Hiera with these items in mind after years of talking to probably 1000+ users and iterating on extlookup in order to keep pace with the Puppet language gaining support for modern constructs like Hashes. True, it’s not a perfect solution to all these points – transparency of data origin to name but one – but there are approaches that make small improvements towards them, and it does solve a high percentage of the above problems.
Over time Hiera has gained a tremendous following – it’s now the de facto standard for solving the problem of site configuration data, largely because it’s pragmatic, simple and designed to suit the task at hand. In recognition of this I donated the code to Puppet Labs and to their credit they integrated it as a default prerequisite and created the data binding systems. The elephant in the room is our modules though.
We want to share our modules with other users. To do this we need to support many operating systems. To do this we need to create a lot of data in the modules. We can’t use Hiera to do this in a portable fashion because the module system needs improvement. So we’re stuck in the proverbial dark ages by embedding our data in code and gaining none of the advantages Hiera brings to site data.
Now we have a few options open to us. We can just suck it up and keep writing params.pp files, gaining none of the above advantages that Hiera brings. This is not great and the puppetlabs-ntp module example I cited shows why. We can come up with ever more elaborate ways to wrap and extend and override the data provided in a params.pp, or even far out ideas like having the data binding system query the params.pp data directly. In other words we can pander to the status quo: we can assume we cannot improve the system and instead iterate on an inherently bad idea. The alternative is to improve Puppet.
Every time the question of params.pp comes up the answer seems to be how to improve how we embed data in the code. This is absolutely the wrong answer. The answer should be how we improve Puppet so that we do not have to embed data in code. We know people want this; the popularity and wide adoption of Hiera has shown that they do. The core advantages of Hiera might not be well understood by all, but the userbase does understand and treasure the gains they get from using it.
Our task is to support the community in the investment they made in Hiera. We should not be rewriting it in a non-backwards-compatible way, throwing away past learnings simply because we do not want to understand how we got here. We should be iterating with small additions and rounding out this feature as one solid, ever-present data system that every user of Puppet can rely on being present on every Puppet system.
Hiera adoption has reached critical mass; it’s now the solution to the problem. This is a great and historic moment for the Puppet Community. To rewrite it, throw it away or propose orthogonal solutions to this problem space is to do a great disservice to the community and the Puppet product as a whole.
Towards this I created a Hiera backend that goes some way towards resolving this in a way that’s a natural progression of the design of Hiera. It improves the core features provided by Puppet so that better patterns than the current params.pp one can be created, which will in the long run greatly improve the module writing and sharing experience. This is what my previous blog post introduced: a way forward from the current params.pp situation.
Now by rights a solution to this problem belongs in Puppet core. A Puppet Forge dependent module just to get this ability, especially one not maintained by Puppet Labs, especially one that monkey patches its way into the system, is not desirable at all. This is why the code was a PR first. The alternatives are to wait in the dark – numerous queries by many members of the community to the Puppet product owner have yielded only vague statements of intent or outcome – or to take it into our own hands to improve the system.
So I hope the community will support me in using this module and work with me to come up with better patterns to replace the params.pp ones, iterating on and improving the system as a whole rather than just sucking up the status quo and not moving forward.
Seems like I’m missing something.
Your pull request was merged and closed according to https://projects.puppetlabs.com/issues/16856#note-48 so is this feature active in 3.3.0 or not?
No, what was merged was a complete rewrite of Hiera that apparently would support the proposed feature. The new Hiera was a preview feature that was disabled by default.
Seems it does not actually work and caused other problems so PL said they’re removing the feature.
There is no transparency around why my PR wasn’t merged, no transparency about when something that does work can be expected, or even what the plan is without a date. The whole thing is just in limbo.
So this prompted me to release my ~100 lines so people can get the feature if they want.
Thanks for clearing that up. That is indeed confusing and certainly doesn’t make me happy. I totally agree with where you want to take things and I’m not sure that Puppet Labs is going there at all.
Is there any news on this?
I’d really like to see this feature being officially supported by puppetlabs.
First off, we are using Puppet only on RedHat at the moment so we don’t really have the osfamily issue yet… But in restructuring my modules again I have ended up with an empty params.pp; I really can’t be bothered coding references from init.pp to params.pp. It is a waste of time and interferes with understanding what the module is doing. At the same time I created a ::config::params.pp that holds all derived parameters.
So in short:
– I declare my configurable variables in init.pp and provide default values where appropriate
– All overrides are through Hiera or Foreman
– Derived parameters are in ::config::params.pp
– Class declarations don’t have additional parameters
– Define declarations only have a minimum number of variables to create unique resources.
So yes, separating config data from the script-code is essential, but the debate on where to store the data is not finished yet. Hiera is nice and simple, but files and folders are confusing, and no doubt we will get to the old hierarchical vs. relational or multi-inheritance object model discussions when the complexity of the data increases. Foreman adds a database and some abstraction by allowing the administrator to control which variables can be overridden, creating a nice separation of control between the puppet coder, the environment administrator and the tech-lead that needs an environment built to his specifications. Unfortunately our organization is even more complex than that, so one group uses Foreman and the other Hiera…
There is now native support for this in Puppet 4 https://www.devco.net/archives/2016/01/08/native-puppet-4-data-in-modules.php