Back in August 2012 I requested an enhancement to the general data landscape of Puppet: a natural progression of the design of Hiera that would let it be used in modules shared outside of your own environments. I called this Data in Modules. There was a lot of community interest in this but not much movement, so eventually I made a working POC that I released in December 2013.
The basic idea behind the feature is that we want to be able to use Hiera to model internal data found in modules as well as site specific data, and that these two sets of data coexist and complement each other. Full details of this can be found in my post titled Better Puppet Modules Using Hiera Data and some more background can be found in The problem with params.pp. These posts are a bit old now and some things have moved on but they’re good background reading.
It’s taken a while, but as part of the Puppet 4 rework effort the data ingesting mechanisms have also been rewritten, and finally in Puppet 4.3.0 native data in modules has arrived. The original Jira for this is 4474. It’s really pretty close to what I had in mind in my proposals and my POC and I am really happy with this. Along the way a new function called lookup() has been introduced to replace the old collection of hiera(), hiera_array() and hiera_hash().
The official docs for this feature can be found at the Puppet Labs Docs site. Here I’ll more or less just take my previous NTP example and show how you could use the new Data in Modules to simplify it as per the above mentioned posts.
This is the very basic Puppet class we’ll be working with here:
    class ntp (
      String $config,
      String $keys_file
    ) {
      ...
    }
In the past these variables would have needed to interact with the params.pp file like $config = $ntp::params::config, but now it’s just a simple class. At this point it will not yet use any data in the module. To do that you have to activate it in the metadata.json:
    # ntp/metadata.json
    {
      ...
      "data_provider": "hiera"
    }
At this point Puppet knows you want to use the Hiera data in the module. But the key to the feature, and really the whole reason it exists, is that a module needs to be able to specify its own hierarchy. Imagine you want to set $keys_file here: you have to be sure the hierarchy in question includes the OS Family, and you must have control over that data. In the past, with the hierarchy being controlled completely by the site hiera.yaml, this was not possible at all, and the outcome was that if you wanted to share a module outside of your environment you had to go the params.pp route as that was the only portable solution.
So now your modules can have their own hiera.yaml. It’s slightly different from the past but should be familiar to past Hiera users. It goes in your module, so this would be ntp/hiera.yaml:
    ---
    version: 4
    datadir: data
    hierarchy:
      - name: "OS family"
        backend: yaml
        path: "os/%{facts.os.family}"
      - name: "common"
        backend: yaml
This is the new format for the Hiera configuration. It’s more flexible, and a future version of Hiera will change some semantics in ways that are quite nice compared to the original design I came up with, so you have to use the new format here.
Here you can see the module has its own OS Family tier as well as a common tier. Let’s see the ntp/data/common.yaml:
    ---
    ntp::config: "/etc/ntp.conf"
    ntp::keys_file: "/etc/ntp.keys"
These are sane defaults to use for any operating system that is not specifically supported.
Below are examples for AIX and Debian:
    # data/os/AIX.yaml
    ---
    ntp::config: "/etc/ntpd.conf"

    # data/os/Debian.yaml
    ---
    ntp::keys_file: "/etc/ntp/keys"
At this point the need for params.pp is gone – at least in this simplistic example – and this data along with the environment specific or site specific data cohabit really nicely. If you specified any of these data items in your site Hiera data your site data will override the module. The advantages of this might not be immediately obvious. I have a very long list of advantages over params.pp in my Better Puppet Modules Using Hiera Data post, be sure to read that for background.
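For contrast, here is a minimal sketch of roughly what those same three data files would look like in the params.pp style. The values come from the data files above; the class layout and the inherits wiring are assumptions based on the common pattern, not code taken from any real NTP module, and the class body is left empty.

```puppet
# manifests/params.pp (illustrative sketch of the old pattern)
class ntp::params {
  case $facts['os']['family'] {
    'AIX': {
      $config    = '/etc/ntpd.conf'
      $keys_file = '/etc/ntp.keys'
    }
    'Debian': {
      $config    = '/etc/ntp.conf'
      $keys_file = '/etc/ntp/keys'
    }
    default: {
      # Same values as data/common.yaml
      $config    = '/etc/ntp.conf'
      $keys_file = '/etc/ntp.keys'
    }
  }
}

# manifests/init.pp (illustrative sketch of the old pattern)
class ntp (
  String $config    = $ntp::params::config,
  String $keys_file = $ntp::params::keys_file,
) inherits ntp::params {
  # resources elided
}
```

Note how every branch has to repeat the values it does not change, whereas the data files only state what differs from common.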
There’s an alternative approach where you write a Puppet function that returns a hash of data and the data system will fetch the keys from there. This is really powerful and might end up being an interesting solution to something along the lines of a module specific custom Hiera backend, but a lighter weight version of that. I might write that up later, this post is already a bit long.
The remaining problem is to do with data that needs to be merged, as traditionally Hiera and Puppet have no idea you want this to happen when you do a basic lookup (hence these annoying hiera_hash() functions etc). There’s a solution for this and I’ll post a blog post about that next week, once the next Puppet 4 release is out and a bug I found that makes it unusable is fixed in that version.
This feature is a great addition to Puppet and I am really glad to finally see this land. My hacky data in modules code was used quite extensively, with 72 000 downloads from the forge, but I was never really happy with it and was desperate to see this land natively. This is a big step forward and I hope it sees wide adoption in the community.
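As a rough sketch of that function-based approach: my understanding is that with "data_provider": "function" in metadata.json, Puppet calls a function named after the module (here ntp::data) that takes no arguments and returns a hash. The file location and the choice of a Puppet language function rather than a Ruby one are assumptions on my part, so check the official docs before relying on this.

```puppet
# ntp/functions/data.pp (assumed location)
# Assumes metadata.json sets "data_provider": "function" instead of "hiera".
function ntp::data() {
  # Defaults, same values as data/common.yaml in the Hiera example
  $base = {
    'ntp::config'    => '/etc/ntp.conf',
    'ntp::keys_file' => '/etc/ntp.keys',
  }

  # Per OS family overrides, same values as the data/os/*.yaml files
  $os_specific = $facts['os']['family'] ? {
    'AIX'    => { 'ntp::config' => '/etc/ntpd.conf' },
    'Debian' => { 'ntp::keys_file' => '/etc/ntp/keys' },
    default  => {},
  }

  # Hash addition merges the two; keys from the right hand side win
  $base + $os_specific
}
```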
A note about the old ripienaar-module_data module
As seen above the new built in feature is great and a very close match to what I had envisioned when creating the proof of concept module.
It would not be a good idea to support both these methods on Puppet 4, and it turns out it is also quite difficult because we both use the hiera.yaml file in the module but with small differences in format. So the transition period will no doubt be a bit painful, especially for those attempting to use this while supporting both Puppet 3 and 4 users.
Further the old module actually broke the Puppet 4 feature for a while in a way that was really difficult to debug. Puppet Labs kindly reached out and notified me of this and helped me fix it in MODULES-3102. So there is now a new release of the old module that works again on Puppet 4 BUT it warns very loudly that this is a bad idea.
The old module is now deprecated and unsupported. You should stop using it and imho stop using Puppet 3, but whatever you do, stop using it on Puppet 4. I wish metadata.json supported a Puppet version requirement so I could force this, but alas it doesn’t, so I can’t.
I will after a few months make a release that will raise an error on Puppet 4 and refuse to work there. You should move forward and adopt the excellent native implementation of this feature.
Is the fact for OS family really facts.os.family? Should it be %{::osfamily}?
No. Puppet 4 facts are a hash, like $facts['os']['family']
OK thanks, does it need a quote at the end?
path: "os/%{facts.os.family}"
vs.
path: "os/%{facts.os.family}
yes, I fixed that thanks
Is there any trick to getting this working from a puppetserver? It seems to work from Vagrant with puppet apply but not from puppetserver. I am running puppet 4.3.2 and puppetserver 2.2.1-1 on the server but I haven’t updated hiera.yaml. I am trying to use a forked version of inkblot/puppet-bind which used to use your module_data.
It should work though I didn’t try with server. Might need a restart after some changes
Is there some trick to getting it to load the new hiera.yaml files? I’m trying to use them on an environment, and when I run puppet apply --environment=production, the ./environments/production/hiera.yaml never gets used
You should set “environment_data_provider = hiera” in environment.conf
Hi,
Thanks for your module, it was quite helpful. I currently try to switch from your module to official support on Puppet 4.
I successfully get my data module to work on Puppet 4 but only without classical hiera.
As soon as I create a hiera.yml in my project and add some local Hiera files to fill in the missing module data, it fails to find the additional settings set in the module data.
With your module, I was specifying an additional backend in my global hiera.yml file, but if I read the documentation correctly this is now the default behavior in Puppet without additional settings.
Apart from the files in the module, should I configure my project specifically or pay attention to anything in particular to get the override / merge process to work?
Thanks for your help,
Take careful note, the hiera.yaml is now in a different place and slightly different format, did you definitely put it in the right place?
puppet lookup CLI can help – on verbose it shows what its reading etc
Many thanks for taking time to answer my post.
I finally got it, and it was far away from where I was looking.
It was simply a broken Puppet installation… (I uninstalled and reinstalled everything and made sure that the installed version really is 4.4.2)
Sorry for having bothered you with that…
Hi,
I made the changes, edited metadata.json, added hiera.yaml under the module and have the common.yaml under /data. But when I tried puppet apply --debug, it throws an error like
Error: Evaluation Error: Error while evaluating a Function Call, Class[my_module]: expects a value for parameter ‘my_message’ at /Users/user01/Lab/PuppetWS/my_module/tests/init.pp:12:1 on node user01-macbook-pro.local.
Am I missing anything?
Thanks for your help
probably, but really not enough to go on based on what you’re saying.
Perhaps better to make a gist or pastebin and send it to the mailing lists where you can give much more information of exact paths, file contents and so forth.
Resolved. Actually my module path was pointing somewhere else (there were 2 copies). Anyway, really good explanation of Hiera. Thanks so much.
One more query. The module data Hiera is working perfectly on Mac and Unix, but not on Windows. Are there any restrictions, or am I missing anything? The error I am getting is Error: Evaluation Error: Error while evaluating a Function Call, Could not find data item …
Thanks
Compilation happens on the master. So more likely your problem is exactly like it says. It can’t find the data in the hierarchy.
Oh, again. It’s working. I needed to pass the modulepath.
Hi,
Can we explicitly call and get the data from this module Hiera data rather than rely on automatic lookup, like we did for environment/node level Hiera with hiera("key")?
yes, use lookup(), see https://www.devco.net/archives/2016/03/13/the-puppet-4-lookup-function.php
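A minimal sketch of what explicit calls could look like, using the keys from the NTP example in the post. The class name profile::ntp_debug is hypothetical and only there to give the calls somewhere to live; the linked post covers lookup() properly.

```puppet
# Hypothetical class, purely for illustration
class profile::ntp_debug {
  # Simplest form: look up the key, fail if it is not found anywhere
  $config = lookup('ntp::config')

  # Options hash form: assert the data type and supply a fallback value
  $keys_file = lookup('ntp::keys_file', {
    'value_type'    => String,
    'default_value' => '/etc/ntp.keys',
  })

  notice("config=${config} keys_file=${keys_file}")
}
```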
Am I correct in thinking that this only supports json and yaml backends?
I’m interested in using it to provide data for my site specific “profiles” module, but we primarily use the eyaml backend as it provides encryption for secrets.
correct, I think there are plans to create a new eyaml compatible backend though.
That said it seems weird that your module data would have eyaml stuff in, that sounds like site/environment data instead of module data, but I guess you might have your reasons.
I think I am missing the point here. Is this just to get rid of a couple of case statements in the params file?
Before you had all your configuration in one place with some simple case statements and some clearly defined variables.
Now you have to parse the hiera.yaml to work out which set of files the data is stored in, then open all those files and compile a mental map of all the data then map that back to the code that is running.
If the ntp key moved in Debian v9 and above, in the param.pp you could add an easy to read if statement in the same location. In the hiera example would you have to create a new level in the hierarchy which then has a file for v9, v10 etc?
Maybe it’s just me who sees the “spaghetti” code as something that is much easier to parse than the meatballs on the floor set of files we just created in hiera. 🙂
You don’t have to use it 🙂 But see the link about “The problem with params.pp” for detail, on a busy module with many contributors and so forth, it really isn’t all that easy to manage params.pp.
There are benefits around the doc generation tools being able to introspect the module to see what it wants, data types wise and merging etc, so automatic docs can be generated that reflect reality MUCH better than before, since those tools can’t parse your case statements.
I’m trying to grok this still so I have had a small go at laying out the hiera.yaml for the ntp module.
Part of the code in params selects based on Operatingsystem, and for SLES and OpenSuse there are different case statements using major release and release as the keys.
My first attempt:
name: Operatingsystem family
path: "os/%{facts.os.family}"
name: "Os Major release"
path: "os/release/major/%{facts.os.majorrelease}"
name: "Os release"
path: "os/release/%{facts.os.release}"
But that would mean you could get clashes between different operating systems with the same version etc.
name: Operatingsystem family
path: "os/%{facts.os.family}"
name: "Os Major release"
path: "os/%{facts.os.family}/%{facts.os.majorrelease}"
name: "Os release"
path: "os/%{facts.os.family}/%{facts.os.release}"
Would I end up with “os/%{facts.os.family}/name/%{facts.os.name}/release/%{facts.family.os.release}” to avoid possible overlaps or am I missing things altogether? I assume I don’t want to set data for every major “12” release so I need to be clear that it’s Suse “12” not Fedora.
Is this then not simply a case statement by another name? I am listing out each branch of the case statement and putting a bit of data in each list. I have just replaced a case statement with a directory structure.
Is there a simpler way to do the hiera.yaml layout in this case?
Also the “you’re not supported” case, is that just set in the common.yaml and overridden in the supported cases?
I can see some of the use cases mentioned in the problem with params.pp post being relevant to a case where you want to override values locally, but even with the params.pp code that is still possible. I don’t think what params.pp is doing is data but code. There are decisions being made, code is being run. By pretending that it is data you have moved code into the hiera.yaml and now you have code in your Hiera data.
See https://github.com/DavidS/puppetlabs-ntp/tree/ntp-next where this is already done
An interesting example, but I can’t see any of the benefits that you were talking about in your problem with params.pp post in that code. Other than maybe if you really don’t like reading Puppet code.
The hiera.yaml has replaced the case statement in the code with a cascading if statement. The variables are now moved out into a set of arbitrarily named files.
Just as before, the user of the module can override what they like by setting class parameters on the module. They can do that in their own code or using their own data source.
I know you have a lot more experience of this than me and I am honestly not trying to troll you but I am really struggling to see the benefits of this particular use of hiera.
That’s fine, each to their own. Many people don’t like Hiera on a base level. Others do. You can pick what you like.
Data in modules resonates with many, especially when combined with lookup options and types; for others it is just too much effort.
Pick what you like and works for you.