Hiera: a pluggable hierarchical data store


Note: This project is now being managed by Puppetlabs, its new home is http://projects.puppetlabs.com/projects/hiera

In my previous post I presented a new version of extlookup that is pluggable. This is fine but it’s kind of tightly integrated with Puppet and hastily coded. That code works – and people are using it – but I wanted a more mature and properly standalone model.

So I wrote a new standalone non-puppet related data store that takes the main ideas of using Hierarchical data present in extlookup and made it generally available.

I think the best model for representing key data items about infrastructure is using a Hierarchical structure.

The image above shows the data model visually, in this case we need to know the Systems Administrator contact as well as the NTP servers for all machines.

If we had production machines in dc1, dc2 and our dev/testing in our office this model will give the Production machines specific NTP servers while the rest would use the public NTP infrastructure. DC1 would additional have a specific Systems Admin contact, perhaps it’s outsourced to your DR provider.

This is the model that extlookup exposed to Puppet and that a lot of people are using extensively.

Hiera extracts this into a standalone project and ships with a YAML backend by default, there are also JSON and Puppet ones available.

It extends the old extlookup model in a few key ways. It has configuration files of it’s own rather than rely on Puppet. You can chain multiple data sources together and the data directories are now subject to scope variable substitution.

The chaining of data sources is a fantastic ability that I will detail in a follow up blog post showing how you would use this to create reusable modules and make Puppet parametrized classes usable – even without an ENC.

It’s available as a gem using the simple gem install hiera and the code is on GitHub where there is an extensive README. There is also a companion project that let you use JSON as data store – gem install hiera-json. These are the first Gems I have made in years so no doubt they need some love, feedback appreciated in GitHub issues.

Given the diagram above and data setup to match you can query this data from the CLI, examples of the data is @ GitHub:

$ hiera ntpserver location=dc1

If you were on your Puppet Master or had your node Fact YAML files handy you can use those to provide the scope, here the yaml file has a location=dc2 fact:

$ hiera ntpserver --yaml /var/lib/puppet/yaml/facts/example.com

I have a number of future plans for this:

  • Currently you can only do priority based searches. It will also support merge searches where each tier will contribute to the answer. The answer will be a merged hash
  • A Puppet ENC should be written based on this data. This will require the merge searches mentioned above.
  • More backends
  • A webservice that can expose the data to your infrastructure
  • Tools to create the data – this should really be Foreman and/or Puppet Dashboard like tools but certainly CLI ones should exist too.

I have written a new Puppet backend and Puppet function that can query this data. This has turned out really great and I will make a blog post dedicated to that later, for now you can see the README for that project for details. This backend lets you override in-module data supplied inside your manifests using external data of your choice. Definitely check it out.