{"id":2112,"date":"2011-06-05T16:17:19","date_gmt":"2011-06-05T15:17:19","guid":{"rendered":"http:\/\/www.devco.net\/?p=2112"},"modified":"2011-10-27T23:29:39","modified_gmt":"2011-10-27T22:29:39","slug":"hiera_a_pluggable_hierarchical_data_store","status":"publish","type":"post","link":"https:\/\/www.devco.net\/archives\/2011\/06\/05\/hiera_a_pluggable_hierarchical_data_store.php","title":{"rendered":"Hiera: a pluggable hierarchical data store"},"content":{"rendered":"

Note:<\/strong> This project is now managed by Puppet Labs; its new home is http:\/\/projects.puppetlabs.com\/projects\/hiera<\/a><\/p>\n

In my previous post I presented a new version of extlookup that is pluggable<\/a>. This is fine, but it’s tightly integrated with Puppet and hastily coded. That code works – and people are using it – but I wanted a more mature and properly standalone model.<\/p>\n

So I wrote a new standalone data store, unrelated to Puppet, that takes the main idea of hierarchical data from extlookup and makes it generally available.<\/p>\n

I think the best model for representing key data items about infrastructure is a hierarchical structure.<\/p>\n

<\/center><\/p>\n

The image above shows the data model visually. In this case we need to know the Systems Administrator contact as well as the NTP servers for all machines.<\/p>\n

If we had production machines in dc1 and dc2, and our dev\/testing machines in our office, this model would give the production machines specific NTP servers while the rest would use the public NTP infrastructure. DC1 would additionally have a specific Systems Admin contact; perhaps it’s outsourced to your DR provider.<\/p>\n

This is the model that extlookup exposed to Puppet<\/a> and that a lot of people are using extensively. <\/p>\n
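As a rough illustration of that model (not Hiera's actual code), the lookup can be sketched as a walk from the most specific data source to the least specific one, returning the first value found. The method names, the `common` level, the contact addresses and the public pool host are assumptions made up for this sketch; only the dc1 and dc2 NTP server names match the CLI examples in this post.

```ruby
# Hypothetical data mirroring the diagram: one data source per hierarchy level.
# The contact addresses and the public pool host are illustrative only.
def example_data
  {
    'dc1'    => { 'ntpserver' => 'ntp1.dc1.example.com', 'sysadmin' => 'dc1-admin@example.com' },
    'dc2'    => { 'ntpserver' => 'ntp1.dc2.example.com' },
    'common' => { 'ntpserver' => '1.pool.ntp.org', 'sysadmin' => 'sysadmin@example.com' }
  }
end

# Walk the hierarchy from most specific to least specific and return the
# first value found, or nil when no level defines the key.
def lookup(key, scope, data = example_data)
  hierarchy = [scope['location'], 'common'].compact
  hierarchy.each do |source|
    level = data[source]
    return level[key] if level && level.key?(key)
  end
  nil
end

puts lookup('ntpserver', { 'location' => 'dc1' })    # dc1 has its own NTP server
puts lookup('sysadmin',  { 'location' => 'dc2' })    # dc2 falls back to common
puts lookup('ntpserver', { 'location' => 'office' }) # unknown location falls back to common
```

A machine only needs an entry at the level where its data differs from the defaults, which is what keeps the data small as the number of locations grows.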

Hiera extracts this into a standalone project and ships with a YAML backend by default; JSON and Puppet backends are also available.<\/p>\n

It extends the old extlookup model in a few key ways. It has configuration files of its own rather than relying on Puppet. You can chain multiple data sources together, and the data directories are now subject to scope variable substitution.<\/p>\n
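To make the last two points concrete, here is a minimal sketch of the ideas rather than Hiera's implementation. It assumes a `%{name}` placeholder syntax for substitution and a first-answer-wins rule for chained sources; the method names, the sample sources and the directory path are hypothetical.

```ruby
# Expand %{name} placeholders in a template from the scope.
# Kernel#format supports %{name} references when given a hash with symbol keys.
def expand(template, scope)
  format(template, scope.transform_keys(&:to_sym))
end

# Two hypothetical data sources, each a callable returning a value or nil.
def example_sources
  [
    ->(key, _scope) { { 'ntpserver' => 'ntp1.dc1.example.com' }[key] },
    ->(key, _scope) { { 'sysadmin' => 'sysadmin@example.com' }[key] }
  ]
end

# Ask each source in order; the first non-nil answer wins.
def chained_lookup(key, scope, sources = example_sources)
  sources.each do |source|
    answer = source.call(key, scope)
    return answer unless answer.nil?
  end
  nil
end

puts expand('/etc/hieradata/%{location}', { 'location' => 'dc2' })
puts chained_lookup('sysadmin', { 'location' => 'dc2' })
```

In this sketch a key missing from the first source is simply answered by the next one, so a later source can supply defaults for an earlier one.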

The chaining of data sources is a fantastic ability that I will detail in a follow-up blog post showing how you would use this to create reusable modules and make Puppet parametrized classes usable – even without an ENC.<\/p>\n

It’s available as a gem using the simple gem install hiera<\/em> and the code is on GitHub where there is an extensive README<\/a>. There is also a companion project that lets you use JSON as a data store – gem install hiera-json<\/em>. These are the first Gems I have made in years, so no doubt they need some love; feedback is appreciated in GitHub issues.<\/p>\n

Given the diagram above and data set up to match, you can query this data from the CLI; examples of the data are on GitHub:<\/p>\n

<\/p>\n

\r\n$ hiera ntpserver location=dc1\r\nntp1.dc1.example.com\r\n<\/pre>\n

<\/code><\/p>\n

If you are on your Puppet Master or have your node fact YAML files handy, you can use those to provide the scope; here the YAML file has a location=dc2<\/em> fact:<\/p>\n

<\/p>\n

\r\n$ hiera ntpserver --yaml \/var\/lib\/puppet\/yaml\/facts\/example.com\r\nntp1.dc2.example.com\r\n<\/pre>\n

<\/code><\/p>\n

I have a number of future plans for this:<\/p>\n