Shiny new things in Puppet 4


Puppet 4 has been out a while but given the nature of the update – new packaging requiring new modules to manage it etc I’ve been reluctant to upgrade and did not have the time really. Ditto for Centos 7. But Docker will stop supporting Centos 6 Soon Now so this meant I had to look into both a bit closer.

Puppet 4 really is a whole new thing, it maintains backward compatibility but really in terms of actually using its features I think you’d be better off just starting fresh. I am moving the bulk of my services out of CM anyway so my code base will be tiny so not a big deal for me to just throw it all out and start fresh.

I came across a few really interesting new things amongst it’s many features and wanted to highlight a few of these. This is by no means an exhaustive list, it’s just a whirlwind tour of a few things I picked up on.

The Forge

Not really a Puppet 4 thing per se but more a general eco system comment. I have 23 modules in my new freshly minted Puppet repo with 13 of them coming from the forge. To my mind that is a really impressive figure, it makes the new starter experience so much better.

Things I still do on my own: exim, iptables, motd, pki, roles/profiles of course and users.

In the case of exim I have almost no config, it’s just a package/config/service and all it does is setup a local config that talks to my smart relays. It does use my own CA though and that’s why I also have my own PKI module to configure the CA and distribute certs and keys and such. The big one is iptables really and I just haven’t had the time to really consider a replacement – whatever I choose it needs to play well with docker and that’s probably going to be a tall order.

Anyway, big kudos on the forge team and shout outs to forge users: puppetlabs, jfryman, saz and garethr.

Still some things to fix – puppet module tool is pretty grim wrt errors and feedback and I think there’s work left to do on discoverability of good modules and finding ways to promote people investing time in making better ones, but this is a big change from 2 years ago for sure.

Puppet 4 Type System

Puppet 4 has a data type system, it’s kind of optional which is weird as things go but you can almost think of it like a built in way to do validate_hash and friends, almost. The implications of having it though are huge – it means down the line there will be a lot fewer edge cases with things just behaving weirdly.

Data used to go from hiera to manifests and ending up strings when the data was Boolean now Puppet knows about actual Booleans and does not mess it up – things will become pretty consistant and solid and it will be easy to write well behaved code.

For now though it’s the opposite, there are many more edge cases as a result of it.

Particularly functions that previously took a number and did something with it might have assumed the number was a string with a number in it. Now it’s going to get an actual number and this causes breakage. There are a few of these in stdlib but they are getting fixed – expect this will catch out many templates and functions so there will be a settling in period but it’s well worth the effort.

Here’s an example:

define users::user(
  Enum["present", "absent"] $ensure       = "present",
  Optional[String] $ssh_authorized_file   = undef,
  Optional[String] $email                 = undef,
  Optional[Integer] $uid                  = undef,
  Optional[Integer] $gid                  = undef,
  Variant[Boolean, String] $sudoer        = false,
  Boolean $setup_shell                    = false,
  Boolean $setup_rbenv                    = false
) {

If I passed ensure => bob to this I get:

Error: Expected parameter 'ensure' of 'Users::User[rip]' to have type Enum['present', 'absent'], got String

Pretty handy though the errors can improve a lot – something I know is on the road map already.

You can get pretty complex with this like describe the entire contents of a hash and Puppet will just ensure any hash you receive matches this, doing this would have been really hard even with all the stuff in old stdlib:

Struct[{mode            => Enum[read, write, update],
        path            => Optional[String[1]],
        NotUndef[owner] => Optional[String[1]]}]

I suggest you spend a good amount of time with the docs About Values and Data Types, Data Types: Data Type Syntax and Abstract Data Types. There are many interesting types like ones that do Pattern matching etc.

Case statements and Selectors have also become type aware as have normal expressions to test equality etc:

$enable_real = $enable ? {
  Boolean => $enable,
  String  => str2bool($enable),
  Numeric => num2bool($enable),
  default => fail('Illegal value for $enable parameter'),
if 5 =~ Integer[1,10] {
  notice("it's a number between 1 and 10")

It’s not all wonderful though, I think the syntax choices are pretty poor. I scan parameter lists: a) to discover module features b) to remind myself of the names c) to find things to edit. With the type preceding the variable name every single use case I have for reading a module code has become worse and I fear I’ll have to resort to lots of indention to make the var names stand out from the type definitions. I cannot think of a single case where I will want to know the variable data type before knowing it’s name. So from a readability perspective this is not great at all.

Additionally I cannot see myself using a Struct like above in the argument list – to which Henrik says they are looking to add a typedef thing to the language so you can give complex Struc’s a more convenient name and use that. This will help that a lot. Something like this:

type MyData = Struct[{ .... }]
define foo(MyData $bar) {

That’ll be handy and Henrik says this is high on the priority list, it’s pretty essential from a usability perspective.

UPDATE: As of 4.4.0 this has been delivered, see Puppet 4 Type Aliases

Native data merges

You can merge arrays and hashes easily:

$ puppet apply -e '$a={"a" => "b"}; $b={"c" => "d"}; notice($a+$b)'
Notice: Scope(Class[main]): {a => b, c => d}
$ puppet apply -e 'notice([1,2,3] + [4,5,6])'
Notice: Scope(Class[main]): [1, 2, 3, 4, 5, 6]

And yes you can now use a ; instead of awkwardly making new lines all the time for quick one-liner tests like this.

Resource Defaults

There’s a new way to do resource defaults. I know this is a widely loathed syntax but I quite like it:

file {
    mode   => '0600',
    owner  => 'root',
    group  => 'root',
    ensure => file,
    mode => '0644',

The specific mode on /etc/ will override the defaults, pretty handy. And it address a previous issue with old style defaults that they would go all over the scope and make a mess of things. This is confined to just these files.

Accessing resource parameter values

This is something people often ask for, it’s seems exciting but I don’t think it will be of any practical use because it’s order dependant just like defined().

notify{"hello": message => "world"}
$message = Notify["hello"]["message"]  # would be 'world'

So this fetches another resource parameter value.

You can also fetch class parameters this way but this seems redundant, there are several ordering caveats so test your code carefully.


This doesn’t really need comment, perhaps only OMFG FINALLY is needed.

["puppet", "facter"].each |$file| {
    ensure => "link",
    target => "/opt/puppetlabs/bin/${file}"

More complex things like map and select exist too:

$domains = ["", ""]
$domain_definition = $domains.reduce({}) |$memo, $domain| {
  $memo + {$domain => {"relay" => "mx.${domain}"}}

This yields a new hash made up of all the parts:

 "" => {"relay" => ""},
 "" => {"relay" => ""}

See Iterating in Puppet for more details on this.

. syntax

If you’re from Ruby this might be a bit more bearable, you can use any function interchangably it seems:

$x = join(["a", "b"], ",")
$y = ["a", "b"].join(",")

Both result in a,b

Default Ordering

By default it now does manifest ordering. This is a big deal, I’ve had to write no ordering code at all. None. Not a single require or ordering arrows. It’s just does things top down by default but parameters like notifies and specific requires influence it. Such an amazingly massive time saver. Good times when things that were always obviously dumb ideas goes away.

It’s clever enough to also do things in the order they are included. So if you had:

class myapp {
  include myapp::install
  include myapp::config
  include myapp::service

Ordering will magically be right. Containment is still a issue though.

Facts hash

Ever since the first contributor summit I’ve been campaigning for $facts[“foo”] and it’s gone all round with people wanting to invent some new hash like construct and worse, but finally we have now a by default enabled facts hash.

Unfortunately we are still stuck with $settings::vardir but hopefully some new hash will be created for that.

It’s a reserved word everywhere so you can safely just do $facts[“location”] and not even have to worry about $::facts, though you might still do that in the interest of consistency.

Facter 3

Facter 3 is really fast:

$ time facter
facter  0.08s user 0.03s system 44% cpu 0.248 total

This makes everything better. It’s also structured data but this is still a bit awkward in Puppet:

$x = $facts["foo"]["bar"]["baz"]

There seems to be no elegant way to handle a missing ‘foo’ or ‘bar’ key, things just fail badly in ways you can’t catch or recover from. On the CLI you can do facter so we’re already careful to not have “.” in a key. We need some function to extract data from hashes like:

$x = $facts.fetch("", "default")

It’ll make it a lot easier to deal with.

Hiera 3

Hiera 3 is out and at first I thought it didn’t handle hashes well, but it does:

  - "%{facts.fqdn}"
  - "location_%{facts.location}"
  - "country_%{}"
  - common

That’s how you’d fetch values out of hashes and it’s pretty great. Notice I didn’t do ::facts that’s because facts is reserved so there’ll be no scope layering issues.

Much better parser

You can use functions almost everywhere:

$ puppet apply -e 'notify{hiera("rsyslog::client::server"): }'

There are an immeasurable amount of small improvements in things the old parser did badly, now it’s really nice to use, things just work the way I expect them to do from other languages.

Even horrible stuff like this works:

$x = hiera_hash("something")["foo"]

Which previously needed an intermediate variable.

puppet apply –test

A small thing but –test in apply now works like in agent – colors, verbose etc, really handy.

Data in Modules

I did a PoC to enable Hiera in modules a few years ago and many people loved the idea. This has finally landed in recent Puppet 4 versions and it’s pretty awesome. It lets you have a data directory and hiera.yaml in your module, this goes some way towards removing what is currently done with params.pp

I wrote a blog post about it: Native Puppet 4 Data in Modules. An additional blog post that covers this is params.pp in Puppet 4 which shows how it ties together with some other new things.

Native create_resources

create_resources is a hack that exists because it was easier to hack up than fix Puppet. Puppet has now been fixed, so this is the new create_resources.

each($domains) |$name, $domain| { 
    * => $domain

See Iterating in Puppet for extensive examples.

No more hiera_hash() and hiera_array()

There’s a new function called lookup() that’s pretty powerful. When combined with the new Data in Modules feature you can replace these functions AND have your automatic parameter lookups do merges.

See Puppet 4 data lookup strategies for an extensive look at these

Native language functions

You can now write name spaced functions using the Puppet DSL:

function site::fetch(
  Hash $data,
  String $key,
) {
  if $data[$key] {
  } else {

And you’ll use this like any function really:

$h = {}
$item = site::fetch($h, "thing", "default") # $item is now 'default'

It also has a splat argument handler:

function site::thing(*$args) {
$list = site::thing("one", "two", "three") # $list becomes ["one", "two", "three"]

AIO Packaging

Of course by now almost everyone know we’re getting omnibus style packaging. I am a big supporter of this direction, the new bundled ruby is fast and easy to get onto older machines.

The execution of this is unspeakably bad though. It’s so half baked and leave so much to be desired.

Here’s a snippet from the current concat module:

    if defined('$is_pe') and str2bool("${::is_pe}") { # lint:ignore:only_variable_string
      if $::kernel == 'windows' {
        $command_path = "${::env_windows_installdir}/bin:${::path}"
      } else {
        $command_path = "/opt/puppetlabs/puppet/bin:/opt/puppet/bin:${::path}"
    } elsif $::kernel == 'windows' {
      $command_path = $::path
    } else {
      $command_path = "/opt/puppetlabs/puppet/bin:${::path}"
       path => $command_path

There are no words. Without this abomination it would try to use system ruby to run the #!/usr/bin/env ruby script. Seriously, if something ships that cause this kind of code to be written by users you’ve failed. Completely.

Things like the OS not being properly setup with symlinks into /usr/bin – can kind of understand it to avoid conflicts with existing Puppet, but meh, it just makes it feel unpolished and as if it comes without batteries included the RPM conflicts with puppet < 4.0.0 so it’s not that, it’s just comes without batteries included.

The file system choices are completely arbitrary:

# puppet apply --configprint vardir

This is intuitive to exactly no-one who has ever used any unix or windows or any computer.

Again, I totally support the AIO direction but the UX of this is so poor that while I’ve been really positive about Puppet 4 up to now I’d say this makes the entire thing be Alpha quality. The team absolutely must go back to the drawing board and consider how this is done from the perspective of usability by people who have likely used Unix before.

Users have decades of experience to build on and the system as a whole need to be coherent and compliment them – it should be a natural and comfortable fit. This and many other layout choice just does not make sense. Sure the location is arbitrary it makes no technical different if it’s in /opt/puppetlabs/puppet/cache or some other directory.

It DOES though make a massive difference cognitively to users when thinking of the option vardir and think of their entire career experience of what that mean and then cannot for the life of them find the place these files go without having to invest effort in finding it and then having to remember it as a odd one out. Even knowing things are in $prefix you still can’t find this dir because it’s now been arbitrarily renamed to cache and instead of using well known tools like find I now have to completely context switch.

Not only is this a senseless choice but frankly it’s insulting that this software seems to think it’s so special that I have to remember their crappy paths differently from any of the other 100s of programs out there. It’s not, it’s just terrible and makes it a nightmare to use. Sure put the stuff in /opt/puppetlabs, but don’t just go and make things up and deviate from what we’ve learned over years of supporting Puppet. It’s an insult.

Your users have invested countless hours in learning the software, countless hours in supporting others and in some cases Paid for this knowledge. Arbitrarily changing vardir to mean cache trivialise that investment and puts a unneeded toll on those of us who support others in the community.


There’s a whole lot more to show about Puppet 4, I’ve only been at it for a few nights after work but overall I am super impressed by the work done on Puppet Core. The packaging lets the efforts down and I’d be weary of recommending anyone go to Puppet 4 as a result, it’s a curiosity to investigate in your spare time while hopefully things improve on the packaging front to the level of a production usable system.