How I Classify Puppet Nodes

The basics of defining which modules get applied to a particular node are really simple in Puppet. Out of the box you just use the hostname and the FQDN and everyone is happy. You find this everywhere in documentation, blog posts, presentations, etc. However, it has a problem: scale.

What if you have an elastic infrastructure with nodes being created and destroyed automatically? What if you want to use the same manifests in different environments, but with different hostnames? What if you have stupidly complex host naming conventions that you cannot get your head round (current day job problem for me :-( )?

In all these cases and more, using the hostname to classify the node falls down. I like to add a role that can then be accessed in two ways. With Hiera, one could do something like:

  - "nodes/%{::trusted.certname}"
  - "roles/%{role}"
  - "%{environment}"
  - "%{osfamily}-osreleasemajor"
  - global
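
To make the lookup order concrete, here is a rough Ruby sketch of how a subset of those hierarchy levels would expand for one node. It is not Hiera's implementation: the expand_hierarchy helper and the sample values are made up for illustration, and it only handles simple %{name} tokens.

```ruby
# Illustrative only: mimics how Hiera-style "%{name}" tokens expand.
# expand_hierarchy and the sample values below are hypothetical.
HIERARCHY = [
  'nodes/%{::trusted.certname}',
  'roles/%{role}',
  '%{environment}',
  'global'
].freeze

def expand_hierarchy(levels, scope)
  levels.map do |level|
    # Replace each %{key} (with or without a leading ::) by its value
    # from scope; unknown keys expand to an empty string.
    level.gsub(/%\{(?:::)?([^}]+)\}/) { scope.fetch(Regexp.last_match(1), '') }
  end
end

facts = {
  'trusted.certname' => 'lb01.example.com',
  'role'             => 'loadbalancer',
  'environment'      => 'production'
}

puts expand_hierarchy(HIERARCHY, facts)
```

With a role of loadbalancer, the second level becomes roles/loadbalancer, so every load balancer shares that data file regardless of its hostname.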

And within site.pp we can add a simple case statement:

node default {
  case $::role {
    'loadbalancer': {
      class { 'haproxy': }
    }
    'db': {
      class { 'mysql': }
    }
    default: {
      notify { 'no specific classes assigned': }
    }
  }
  # Applied to every node, whatever its role
  class { 'security': }
}

Now, we can still classify nodes individually, but there is something in between the wider environment and OS categories that we can define ourselves. Of course we now need to define the role, which ranges from simple to complex, and some cases are still not completely clear in my head.

I create a custom role fact that my manifests will look at. This is universal: no matter what mechanism is used to populate that fact, it is the only place I will look in my Puppet code.

When your nodes are under OpenStack or EC2, this is simple. They both have the concept of user-defined metadata as key-value pairs. I simply add a role pair:

nova meta <instance-id> set role=loadbalancer

You can also set this when you create the instance.

nova boot --meta role=loadbalancer --<other-settings> <hostname>

Now we just need the fact to look it up.

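require 'net/http'
require 'json'
require 'uri'

module RoleModule
  def self.add_facts
    Facter.add("role") do
      # Decide how to resolve the role from the hardware/platform type
      productname = Facter.value(:productname)
      case productname
      when 'OpenStack Nova'
        setcode do
          # URL of the metadata service's JSON document
          url = ""
          uri = URI.parse(url)
          http = Net::HTTP.new(uri.host, uri.port)
          response = http.get(uri.path)
          # Assumes user-defined pairs are returned under the "meta" key
          JSON.parse(response.body)['meta']['role']
        end
      when 'ProLiant MicroServer'
        setcode do
          'compute' # lab compute node; the exact role name is assumed
        end
      end
    end
  end
end

RoleModule.add_facts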

What is happening here? First it checks the productname fact so it can work out what to do. If that is OpenStack Nova then it knows that it needs to look in the OpenStack metadata service. Our key/value pair is returned as part of that JSON data and is pushed into the role fact.

Likewise, if the productname is an HP Microserver, it will always be a lab compute node (in my case).
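
The decision logic described above can be pulled out into a plain Ruby function, which makes it easy to exercise without Facter or a live metadata service. This is only a sketch: role_for is a hypothetical helper, the 'compute' role name is an assumption, and it assumes the user-defined pairs sit under a "meta" key in the metadata JSON.

```ruby
require 'json'

# Hypothetical helper: picks a role from the product name and, for
# OpenStack instances, from an already-fetched metadata JSON document.
def role_for(productname, metadata_json = nil)
  case productname
  when 'OpenStack Nova'
    return nil if metadata_json.nil?
    # Assumes user-defined key/value pairs live under the "meta" key.
    JSON.parse(metadata_json).dig('meta', 'role')
  when 'ProLiant MicroServer'
    'compute' # assumed role name for the lab compute nodes
  end
end

sample = '{"uuid": "abc", "meta": {"role": "loadbalancer"}}'
puts role_for('OpenStack Nova', sample)
puts role_for('ProLiant MicroServer')
```

Any other product name returns nil, so the role fact stays unset and the default branch of the case statement in site.pp applies.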

Physical machines otherwise fall down here. There is no way to dynamically modify their role, but I have a couple of solutions:

The important part with both of these is that the classification is totally separate from my Puppet code.