Chapter 4. Puppet Module Design

Puppet modules are self-contained bundles of code and data. A module extends Puppet features with any number of the following optional components:

  • Manifests written in the Puppet language

  • Files and templates used by the module

  • Facts about the node

  • Parsing and manipulation functions

  • New resource types

  • New providers for existing resource types

  • Module-specific data files or providers

  • Unit and acceptance tests and test fixtures

This chapter focuses on module manifests containing Puppet language code and their resources: files, templates, metadata, and tests. We cover these additional aspects of module design in the following chapters:

  • Creation of modules for roles and profiles in Chapter 7

  • Distribution and deployment of modules in Chapter 9

  • Extending modules with plugins in Chapter 10

The Puppet Development Kit

This section examines the minimum set of tools necessary for installing, testing, and extending Puppet modules.

Installing the Puppet Agent

It might seem obvious, but Puppet novices often forget the value of installing the Puppet agent on their developer machines. Testing locally with Puppet can catch common errors that would otherwise result in a long push, run, fail, repeat-until-success cycle.

Using the Ruby that Comes Bundled with Puppet

If you’ve been working on Puppet for many years, you probably have configured a wide array of Ruby environment managers for testing your Puppet code. If you are new to all of this, we have good news: This is no longer necessary. If you don’t maintain other Ruby projects, ignore old instructions that tell you to install Ruby Version Manager (RVM) or rbenv and multiple Ruby interpreters on your workstation. The Puppet Development Kit (PDK) has made that environment setup obsolete.

Installing the Puppet Development Kit

The PDK provides all of the Ruby environment, dependency management, and testing tools necessary for testing Puppet code. You can download the PDK from Puppet's website.

All of the following instructions that you might find on the internet have been rendered obsolete by the PDK:

  • You no longer need to manually install Ruby gems for testing. Add them to the module’s Gemfile, and the PDK will install them for you.

  • You no longer need to run bundle install or prefix each test command with bundle exec.

  • You no longer need to run puppet parser validate, puppet-lint, or rubocop. pdk validate does all of that and more.

  • You no longer need to find and add Puppet testing libraries to your Gemfile. The PDK installs every one we’ve used before.

  • You no longer need to define standard node facts for rspec; the PDK includes a complete set of node facts for you and provides a single file to add your own customizations.

  • You no longer need to build test environments for each version of the Puppet agent and/or Ruby. The PDK bundles and tests all supported versions.

In short, you no longer need to be an expert in Ruby testing toolkits to test modules for a given purpose. For 90% of testing situations, the PDK will do everything you need right out of the box.

Favor Editors or IDEs with Puppet Plugins

You can use any IDE or text editor for Puppet development. Because the recommendations are basically the same for all of them, we list only the features that you will most likely want to use:

  • Enable syntax highlighting for Puppet, Ruby, JSON, YAML, and HOCON.

  • Install plugins for the editor that provide inline Puppet code validation.

  • Configure the editor to strip carriage returns automatically from Puppet, Ruby, JSON, HOCON, and YAML files.

  • Ensure that the editor can preserve existing line endings if you need to edit Windows configuration files on Linux, or vice versa.

If you have nodes from heterogeneous environments, ensure that your editor can handle linefeed conversion properly.

Tip

At the time this was written, Visual Studio Code was the best free tool for editing Puppet code and data in Windows. It preserves the original line endings used in any file it opens.

Using Vendor-Provided or Community Modules

Before you begin the process of developing a new module, check what’s already available on the Puppet Forge. In many cases, someone might already have developed a module suitable for your needs. It’s almost always easier to write a profile that customizes one or more component modules from the Forge than to build your own from scratch. At the very least, it would give you a head start on development.

Modules from the Puppet Forge have a number of benefits:

  • You don’t need to spend time developing or updating the module.

  • Quality control is performed by a large community of users (not just you).

  • Bug fixes and new features developed by others appear without effort on your part.

  • Public modules often include good documentation and test scenarios.

For all of these reasons, begin by finding a module that does what you need. If you find the need to expand the module, submit your improvements back to the community.

Picking Good Modules

There are thousands of modules available on the Forge, and for the most part, Puppet does a good job of highlighting the best modules. Download statistics, ratings, platform support, test results, and module documentation are published.

There are many well-written modules on the Puppet Forge. There are also half-baked ideas that were never finished. The Forge is not a curated list; anyone can write and publish a module of any quality. The following markers provide indicators of quality design:

  • Modules marked supported are officially supported by Puppet and used in the Puppet Enterprise product. There are millions of users for these modules.

  • Modules marked approved are certified by Puppet to meet its standards for high quality. Approval also indicates that the module is the best-in-class module for that particular need. Puppet does not approve two modules that do the same thing.

  • A high community score. This number is a star-vote by community members who have downloaded it.

  • A high code-quality score. This is an automated code-quality analysis, the detailed results of which you can read on the module's Forge page.

Module Checklist

If you’ve found a large number of modules that may meet your needs, the following list provides key points to review when choosing among them:

Module completeness:

❏ Is the module well documented?

❏ Does the module include unit tests?

❏ Does the module include acceptance tests?

❏ Has the module received updates recently?

❏ Does the module support the operating systems or applications frameworks you use?

Code quality:

❏ Does the module appear to follow the single responsibility principle?

❏ Is the module supported or approved by Puppet?

❏ Does the module follow the Puppet Labs style guide?

❏ Does the module conform to module development best practices?

Local needs:

❏ Does the module license meet your requirements?

❏ Does the module pull in a lot of dependencies? Will those dependencies conflict with modules you are using today?

❏ Is the module source hosted in a public location (e.g., GitHub)? Do the authors allow you to submit pull requests back to them?

Testing/integration needs:

❏ Is the author responsive to pull requests and issues?

❏ Do the unit tests pass in your environment?

❏ Will the acceptance tests operate in your environment?

In general, if any one of these items is missing, this is not necessarily a deal-breaker, but if large numbers of your requirements are not met, it might be useful to investigate other modules.

Module Applicability to Your Needs

Sometimes, otherwise good modules simply won’t be suitable for your site, or your specific needs. It’s important to carefully consider your own use case when selecting a module. Following are some key questions to consider:

Platform support

Does the module support the platforms you have deployed?

If all nodes run CentOS Linux, there’s a high probability most published modules will work. If you must support Solaris or Windows nodes, your choices might be somewhat more limited. Although there are great multiplatform modules available, many published modules support only a few platforms. If the module doesn’t support your platform, consider how difficult it would be to wrap or extend the module. Otherwise, you can fork the module and submit changes back to the original author.

Features

Does the module support the application features you require?

Some modules manage only specific features of an application or service. Other modules can be incredibly comprehensive in their ability to manage every aspect of the application. It will be fairly easy to establish the difference between these by reading through the manifests.

Scaling

What are your scaling requirements and limitations? The design considerations that went into creating the module might create unnecessary complexity. Or the implementation used might not be flexible enough to scale to your deployment size. What is the sweet spot between enough features and brutally fast simplicity for your needs?

Modules that rely on exported resources will require an investment in PuppetDB infrastructure to provide the resource storage and query interfaces. Is that infrastructure available everywhere you plan to use this?

If the module recursively synchronizes directories using Puppet file resources (which are processed in memory and superfast for small files or directories), it might not be suitable for synchronizing terabytes of files. A more specialized resource provider might be necessary for specialized needs.

The puppetlabs/apache module is very feature-complete and manages a lot of complexity to provide every Apache feature for any given application. This makes it very powerful and useful in nearly all scenarios. However, the complexity makes the module resource intensive to apply, which might be unsuitable for a large-scale bulk hosting provider that adds and removes a customer every few seconds.

Tip

The important point is to understand your needs. The vast majority of us need flexibility and features more than we need to dredge the last bit of performance gain from something that runs only periodically. Avoid taking on unnecessary effort and reinventing the wheel for a potentially unnecessary optimization.

Embracing and extending modules

If you find a module that meets most but not all of your requirements, you have two choices:

  • Write a wrapper module that depends on this module and then adds the missing bits.

  • Fork the module and add your extensions directly in the source tree.
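A wrapper of the first kind can be quite small. The following is a minimal sketch rather than a definitive implementation: the class name profile::ntp and the notes file are hypothetical, and it assumes that the puppetlabs/ntp module (with its servers parameter) is installed.

```puppet
# Hypothetical wrapper around the puppetlabs/ntp component module.
class profile::ntp (
  Array[String[1]] $servers = ['pool.ntp.org'],
) {
  # Delegate everything the upstream module already handles.
  class { 'ntp':
    servers => $servers,
  }

  # Add the missing bit locally (illustrative resource, not part of ntp).
  file { '/etc/ntp.local.notes':
    ensure  => file,
    content => "Managed by profile::ntp\n",
    require => Class['ntp'],
  }
}
```

Because the upstream module is left untouched, it can still be upgraded from the Forge without merging local changes.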

You should contribute improvements upstream whenever possible. Code accepted into the mainline branch will be maintained and tested by the original author and community. Improvements accepted into the mainline release have a much lower risk of being broken by future releases.

Tip

The r10k code deployment tool simplifies switching between an upstream module and an independent fork. When local changes are accepted upstream, you can flip back to the original module transparently. For more instructions on module management with r10k deployments, see Chapter 9.

Contributing Modules

If you’ve written a new and interesting module or if you’ve improved significantly upon modules already available on the Forge, consider publishing your module. The puppet module utility will package your module for upload to the Puppet Forge.

Designing Modules Well

Good module design relates strongly to good code design. We touched on a number of coding principles and practices in Chapter 3, including separation of concerns (SoC), the single responsibility principle, KISS (Keep It Simple), and Interface-Driven Design. We put these principles into practice in this chapter.

Make Use of Module Structure

Puppet modules are self-contained bundles of code and data. Being self-contained allows modules to be portable:

  • The module occupies a single directory in the module path.

  • Its code is loaded upon request (declaration).

  • Files and templates specific to the module are stored within their respective directories.

  • Facts and functions required by the module are within its lib/ directory.

  • Data provided by the module can be referenced within the module namespace.

Each directory in the module contains specific files. This structure allows those features to be autoloaded. Although it’s possible to use files outside of that structure, it defeats expectations, confuses the reader, and could likely fail in unexpected ways when Puppet is upgraded.
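To illustrate how names map onto the module structure, here is a sketch for a hypothetical apache module. The file names mime.types and httpd.conf.epp are assumptions made for the example; both files would need to exist in the module for this class to compile and apply.

```puppet
# Stored at <modulepath>/apache/manifests/config.pp, so Puppet can
# autoload it when apache::config is declared.
class apache::config {
  # Served from <modulepath>/apache/files/mime.types
  file { '/etc/httpd/conf/mime.types':
    ensure => file,
    source => 'puppet:///modules/apache/mime.types',
  }

  # Rendered from <modulepath>/apache/templates/httpd.conf.epp
  file { '/etc/httpd/conf/httpd.conf':
    ensure  => file,
    content => epp('apache/httpd.conf.epp'),
  }
}
```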

Keep the Module Focused

Your modules should have a clearly defined purpose, and the use case should be likewise clearly documented. When building a module, ask, “Is this data or dependency part of the application or does it contain information specific to my business?” Use the answer to this question to determine which of the following types most appropriately matches the need:

Component modules

Component modules should minimize dependencies and contain only the application-specific data needed for generic situations. Component modules should be free of business logic or data.

Profile modules

Profile modules supply the business logic and data to instruct the application of component modules. Profile modules can have many dependencies on component modules.

Design Modules for Public Consumption

You should design every module as if you plan on releasing it to the public. This is not to say that you should release all modules to the public—it means that the design patterns for public modules are the design patterns for creating good modules in general. Even if you never intend to release them, it helps you to maintain consistency.

Designing modules for portability ultimately makes them simpler to support and extend. It helps to eliminate technical debt. The design patterns for creating public modules encourage reuse. Reuse means that you won’t need to rewrite the module from scratch every time your requirements or environment change.

Planning and Scoping Your Module

Before you begin writing your module, it’s important to first determine your module’s scope. For many of us, our first instinct is to write a jumbo module that installs and manages all of its dependencies. Unfortunately, this approach tends to create problems down the line; such modules are often inflexible and become difficult to maintain. They can create compatibility problems when they manage a dependency that’s needed by another module.

You should always design your modules with SoC in mind (see “Separation of Concerns” and “The Single Responsibility Principle” for guidelines). As a rule of thumb, if a resource in your module might be declared in any other module, it should probably be its own module.

Dependencies outside the scope of your module should be externalized into their own modules. Domain logic should be handled using the roles and profiles pattern described in Chapter 7.

Tip

In many cases you can rely on the package manager to handle dependencies.

The Java dependency for Tomcat is a classic example of a dependency that you should externalize as domain logic. By separating the management of Java from the management of Tomcat, the process of upgrading Java is simplified, and potential conflicts with other modules also attempting to deploy Java are eliminated.
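As a sketch of this separation, a profile might pull the two component modules together as follows. The profile name is hypothetical, and the example assumes separate java and tomcat component modules that each provide a main class of the same name.

```puppet
# Hypothetical profile: the relationship between Java and Tomcat is
# domain logic, so it lives here rather than inside either module.
class profile::tomcat_server {
  include java
  include tomcat

  # Java is managed in exactly one place and ordered before Tomcat.
  Class['java'] -> Class['tomcat']
}
```

Another profile that needs Java can also include the java class without creating a duplicate declaration.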

Even if you never plan on distributing the module, you should design it as if you were. By designing your modules to be portable, you can adapt them to new requirements and new environments with minimal effort.

Basic Module Layout

A Puppet module has a fairly standardized structure made up of a number of optional components. The pdk new module command populates a module skeleton with the following elements:

manifests/

Puppet manifests written in the Puppet DSL

facts.d/

Facts written in YAML or any language

files/

Files made available through the module's Puppet file server mountpoint

templates/

ERB and EPP templates

examples/

Test manifests used for system testing, experimentation, and demonstration

spec/

RSpec unit tests and Beaker acceptance tests

docs/

Module documentation generated from the code

lib/

Native Ruby extensions for Puppet

README.md

Documentation describing your module and its requirements

REFERENCE.md

Documentation of the module's classes, types, functions, and so on

Gemfile

Gem dependencies for testing and extending Puppet

Rakefile

Rake tasks for validating and testing the module

metadata.json

Metadata about the module for Puppet tools to use

hiera.yaml

The module data hierarchy for puppet lookup

.fixtures.yml

Dependencies for puppet apply, RSpec, and Beaker testing

There is also additional metadata for CI, editing, Git, and other tools.

Every one of these components is (technically) optional. Although each one will be automatically generated using pdk new module, empty directories should be removed to simplify the module layout and clarify the design and intent of your module. For example, if the module contains only native Puppet extensions, there’s no need to have a manifests/ directory. Likewise, if your module contains no files or templates, you should remove those directories to focus attention on what the module does provide.

You should always update the autogenerated README.md file and the metadata.json, even if only to simplify them. These are data for the reader and tools, and bad data creates big problems.

The Module’s Main Class

The Puppet manifest in manifests/init.pp contains the main class of your module. The main class is optional, but it is rare for a module not to have one. In most modules, init.pp is the best place to start reading: it shows what input the module accepts and the basic layout of the module, and it is where input validation and transformation of input data are performed.

As a general rule, we recommend grouping resources in classes, and performing all class inclusion and relationship handling in the main class. This approach makes the module easier to understand, and centralizes the flow and features of the module.

Although the main class manifest is often the main entry point into your module, this is a convention rather than a rule. It’s perfectly okay to have defined types or classes in your module that might be declared or referenced from outside your module.

An example main class

Let’s look at the main class for the Apache module in Example 4-1. This example is simplified for instructional use; a real Apache module would likely accept many more input parameters.

Example 4-1. Example main class for a simple Apache module
class apache (
  String[1] $ensure        = 'installed',
  String[1] $servername    = $facts['fqdn'],
  Integer[0,65535] $listen = 80,       # (1)
  String[1] $user,
  Stdlib::Absolutepath $documentroot,  # defaults to
  String[1] $package_name,             # os-dependent data (2)
  String[1] $service_name,             # sourced from Hiera
) {

  # Validation phase
  unless $package_name =~ /^[[:print:]]+$/ {
    fail("invalid package name")
  }
  unless $service_name =~ /^[[:print:]]+$/ {
    fail("invalid service name")
  }

  # servername must match apache's specifications (3)
  unless $servername =~ /^([a-z]+:\/\/)?[\w\-\.]+(:[\d]+)?$/ {
    fail("servername invalid
         http://httpd.apache.org/docs/2.4/mod/core.html#servername")
  }
  # valid POSIX user name
  unless $user =~ /^[_.A-Za-z0-9][-\@_.A-Za-z0-9]*\$?$/ {
    fail("username must be POSIX compliant")
  }

  # Declaration phase
  class { 'apache::install': # (4)
    package => $package_name,
    ensure  => $ensure,
  }

  class { 'apache::config':
    servername   => $servername,
    documentroot => $documentroot,
    listen       => $listen,
    user         => $user,
  }

  class { 'apache::service':
    service => $service_name,
  }

  # Relationship definitions
  contain apache::install # (5)
  contain apache::config
  contain apache::service

  Class['apache::install'] -> Class['apache::config'] ~> Class['apache::service'] # (6)
}

(1) Type checking input avoids the need for validation tests in the code block.

(2) This assumes that the module has a hiera.yaml file specifying the module’s data hierarchy.

(3) Some input validation, such as the check for printable characters, is self-documenting. This regular expression is not, so it’s a good idea to add a comment regarding its purpose.

(4) This demonstrates resource-style class declaration for child classes. Other approaches are discussed in “Modularizing Classes”.

(5) Containment is critical so that other modules are not required to create relationships with child classes. For more on this, see “Class Containment”.

(6) Class relationships are dramatically simpler than a web of resource relationships. We use chaining arrows because they are easier to read at a glance.

This simplified example demonstrates a complete layout usable for larger, more complex modules:

  1. The class begins by accepting a set of input parameters. Defaults that are universal for all platforms are declared here, and platform-specific defaults are pulled from module data. Each value is explicitly checked for type and minimum size.

  2. The validation phase checks for valid values by using regexes. In older versions of Puppet there would have been dozens of validate_ function calls for type checking, but this is greatly reduced with inline input checking.

  3. Data transformation would happen after input validation. None was needed in this example, but if a value was determined by or from other values, it could be handled here.

  4. The validated input is passed to child classes in the declaration phase.

  5. The declared classes are contained using the contain() function, and ordered using chaining arrows in the relationship phase.
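Example 4-1 declares child classes that are not shown. As a minimal sketch, and assuming only the parameter names used in those declarations, two of the child classes might look like the following. In a real module each class would live in its own file (manifests/install.pp and manifests/service.pp); the bodies are illustrative rather than taken from any published Apache module.

```puppet
# Hypothetical child class matching the apache::install declaration.
class apache::install (
  String[1] $package,
  String[1] $ensure = 'installed',
) {
  package { $package:
    ensure => $ensure,
  }
}

# Hypothetical child class matching the apache::service declaration.
class apache::service (
  String[1] $service,
) {
  service { $service:
    ensure => running,
    enable => true,
  }
}
```

Keeping the child classes this small is what lets the main class remain the single place where input, ordering, and containment are decided.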

Module Parameters

Modules receive external input via class parameters.1 Class parameters are a well-defined interface that permits you to supply values when declaring the class. This data can be passed directly to your module’s resources or it can alter the behavior of your module using conditional logic. The following snippet shows an example of a class with a single parameter:

class ntp (
  $servers = 'pool.ntp.org',
) {
  # Resources go here
}

If your module has special-case input needs, such as looking up data with a specific merge behavior (formerly done with functions such as hiera_hash()), the best approach is still to define an input parameter and to set its default value so that the lookup you want is performed automatically when no value is explicitly supplied, as demonstrated here:

class ntp (
  $servers = lookup('ntp::servers', Array, 'unique', ['pool.ntp.org']), # (1)
) {
  # Resources go here
}

(1) Note that we still supply an optional default value in our Hiera lookup to be used in case Hiera is not available.

The approach demonstrated in the previous code example has several major advantages over alternatives:

  • It automatically looks up values in Hiera.

  • It allows you to declare the class without an explicit value.

  • It facilitates debugging by embedding the result of the lookup in your catalog.

Parameter defaults

It’s a good idea to supply default values for all parameters, even if those defaults aren’t necessarily going to be useful in a real-world environment. In many cases, a default of undef is a perfectly valid and simple value.

There are two main reasons for this recommendation:

  • It simplifies experimentation with your module.

  • It avoids creating Hiera dependencies during unit testing.
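As a sketch of this recommendation, the class below gives every parameter a default, using undef for the optional one. The class name and the driftfile parameter are hypothetical, and Stdlib::Absolutepath assumes that the puppetlabs/stdlib module is installed.

```puppet
# Hypothetical class: every parameter has a default, so a bare
# "include ntp_example" compiles without any Hiera data.
class ntp_example (
  Array[String[1]]               $servers   = ['pool.ntp.org'],
  Optional[Stdlib::Absolutepath] $driftfile = undef,
) {
  # Only manage the optional piece when a value was supplied.
  if $driftfile {
    file { $driftfile:
      ensure => file,
    }
  }
}
```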

There are many situations for which you might want to test or experiment with one or many modules. This is common when deciding whether a module from the Puppet Forge is suitable for a site.

In these cases, it’s ideal to be able to test the module by installing it into a temporary module path and then testing the module with the apply command, as shown in the following:

$ mkdir -p example
$ puppet module install --modulepath=example puppetlabs/ntp
Notice: Preparing to install into /home/vagrant/example ...
Notice: Downloading from https://forgeapi.puppetlabs.com ...
Notice: Installing -- do not interrupt ...
/home/vagrant/example
|--- puppetlabs-ntp (v3.3.0)
   |--- puppetlabs-stdlib (v4.24.0)
$ sudo puppet apply ./example/ntp/tests/init.pp --modulepath='./example' --noop
Notice: Compiled catalog for localhost in environment production in 0.74 seconds
Notice: /Stage[main]/Ntp::Config/File[/etc/ntp.conf]/content: current_value
{md5}7fda24f62b1c7ae951db0f746dc6e0cc, should be
{md5}c9d83653966c1e9b8dfbca77b97ff356 (noop)
Notice: Class[Ntp::Config]: Would have triggered 'refresh' from 1 events
Notice: Class[Ntp::Service]: Would have triggered 'refresh' from 1 events
Notice: /Stage[main]/Ntp::Service/Service[ntp]/ensure: current_value stopped,
should be running (noop)
Notice: Class[Ntp::Service]: Would have triggered 'refresh' from 1 events
Notice: Stage[main]: Would have triggered 'refresh' from 2 events
Notice: Finished catalog run in 0.67 seconds

The puppetlabs/ntp module includes a test manifest and supplies reasonable defaults for all of its parameters. There is no need to read the documentation to determine what inputs are mandatory, and there’s no trial and error involved in a simple test application of the module. As a result, it’s very easy to test, evaluate, and deploy this module.

Another case for which sane defaults are useful is during testing. If your module has mandatory parameters and may be invoked via an include() call from another module, you’ve implicitly created a dependency on Hiera (or another data binding). This can complicate the setup for your test cases because your Hiera data will not be available in this context.

Parameter default complexity

Puppet allows fairly complex generation of parameter defaults. The default can be a function call, nested function, selector, or a complex data structure.

As a general recommendation, if your default value is more than one line long, you should probably move it into the module’s Hiera data, especially if it is not dynamic.

Although you can embed selectors and other complex logic in your parameter defaults, doing so makes the input section difficult to comprehend. The data in modules pattern moves complex logic outside the parameter default block without losing the benefits of parameterization. For more information, see “Hiera Data in Modules” later in this chapter.
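To make the contrast concrete, here is a sketch of the same parameter written both ways. Both class names are hypothetical, and the data file mentioned in the comment is an assumption about how the module's Hiera data might be laid out.

```puppet
# Harder to scan: platform logic embedded in the parameter default.
class apache_inline (
  String[1] $package_name = $facts['os']['family'] ? {
    'RedHat' => 'httpd',
    default  => 'apache2',
  },
) { }

# Easier to scan: no default here. The value is expected to come from
# module data, for example a hypothetical data/RedHat.yaml containing
#   apache_data::package_name: 'httpd'
class apache_data (
  String[1] $package_name,
) { }
```

The second form keeps the parameter list readable as an interface, and adding a platform becomes a data change rather than a code change.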

Parameter default limitations

Until recent versions of Puppet, a parameter default could not contain the value of another parameter. To work around this, people often used confusing intermediary variables to test the value of the original variable. These were often suffixed with _real, like so:

class chroot (
  $root_dir = '/chroot',
  $prefix   = undef,
  $bindir   = undef,
) {
  # pick() returns the first defined value
  $prefix_real = pick($prefix, "${root_dir}/usr")
  $bindir_real = pick($bindir, "${prefix_real}/bin")
}

The module would then use the _real variable names in place of the original variable names.

This limitation has been removed in Puppet 5, and parameters can now use the value of any parameter declared before them. So you can now simplify all of the mess in the preceding example to this much more readable form:

class chroot (
  $root_dir = '/chroot',
  $prefix   = "${root_dir}/usr",
  $bindir   = "${prefix}/bin",
) { }
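A short usage sketch shows why this form matters: overriding a single parameter carries through to the parameters derived from it. The path /srv/jail is an arbitrary value chosen for illustration.

```puppet
# Declare the simplified chroot class with only root_dir overridden.
class { 'chroot':
  root_dir => '/srv/jail',
}
# With the defaults shown above, prefix resolves to '/srv/jail/usr'
# and bindir resolves to '/srv/jail/usr/bin'.
```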

Parameter naming conventions

A good parameter name has a few properties:

  • It should be unambiguous.

  • It should be consistent with your style guide.

  • Its purpose should be fairly obvious.

  • It should be memorable.

A good rule of thumb is to name the parameter after whatever will consume its value. For example, if your parameter will supply a DocumentRoot value for an Apache configuration file, the most intuitive name for your parameter will be documentroot. If your parameter provides a source for a package resource, the most intuitive name will be source or source prefixed with the package name when necessary to disambiguate multiple sources.

For example, the following parameter names would be fairly obvious to someone familiar with Apache:

  • serverroot

  • documentroot

  • listen

  • servername

  • errorlog

If you’re writing a module for a relocatable service, follow the GNU coding standards and use the directory naming conventions:

  • prefix

  • bindir

  • sysconfdir

  • localstatedir

In all of these cases, we attempt to conform to pre-existing naming conventions, with minimal transformation to meet Puppet’s variable naming requirements.

When a parameter is to be passed to a Puppet resource, we might simply reuse the resource’s parameter name, possibly prefixed with the resource name or resource title to remove ambiguity:

  • ensure

  • source

  • package_httpd_source

Input Validation

Input validation is the act of testing input supplied to your module to ensure that it conforms to a specification and doesn’t contain any nefarious data.

The design of input validation is important to consider and can be very environment specific. It is important to assess both the goals of validating input and the risks associated with invalid input.

Data supplied to your module will usually come from a trusted source such as from Hiera data from an internal data repository. In these cases, the goal of input validation is not so much about security as it is to generate useful failure messages when the input is malformed. When the data comes from an external source, the goal is to protect against not just invalid input, but also dangerous input.

Input validation should be designed to provide useful troubleshooting information, and you should avoid overly restrictive validation. Specifically, when you’re designing input validation, be careful that your tests don’t reject inputs that are otherwise perfectly valid, such as paths that are not fully qualified when your application will happily accept them. For example, unqualified paths are valid in most parts of an Apache configuration, and are interpreted relative to the ServerRoot directive. With Puppet, you can use variable interpolation in pathnames, allowing many paths to be relative to $confdir and other base directories.

The most common source of external untrusted data is facts generated by a managed node in a master/agent environment. Facts are simply data provided about the node, which can be used to customize the catalog. However, there is a risk of privilege escalation if a nonprivileged user can alter the fact values used to build the catalog, which is then applied by the privileged Puppet agent process. You can protect against privilege escalation by preventing unprivileged users from modifying the environment or configuration used to determine fact values.

When using exported resources or other forms of shared node data, facts from one compromised node can be used to attack other nodes on the network. In these cases, input should absolutely be validated to help protect nodes from one another. The best way to protect against privilege escalation in an exported resources environment is to limit sharing to strictly validated values that can be tested for sanity before use. For example, ensure that only carefully validated hostnames are used in the load-balancer configuration built from exported resources.
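One way to apply this is to check a fact against a strict pattern before it is shared. The following is a minimal sketch: the class name is hypothetical, and the pattern is deliberately simple rather than a complete hostname grammar. It uses the built-in assert_type() function, which returns the value when it matches the given type and fails catalog compilation otherwise.

```puppet
# Hypothetical profile fragment: validate a fact before sharing it.
class profile::lb_member {
  # Accept only letters, digits, dots, and hyphens, starting with a
  # letter or digit. Anything else aborts compilation for this node.
  $member_host = assert_type(
    Pattern[/\A[A-Za-z0-9][A-Za-z0-9.\-]*\z/],
    $facts['networking']['fqdn']
  )

  # $member_host can now be passed to whatever exported resource the
  # load balancer collects (not shown; it depends on the modules in use).
}
```

Validating on the exporting side means that a malformed value never reaches the shared data that other nodes consume.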

We’ll discuss available input validation functions and techniques in “Input Validation Functions”.

Data in the Module

Although all business and service-specific data will come from other data sources, it can be helpful to provide data within a module. For example, the names of the packages and the locations of the configuration files on different operating systems are well-known information that users of the module shouldn’t be forced to add to their data. In a moment, we will examine two popular patterns for providing data in modules.

Regardless of the source of default data for a component module, the values should be component-specific defaults, not site-specific defaults or business logic. Module data default values should be true no matter where the module is deployed. You should use parameters from your roles and profiles to override these defaults when necessary to implement site-specific needs.

The params.pp Pattern

The params.pp pattern was designed to simplify the selection of platform-specific parameter defaults by moving default values into a dedicated manifest. For those familiar with Chef, the params.pp pattern was somewhat like cookbook attributes.

Warning

This pattern was necessary for now-obsolete Puppet versions, but it has been replaced by the data in modules pattern available in Puppet 4 and higher.

With this pattern, the logic to select the appropriate default values was moved out of the main class to, as you might guess, the params.pp manifest. Although Hiera could provide platform-specific values to a module, until Hiera v4 there wasn’t any way for module authors to ship Hiera data inside their module. Adding component-specific data to an organization’s global data was problematic for somewhat obvious reasons: every module would need to ship instructions that began, “Please copy these values...”

The params.pp pattern violates several best practices. The pattern relies on the otherwise discouraged inherits feature of Puppet classes in order to enforce ordering between it and your module’s main class. It also relies on fully qualified variable references to data outside the class scope. It’s difficult to read and difficult to track variables in larger modules with many child classes, as demonstrated in Example 4-2.

Example 4-2. The main class declaration for a module using the params.pp pattern
class apache (
  $ensure       = 'installed',
  $config_file  = $apache::params::config_file,
  $errorlog     = $apache::params::errorlog,
  $package      = $apache::params::package,
) inherits apache::params { # 1
  # Resource declarations for the module go here.
}
1
inherits is mandatory. Otherwise, Puppet will throw an error complaining that we’ve referenced variables from a class that has not been evaluated. inherits ensures that the params class is evaluated before the apache class, as shown in Example 4-3.

Example 4-3. The params class declaration for a module using the params.pp pattern
class apache::params {
  case $facts['os']['family'] { # 1
    'RedHat': {
      $config_file  = '/etc/httpd/conf/httpd.conf'
      $errorlog     = '/var/log/httpd/error.log'
      $package      = 'httpd'
    }
    'Debian': {
      $config_file  = '/etc/apache2/apache2.conf'
      $errorlog     = '/var/log/apache2/error.log'
      $package      = 'apache2'
    }
    default: {
      fail("${facts['os']['family']} isn't supported by ${module_name}") # 2
    }
  }
}
1
This case statement declares platform-specific default values.
2
The default case must fail cleanly on unsupported platforms. Doing so means executing a function during data assignment, which is bad practice.

The params.pp pattern violates the tenets of separating code from data in both directions: it puts data in code and, worse yet, code in data.

Hiera Data in Modules

The data in modules pattern (see Example 4-4) allows a module author to embed a Hiera hierarchy within the module, allowing component-specific default parameters to be stored as YAML, JSON, or HOCON data in the module. This enables the author to keep functions in code, and data in text files.

Warning

Hiera data in modules far exceeds the features of the params.pp pattern, which is now obsolete.

Example 4-4. Class declaration for a module with default values in Hiera
class apache (
  $ensure,
  $supported,
  $package,
  $config_file, # 1
  $errorlog,
) {

  unless $supported {
    fail("${facts['os']['family']} isn't supported by ${module_name}") # 2
  }
  ...
}
1
No default data in the code. This is safe because the defaults ship with the module as data.
2
Code decides what to do based on the data.

As you can see, the manifest now contains only actionable code. There are no data values embedded in the code.
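Because the values now arrive as data, it can help to let the class signature check them. The following is a sketch only: it repeats the interface of Example 4-4 with data types added, and it assumes the puppetlabs/stdlib module is available to supply Stdlib::Absolutepath.

```puppet
# Sketch: the interface from Example 4-4 with data types added.
# Stdlib::Absolutepath comes from puppetlabs/stdlib (assumed dependency).
class apache (
  String[1]            $ensure,
  Boolean              $supported,
  String[1]            $package,
  Stdlib::Absolutepath $config_file,
  Stdlib::Absolutepath $errorlog,
) {
  # A data file that supplies a value of the wrong type fails catalog
  # compilation before this body is evaluated.
  unless $supported {
    fail("${facts['os']['family']} isn't supported by ${module_name}")
  }
}
```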

So, where do these values come from? A component module should have a small hierarchy of data files containing platform-specific and default values, defined in a hiera.yaml file in the module’s root directory, as illustrated in the following:

---
version: 5
defaults:
  datadir: data         # This path is relative to the module
  data_hash: yaml_data  # Use the built-in YAML backend

# Levels are searched in order, most specific first
hierarchy:
  - name: "OS-specific config"
    path: "%{facts.os.family}.yaml"

  - name: "Defaults"
    path: "default.yaml"

The data files shown in Examples 4-5 through 4-7 contain the same information that the params class held, now as plain-text data.

Example 4-5. data/RedHat.yaml
---
apache::package: 'httpd'
apache::config_file: '/etc/httpd/conf/httpd.conf'
apache::errorlog: '/var/log/httpd/error.log'
apache::supported: true

Example 4-6. data/Debian.yaml
---
apache::package: 'apache2'
apache::config_file: '/etc/apache2/apache2.conf'
apache::errorlog: '/var/log/apache2/error.log'
apache::supported: true

Example 4-7. data/default.yaml
---
apache::ensure: 'installed'
apache::supported: false
apache::package: 'unknown'
apache::config_file: '/etc/apache/apache.conf'
apache::errorlog: '/var/log/apache/error.log'

This code pattern provides a clear separation of code and data and is significantly easier to read and maintain than the inheritance pattern of params.pp.

Modularizing Classes

You can add classes beyond the main class, each in its own manifest file in the manifests/ directory. Beyond keeping modules focused, as discussed in “Keep the Module Focused”, it is important to keep each individual class focused, too. This section covers managing the relationships between cooperating and dependent classes.

Dependencies

Module dependencies allow you to make use of classes, defined types, functions, and resource types and providers from other modules without reinventing the wheel within your own module.

It’s very common for a module to have dependencies on other modules. For example, puppetlabs/stdlib is a nearly universal dependency due to the number of useful functions it provides.

There are only three things necessary for safe, effective use of dependencies:

  • List the dependency under dependencies in the module metadata and in the test fixtures.

  • Use class relationships to indicate any ordering or notification requirements (described in the next section).

  • Identify the relationship in the module documentation (covered in “Creating Useful Documentation”).

Class Relationships

Classes organize data and establish relationships between resources. If we were to write a puppet_server module, for example, we could establish a notify relationship between the puppet.conf and auth.conf file resources and the puppet service resource, or we could place these resources into separate classes and declare relationships between the classes.

Building relationships between classes becomes a huge maintainability win. If we add a new resource to our module, we simply place it in the appropriate class and Puppet automatically establishes relationships between it and the rest of our resources. Compare this with the resource-based relationships demonstrated in Example 4-8, in which every resource must declare its own relationship.

Example 4-8. Resource-based relationships
file { '/etc/puppetlabs/puppetserver/conf.d/webserver.conf':
  notify => Service['puppetserver'],
}

file { '/etc/puppetlabs/puppetserver/conf.d/auth.conf':
  notify => Service['puppetserver'],
}

service { 'puppetserver':
  ensure => 'running',
  enable => true,
}

Although this is suitable for simple cases, the number of relationships can grow rapidly as related files and services are added to the module, because every new file needs its own relationship to every service it affects. In this case, it’s much easier to break each set of tightly related resources into a separate class and then set up the relationships between the classes, as shown in the following:

include 'puppetserver::config'
include 'puppetserver::service'

Class['puppetserver::config'] ~> Class['puppetserver::service']

Class relationships are also a huge win if we need to be able to uninstall an application. Removing an application from a node requires reversing the resource relationships so that things can be removed in the opposite order of installation. It’s much simpler to reverse a small set of class-based relationships than a large web of direct resource relationships.
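One way this reversal might look is sketched below. The ensure parameter and the puppetserver::install class are assumptions added for illustration, and the contain function used here is covered in the next section.

```puppet
# Sketch: reverse the class ordering when the application is removed.
class puppetserver (
  Enum['present', 'absent'] $ensure = 'present',
) {
  contain 'puppetserver::install'  # hypothetical child class
  contain 'puppetserver::config'
  contain 'puppetserver::service'

  if $ensure == 'present' {
    # Install the package, write the configuration, then start the service.
    Class['puppetserver::install']
    -> Class['puppetserver::config']
    ~> Class['puppetserver::service']
  } else {
    # Stop the service, remove the configuration, then remove the package.
    Class['puppetserver::service']
    -> Class['puppetserver::config']
    -> Class['puppetserver::install']
  }
}
```

Only three class-level relationships need to be reversed here, regardless of how many resources each child class manages.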

You can use these example relationships as a guide for how to break your module into classes. If you find a lot of relationships between two sets of resources, consider using classes and class-based relationships instead. It is not incorrect to create a module with three child classes for three resources. You can then add new things to the module without breaking existing relationships.

Class Containment

Containment causes relationships with a parent class to flow down to the contained classes. Let’s take a moment to consider the code in Example 4-9.

Example 4-9. Class-based relationships
class java {
  package { 'openjdk':
    ensure => 'installed',
  }
}

class tomcat {
  notify { 'tomcat': }

  class { 'tomcat::package': } ~> class { 'tomcat::service': }
}

class tomcat::package {
  package { 'tomcat':
    ensure => 'installed',
  }
}

class tomcat::service {
  service { 'tomcat':
    ensure => 'running',
  }
}

include 'java'
include 'tomcat'

Class['java'] -> Class['tomcat']

In this example, the class java has a before relationship with the class tomcat and the resource Notify[tomcat]. tomcat::package and tomcat::service counter-intuitively have no relationship with their parent class, tomcat, and thus no relationship with the java class.

Although the child classes are defined in the Tomcat module, the relationships with tomcat apply only to resources declared directly in the tomcat class and not to the resources in any of its child classes. In Example 4-9, the following relationships exist:

Class['java'] -> Package['openjdk'] -> Class['tomcat']

Unfortunately, the Tomcat class contains only Notify['tomcat'], whereas the following relationships remain freestanding:

  • Class['tomcat::package'] contains Package['tomcat']

  • Class['tomcat::service'] contains Service['tomcat']

To solve this problem, we need to either anchor or contain the classes.

Containment

The contain function includes a class and creates a relationship between the included class and its parent class. Example 4-10 presents our previous module but this time using containment.

Example 4-10. Class-based relationships using containment
class java {
  package { 'openjdk':
    ensure => 'installed',
  }
}

class tomcat {
  notify { 'tomcat': }

  contain 'tomcat::package'
  contain 'tomcat::service'

  Class['tomcat::package'] ~> Class['tomcat::service']
}

class tomcat::package {
  package { 'tomcat':
    ensure => 'installed',
  }
}

class tomcat::service {
  service { 'tomcat':
    ensure => 'running',
  }
}

include java
include tomcat

Class['java'] -> Class['tomcat']

By using contain, Example 4-10 now has the following relationships:

Class['java'] -> Package['openjdk'] -> Class['tomcat']
-> Class['tomcat::package'] -> Package['tomcat']
~> Class['tomcat::service'] ~> Service['tomcat']

Notice that Class['tomcat'] now has a relationship with its child Class['tomcat::package'], whereas this relationship did not exist in Example 4-9.

Although the contain function automatically includes the class being contained, it can be combined with resource-style class declarations, as is illustrated in Example 4-11. Doing so is parse-order dependent, so you must declare the classes before you contain them (remember: you can safely include a class previously declared with parameters, but not vice versa). Regardless, this approach is currently the best-practice solution to handling containment and is the officially documented approach to building good modules, as the preceding example shows.

Example 4-11. Contain with resource-style class declarations and chaining
class { 'tomcat::package':
  ensure => $ensure,
  source => $source,
}
class { 'tomcat::service':
  ensure => $ensure,
  enable => $enable,
}

contain 'tomcat::package'
contain 'tomcat::service'

Class['tomcat::package'] ~> Class['tomcat::service']

Anchors

The anchor pattern is the original solution to the class containment problem. Anchors are a resource type provided by the puppetlabs/stdlib module. Anchors themselves perform no actions, but they do offer an anchor with which to establish class relationships inside a module. They also pass along notify signals so that notification works between modules as expected.

Warning

The contain function was added to Puppet in version 3.4. If you are writing modules for modern releases of Puppet, we recommend that you use it in your classes rather than the anchor pattern.

Here is the tomcat class with anchors:

class tomcat {
  anchor { 'tomcat::begin': }
  -> class { 'tomcat::package': }
  ~> class { 'tomcat::service': }
  -> anchor { 'tomcat::end': }
}

Although this seems to be simpler than our containment example, it carries extra complexity to ensure that our resource relationships behave the way we expect. The tomcat class shown above contains a subtle bug: Anchor['tomcat::begin'] does not have a notify relationship with Class['tomcat::service']. As a result, notifications sent to the Tomcat module would not cause the Tomcat service to restart. This might be an issue if, for example, you updated Java to patch a vulnerability using Puppet, and the Tomcat service continued running under the old release of Java because its service never received a notification to restart.

Beyond that, the anchor pattern creates some ugly resource relationship graphs that can be painful to read when attempting to analyze Puppet’s graph output.

Intentionally uncontained classes

There are some cases for which you might want to intentionally avoid containing resources. Consider the following case, in which we need to deploy an application after the module it depends on has been installed and configured, but before that module’s service has been started:

include 'tomcat'
include 'my_tomcat_app'

Class['tomcat'] -> Class['my_tomcat_app'] ~> Class['tomcat::service']

This example will work only if Class['tomcat::service'] is not contained inside of Class['tomcat']. Otherwise, a dependency loop would be created, like so:

Class['tomcat::service'] -> Class['my_tomcat_app'] -> Class['tomcat::service']

Internally, the rest of the Tomcat module might have a notify relationship with Class['tomcat::service'] and a relationship loop will not be created. We could create such a module using this basic layout:

class tomcat {
  contain 'tomcat::install'
  contain 'tomcat::config'
  include 'tomcat::service'

  Class['tomcat::install'] -> Class['tomcat::config'] ~> Class['tomcat::service']
}

This works because resource relationships do not require that one resource be evaluated immediately after another when a relationship is defined. The relationship simply sets general ordering requirements and allows for other resources to be inserted into the order. Puppet maintains the relationship ordering internally using a dependency graph.

When using an approach such as this, remember that you lose the ability for the un-contained resources to generate notifications for the parent class; anything that wants to subscribe to Class['tomcat::service'] must do so explicitly now.
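As a small illustration of what subscribing explicitly means, a resource outside the module that needs the service restarted would name the service class directly. The file path and content here are hypothetical.

```puppet
# Hypothetical resource declared outside the tomcat module.
file { '/etc/tomcat/extra.properties':
  ensure  => 'file',
  content => "example.setting=true\n",
  # The service class is not contained in Class['tomcat'], so the
  # notification is addressed to the service class itself.
  notify  => Class['tomcat::service'],
}
```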

Because such module designs do not conform to the typical design pattern for a module, it’s critical to test and document this special behavior and to treat the un-contained class as an interface into the module, not to be changed without planning for the break in compatibility.

Intermodule relationships like this are typically domain logic handled in your profiles rather than hardcoded into your modules.

This approach is useful when complex relationships are necessary between arbitrary modules. In many cases, it’s much better to use a defined type as an interface, as discussed in “Providing Clean Service Interfaces with Defined Types”.

Interfacing with Classes

There are three popular ways to pass data from the module’s main class to its other classes:

  • Use a resource-style class declaration with parameters.

  • Declare classes using contain or include, with fully qualified variable references inside the class.

  • Use class inheritance.

Let’s consider what makes each of these solutions best for a given situation.

Passing data via parameterized class declarations

This is the preferred approach for passing data from a main class to another class. This approach, shown in Example 4-12, makes variable handling extremely explicit and causes immediate failures in the event of a typo or hasty change.

Example 4-12. Passing data using parameterized class and resource-style class declaration
class apache (
  $ensure  = 'installed',
  $package = $apache::params::package,
) inherits apache::params {
  class { 'apache::install':
    ensure  => $ensure,
    package => $package,
  }
  contain 'apache::install'
}

class apache::install (
  $ensure,
  $package,
) {
  package { $package:
    ensure => $ensure,
  }
}

With this approach, if either the package or ensure parameter is missing or if there is a typo, the declaration of apache::install will throw an error. If the apache::install class does not accept a parameter that’s passed to it from the main class, an error is also thrown.

This pattern is best-practice for several reasons:

  • This is the most readable pattern. The main class knows which variables are used by child classes.

  • Explicit parameter passing allows the main or child class to be refactored without breaking the other.

  • This pattern can allow direct access to child classes from the outside, if desired.

  • It can provide consistent variable names in the public interface, while allowing you to refactor at will internally.

The only downside of this pattern is the combination of a class declaration and a following contain function call.

Passing data via fully qualified variable references

With this approach, the apache::install class receives input directly from internals of the main class, as shown in Example 4-13.

Example 4-13. Passing data using fully qualified variables
class apache (
  $ensure  = 'installed',
  $package = $apache::params::package,
) inherits apache::params {
  contain 'apache::install'
}

class apache::install (
  $ensure  = $apache::ensure,
  $package = $apache::package,
) {
  include stdlib
  assert_private("I'm an internal child class, don't offer me candy!")

  package { $package:
    ensure => $ensure,
  }
}

The benefit of this approach is that inclusion of the apache::install class and its containment are handled with a simple contain statement. The disadvantage of this approach is that the use of fully qualified variables means that the child class must be accessed from the parent class (enforced in the example with the assert_private function). In addition, the consumption of values by the child class is not visible to someone refactoring the main class. This somewhat undermines the use of the main class as a view of the flow of the module.

Passing data via class inheritance

You can also pass variables using class inheritance. With this approach, shown in Example 4-14, variables local to the main class are available in the local scope of any class that inherits from the main class.

Example 4-14. Passing data using class inheritance
class apache (
  $ensure  = 'installed',
  $package = $apache::params::package,
) inherits apache::params {
  contain 'apache::install'
}

class apache::install inherits apache {
  package { $package:
    ensure => $ensure,
  }
}

This approach carries the least initial development overhead, but it is the least readable and most fragile implementation. Variables are declared in the main class, and are automatically made available to any of the other classes as if they were declared locally. This effectively reimplements variable inheritance from the Puppet 2.x days within the child classes.

The major disadvantages of this approach are poor readability and fragility:

  • You can’t see the child class’ usage from the main class.

  • Any refactoring of the variables in the main class will likely break the child classes.

  • It’s easy to end up with complex inheritance trees that make it difficult to identify where a variable came from.

Even though this pattern is widely used, we recommend avoiding it if other choices would work, due to its impact on readability and the fragility of the module.

Reusing Defined Types

A defined type is a Puppet manifest that can be called repeatedly with different data to create unique resources of the type, similar to built-in or custom resource types. For example, you can create as many user resources as you like, as long as they have unique titles. Defined types operate in the same manner.

Defined types are commonly used in one of four ways:

  • To create a list of resources from a list of input values

  • To create interfaces in a module

  • As a service to the outside world

  • As the core purpose of a module
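The first of these uses can be sketched as follows. The myapp::log_dir defined type, the directory names, and the myapp owner are all hypothetical.

```puppet
# Hypothetical defined type: one log directory per instance.
define myapp::log_dir (
  String[1] $owner = 'root',
) {
  file { "/var/log/myapp/${title}":
    ensure => 'directory',
    owner  => $owner,
  }
}

# The parent directory is managed once, outside the defined type.
file { '/var/log/myapp':
  ensure => 'directory',
}

$log_names = ['access', 'error', 'audit']

# An array title declares one instance per element, each with the same
# parameter values.
myapp::log_dir { $log_names:
  owner => 'myapp',
}
```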

Providing Clean Service Interfaces with Defined Types

There are situations that will require complex relationships between modules. The principles of interface-driven design (discussed in “Interface-Driven Design”) advise against accessing internal data structures of other modules directly.

Defined types give a module the simplest interface for use in other modules. They read in the code exactly the same as any other resource type and automatically order the dependencies without explicit relationship markers. This makes them a preferable approach for handling intermodule dependencies.

Defined types provide a clean parameterized interface that you can define and test, allowing the internal structure of the module to change without breaking dependent modules.

In “Interface-Driven Design”, we provide the following example of the use of a defined type as an interface into an Apache module:

class myapp {
  include 'apache'

  apache::vhost { 'myapp':
    documentroot => '/var/www/myapp',
  }
}

In this example, the class myapp is interfacing with the Apache module using the apache::vhost defined type. It doesn’t really matter what the internal structure of that defined type is; it manages the internals of that itself.

For example, that module might use resource ordering to ensure the virtual hosts requested are built in the right order, like so:

  Class['apache::config'] -> Apache::Vhost[$title] ~> Class['apache::service']

These relationships and references to the internal structure of the Apache module are contained within the defined type. The Apache module could be completely rewritten, and so long as the defined type exists and continues to provide the servername and documentroot parameters, the myapp class will continue to work.
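To make this concrete, here is one way such a defined type might be laid out. This is a hypothetical sketch and not the implementation of any published Apache module; the configuration path and file content are illustrative, and the apache::config and apache::service classes are assumed to be declared by the apache class.

```puppet
# Hypothetical sketch of a defined type used as a module interface.
define apache::vhost (
  Pattern[/\A\//] $documentroot,          # must be an absolute path
  String[1]       $servername = $title,
) {
  # Parameterless include, so callers need not declare apache first.
  include 'apache'

  # Illustrative location and content for the virtual host definition.
  file { "/etc/httpd/conf.d/${servername}.conf":
    ensure  => 'file',
    content => "<VirtualHost *:80>\n  ServerName ${servername}\n  DocumentRoot ${documentroot}\n</VirtualHost>\n",
  }

  # References to the module's internal classes stay inside the
  # defined type, hidden from callers such as the myapp class.
  Class['apache::config'] -> Apache::Vhost[$title] ~> Class['apache::service']
}
```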

By using a defined type as an interface into the module and having regression tests for the interface, we can safely rewrite or refactor our module without even knowing how others are using it. This flexibility is the huge benefit of interface-driven design. Without it, changes to one module often break code elsewhere.

In many cases your module might be able to provide a useful feature to the outside world via a defined type. The Apache module we’ve been looking at throughout this section is a great example of this: it provides a defined type that handles the OS-specific implementation details of configuring a virtual host such that our module will apply properly on any OS the Apache module supports.

This defined type doesn’t prevent other modules from implementing their own vhost templates, but it does offer a simple way to define a virtual host that’s integrated nicely into our module, and handles most common use cases.

The interface provided by the defined type is much better than creating virtual hosts in file resources because the defined type can establish relationships into the Apache module and access data from inside that module without violating the principles of SoC and interface-driven design.

Simplify Complex Operations with a Defined Type

There are often cases for which you can use a defined type to provide a clean and simple interface for complex operations.

Example 4-15 presents a defined type for managing network service names. It uses Augeas to manage /etc/services and a defined type to provide a clean interface around the Augeas resource.

Example 4-15. An Augeas resource wrapped in a defined type
define network::service_name (
  $port,
  $protocol      = 'tcp',
  $service_name  = $title,
  $comment       = $title,
) {
  $changes = [
    "set service-name[last()+1] ${service_name}",
    "set service-name[last()]/port ${port}",
    "set service-name[last()]/protocol ${protocol}",
    "set service-name[last()]/#comment ${comment}",
  ]

  $match = "service-name[port = '${port}'][protocol = '${protocol}']"
  $onlyif = "match ${match} size == 0"

  augeas { "service-${service_name}-${port}-${protocol}":
    lens    => 'Services.lns',
    incl    => '/etc/services',
    changes => $changes,
    onlyif  => $onlyif,
  }
}

The defined type in Example 4-15 can declare service names without forcing the user to learn the Augeas syntax.

This approach helps a mixed team take advantage of experience. An experienced module creator or subject matter expert can create a defined type to manage a complex task. Less senior team members use the user-friendly resource declarations like any other Puppet resource, as shown here:

network::service_name { 'example':
  port     => '12345',
  protocol => 'tcp',
}

Defined types of this nature can be so useful that a module might contain nothing more than defined type manifests.

Interacting with Other Resources in the Module

When a defined type is included in a component module, it often needs to interact with the rest of the module.

Creating relationships with resources in the module

Because each defined type is a new instance, unknowable in advance, the best place to put relationships between the defined type and other classes and resources inside of your module is inside the defined type.

Warning

Don’t use collectors to indiscriminately set relationships against all instances of your defined type. Use tags on the defined type to avoid realizing virtual instances, as discussed in “Dangling relationships to unrealized resources causes breakage”.

Including other classes

It’s best to keep your defined types as small and self-contained as possible. You should avoid resource-style declarations of classes or referencing out-of-scope variables.

In general, including any resource-style declaration with parameters will create parse-order-dependent catalog build problems that sometimes work until they suddenly don’t. The problem symptoms will seem to defy the code as written, which makes them very difficult to debug.

Warning

A class is declared (and parsed) only once. It is a singleton that exists in shared memory. A defined type is parsed for each and every declaration of it. If you’re not scared of the parse order madness this creates, just wait.

Imagine a defined type that includes a class. Both of them have three resources. The defined type is declared five times. How many resources exist when you’re done? 15? 30? How about neither? Let’s examine this:

  1. The defined type instance named example1 declares three resources.

  2. The defined type declares a class, which declares three resources.

  3. The defined type instance named example2 declares three resources.

  4. The defined type declares a class that has already been declared; no new resources.

  5. The defined type instance named example3 declares three resources.

  6. The defined type declares a class that has already been declared; no new resources.

  7. The defined type instance named example4 declares three resources.

  8. The defined type declares a class that has already been declared; no new resources.

  9. The defined type instance named example5 declares three resources.

  10. The defined type declares a class that has already been declared; no new resources.
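The sequence above can be reproduced with a sketch along these lines. The example::counted and example::shared names are hypothetical, and notify resources stand in for real ones.

```puppet
# Hypothetical class: evaluated once per catalog, however often it is included.
class example::shared {
  notify { ['shared-1', 'shared-2', 'shared-3']: }
}

# Hypothetical defined type: evaluated once for every declaration.
define example::counted {
  # Three resources per instance, made unique by the instance title.
  notify { ["${title}-1", "${title}-2", "${title}-3"]: }

  # Adds the class's three resources only the first time it is reached.
  include 'example::shared'
}

# Five instances: 5 x 3 resources from the defined type, plus 3 from the class.
example::counted { ['example1', 'example2', 'example3', 'example4', 'example5']: }
```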

If you add them up, there are 18 total resources. This is a simplified example, but it highlights the difference between classes and defined types. Example 4-16 makes another problem clear.

Example 4-16. What value does this class receive?
define example::defined_type (
  String $foo, # 1
) {
  class { 'example':
    value => $foo, # 2
  }
}
1
This parameter changes with every call to the defined type.
2
The class accepts this parameter only on the first declaration. Any further declaration with a value will cause a catalog failure.

This pattern will work in any catalog with a single declaration of the defined type, but fail when a second declaration is added. The only safe class declaration inside a defined type is a parameterless include of the class. Defined types can safely include only classes that need no values from the defined type. Anything more complex is likely to create a parse-order problem that you can’t read from the code.

This pattern works for one purpose only: to ensure access to resources and variables declared in a parameterless dependency class. This allows users of the defined type to declare it without having included the defined type’s dependencies first.
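A minimal sketch of that safe pattern follows. The example::snippet defined type, the conf_dir variable, and the directory path are hypothetical.

```puppet
# Hypothetical dependency class that takes no parameters.
class example {
  $conf_dir = '/etc/example.d'

  file { $conf_dir:
    ensure => 'directory',
  }
}

# Hypothetical defined type that relies on the class above.
define example::snippet (
  String $content,
) {
  # Safe in every instance: nothing is passed to the class, so repeated
  # declarations of the defined type cannot conflict with one another.
  include 'example'

  # The include above guarantees that the class variable and the
  # directory resource exist before they are referenced here.
  file { "${example::conf_dir}/${title}.conf":
    ensure  => 'file',
    content => $content,
    require => File[$example::conf_dir],
  }
}
```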

Creating Useful Documentation

Documentation is an investment in the future. When you are in the middle of writing a module, the behavior of the module, its inputs and its quirks are self-evident. When you’ve moved on, it’s easy to forget what each parameter does, what kind of input the parameters accept, and what quirks exist in your module.

Documentation of a module takes three forms:

  • Inline documentation within manifests and other code

  • Markdown documentation in the module’s root

  • description and other fields in the module metadata

The module skeleton provided by pdk new module includes useful examples of all three to kickstart your documentation.

README, REFERENCE, and Other Markdown

GitHub, GitLab, Bitbucket, and other source-code repositories provide automatic rendering of Markdown documentation, making documentation committed to your software repository one of the most user-friendly ways of documenting the module.

It’s a good idea to include a README.md file containing the following information:

  • The name of the module

  • Any dependencies your module might have, either internal or external to Puppet

  • Example invocations of the module for common use cases

  • Notes about bugs or known issues (if you are not using a public issue-tracker)

  • Contact information

Usage examples

You should show some common usage examples of your modules. This can help the user test the module, provide ideas as to how to use your module, and help the user if they become stuck deploying the module, perhaps due to syntax errors in their input.

Usage documentation is also useful if you wish to highlight the most commonly set or modified parameters for your module. It’s very common for a module to have several dozen parameters; this is the place your user will look for the most important parameters.

Dependencies

If your module has any dependencies or requires any setup, it’s a good idea to provide an example of how to satisfy those dependencies, as well. You should do the following with dependencies:

  • List them in the dependencies value of the metadata.json file.

  • Add them to the .fixtures.yml file for automatic provisioning by tests.

  • Document them in the README.md file.

  • Use them in the samples placed in examples/ directory.

The README should contain helpful instructions for how to satisfy the dependencies. If your dependencies are generic (“this module requires a web server”), it’s a good idea to mention the dependencies and show how a commonly available module can be used to satisfy the dependency. If the dependencies are very specific, noting the modules required, providing a link to those modules, and noting version compatibility is extremely valuable.

License information

If you plan to publicly release your module, it’s a good idea to attach a license to it.

Enterprise users might have constraints placed on the code that can be deployed to their site. A license for the module might be required by their internal policies in order to permit use of the module. Some companies restrict what software licenses are acceptable for use internally, and might not be able to use code under more restrictive licenses such as GPL 3.

If you are writing modules for a business, you might want to clarify with your management or legal team regarding the license or restrictions that should be placed on the code. Some companies are fairly generous with their modules, releasing them publicly after they are scrutinized and sanitized. Other businesses will prefer to keep internally developed Puppet modules proprietary. You will save a lot of headaches by making note of these constraints in your documentation.

If you plan to release the module to the public, a license is an essential way of communicating what other users can do with your code. Even if you are giving the code away for free, the license is what makes this possible. Use the helpful license selector at https://choosealicense.com/ to find the most appropriate license for your needs.

REFERENCE Markdown

The REFERENCE.md file should contain full documentation for the classes, types, providers, facts, and functions provided by the module.

Parameter documentation

Documenting your input parameters is key to writing a usable module. Parameter names are often terse, tend to be numerous, and can often be confusing. When documenting an input parameter, it’s good to provide the following information:

  • The name of the parameter.

  • A brief description of the parameter’s purpose.

  • The types of data accepted.

  • The structure of the data if structured data is accepted.

  • The default value of the parameter.

  • Any constraints on the data enforced by the application, module design, or input validation.

For example:

# @param document_root
#  The path to your site's document root. Defaults to '/var/www/html'

Document every input parameter. Internal or deprecated parameters can be marked with appropriate tags.
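A fuller sketch might document the accepted types, structure, defaults, and constraints next to a typed parameter list. The class name mymodule and the listen_port and aliases parameters are hypothetical, and the Stdlib::Absolutepath type assumes that the puppetlabs-stdlib module is installed:

```puppet
# @summary Manages a hypothetical website.
#
# @param document_root
#   The absolute path to your site's document root.
#   Defaults to '/var/www/html'.
#
# @param listen_port
#   The TCP port the site listens on. Must be between 1 and 65535.
#   Defaults to 80.
#
# @param aliases
#   A hash that maps URL paths to absolute filesystem paths, such as
#   { '/images' => '/srv/images' }. Defaults to an empty hash.
class mymodule (
  Stdlib::Absolutepath               $document_root = '/var/www/html',
  Integer[1, 65535]                  $listen_port   = 80,
  Hash[String, Stdlib::Absolutepath] $aliases       = {},
) {
  # Resources that use these parameters would be declared here.
}
```

Stating a constraint in the comment and enforcing it with a data type keeps the documentation and the validation in the same place, which makes it easier to notice when one changes without the other.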

Inline documentation

Inline documentation is embedded into the code of your module. A number of Puppet tools, including the puppet doc command (up through Puppet 3), the puppet strings command (Puppet 4+), and Puppet plugins for editors and IDEs, can consume and display this documentation. A benefit of inline documentation is that the PDK and other tools can search for the documentation in your module path, and IDEs can display it alongside your code.

In most cases, the inline documentation can be reused to generate your Markdown documentation. puppet strings generates the REFERENCE.md entirely from inline documentation in your module. You can find some documentation for doing this well at https://puppet.com/docs/puppet/latest/puppet_strings.html.
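Inline documentation is not limited to classes. As a minimal sketch, a function written in the Puppet language can carry a summary, a parameter description, a return description, and a usage example in its leading comment block. The function name mymodule::log_path and the log directory are hypothetical:

```puppet
# functions/log_path.pp
#
# @summary Builds the path to a virtual host's log file.
#
# @param vhost
#   The name of the virtual host. Must not be empty.
#
# @return [String]
#   The absolute path of the log file for the virtual host.
#
# @example Look up the log file for a site
#   mymodule::log_path('www.example.com')
function mymodule::log_path(String[1] $vhost) >> String {
  "/var/log/mymodule/${vhost}.log"
}
```

Because the tags live directly above the code they describe, they are more likely to be updated when the function signature changes.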

Summary

In this chapter, you learned how to write clean modules that conform to the best practices discussed in Chapter 3.

Here are the main takeaways from this chapter:

  • Install the PDK to install and test modules.

  • Use an editor or IDE with syntax highlighting for Ruby and Puppet.

  • Carefully design your module to limit its scope.

  • Structure your module to make it easy to understand and simple to deploy.

  • Design interfaces into your module and use them.

  • Document your module for future maintenance.

Last of all, you should always test your modules. We’ve created an in-depth guide for testing modules in “Testing”.

1 This statement seems obvious now, but in the past it was common to use global variables to pass data into modules.
