The Risks of Infrastructure as Code

Although the potential benefits of Infrastructure as Code are hard to overstate, it must be pointed out that this approach is not without its dangers. Production infrastructures that handle high-traffic websites are hugely complicated. Consider, for example, the mix of technologies involved in a large Drupal installation. We might easily have multiple caching strategies, a full-text indexer, a sharded database, and a load-balanced set of webservers. That’s a significant number of moving parts for the engineer to manage and understand.

It should come as no surprise that the attempt to codify complex infrastructures is a challenging task. As I visit clients embracing the approaches outlined in this chapter, I see a lot of problems emerging as they start to put these kinds of ideas into practice.

Here are a few symptoms:

  • Sprawling masses of Puppet or Chef code.

  • Duplication, contradiction, and a lack of clear understanding of what it all does.

  • Fear of change: a sense that we dare not meddle with the manifests or recipes, because we’re not entirely certain how the system will behave.

  • Bespoke software that started off well-engineered and thoroughly tested, but is now littered with TODOs, FIXMEs, and quick hacks.

  • A sense that, despite the lofty goal of capturing the expertise required to understand an infrastructure in the code itself, if one or two key people were to leave, the organization or team would be in trouble.

These issues have their roots in the failure to acknowledge and respond to a simple but powerful side effect of treating our Infrastructure as Code. If our environments are effectively software projects, then it’s incumbent upon us to make sure we’re applying the lessons the software development world has learned in the last ten years as it has striven to produce high-quality, maintainable, and reliable software. It’s also incumbent upon us to think critically about some of the practices and principles that have been effective there, and start to introduce our own practices that embrace the same interests and objectives. Unfortunately, many of those who have embraced Infrastructure as Code have had insufficient exposure to or experience with these ideas.

I have argued elsewhere[5] that there are six areas where we need to focus our attention to ensure that our infrastructure code is developed with the same degree of thoroughness and professionalism as our application code:

Design

Our infrastructure code should be simple and developed iteratively, and we should avoid feature creep.

Collective ownership

All members of the team should be involved in the design and writing of infrastructure code and, wherever possible, code should be written in pairs.

Code review

The team should be set up so that its members pair frequently and see regular notifications when changes are made.

Code standards

Infrastructure code should follow the same community standards as the Ruby world; where standards and patterns have grown up around the configuration management framework, these should be adhered to.

Refactoring

This should happen at the point of need, as part of the iterative and collaborative process of developing infrastructure code; however, it’s difficult to do this without a safety net in the form of thorough test coverage of one’s code.

Testing

Systems should be in place to ensure that one’s code produces the environment needed, and to ensure that our changes have not caused side effects that alter other aspects of the infrastructure.

The first five areas can be implemented with very little technology, given good leadership. However, the final area, that of testing infrastructure, is a difficult endeavor. As such, it is the subject of this book: a manifesto for bravely rethinking how we develop infrastructure code.
