Here at Safari Books Online, we are building an increasingly robust infrastructure, and we have made automation a top priority so that we spend our time on productive projects instead of fighting fires. As we grow to support new and different types of web-based apps and clusters, not to mention desktop workstations and development VMs, it has become important to manage our infrastructure with configuration management software. For this job, we've chosen the open source Chef server to keep our systems tuned, and we've taken a special interest in building tools that give us the operational awareness to quickly and easily verify that services are behaving as expected. Over in the IT operations team, we had previously been sold on the benefits of Behavior- and Test-Driven Development by our friends in Engineering, and in recent months we have begun to include mechanisms for testing our code changes in the configuration management software we use throughout our infrastructure.
We operate on the general principle that if you don't test for a condition and you don't have an alert for it, you can't reasonably claim to know it's working properly. Moreover, in the spirit of doing the Right Thing exactly once (in other words, DRY: Don't Repeat Yourself), another benefit of infrastructure software testing is that once you've defined your needs and written your tests, you have a pretty good idea of what kinds of alerts you need to create in your monitoring software. For Chef, developers have created tools for both unit tests and integration tests. We looked at a number of them, scouring the Food Fight Show podcast archives for any references to testing and using narrow search terms to dredge up various Opscode mailing list threads about testing from the past year or so.
At first, we experimented with chefspec and, in fact, wanted to use it and write primarily unit tests for local testing, because it seemed like the most lightweight way to test whether a cookbook behaves as expected. It was simple to get up and running, but we quickly ran into pain with stubs and mocks. Like the other testing tools, chefspec is still under active development and may eventually find its way into our toolbox. Around the time we were having difficulty integrating it into our workflow, though, one of our developers attended DevOps Days in New York and spoke with Sean O'Meara from Opscode, who demonstrated minitest-chef-handler, which painlessly facilitates integration tests. In fact, the adoption of minitest-chef-handler led us to a broader revelation about how we should expect to use Chef.
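To give a flavor of the unit-test style, here is a minimal chefspec sketch. The cookbook, package, and service names are illustrative, not from our actual cookbooks, and the runner class name varies between chefspec releases (older versions use `ChefSpec::ChefRunner` with `should`-style expectations):

```ruby
# spec/default_spec.rb -- a hypothetical chefspec example.
# Nothing here converges a real node; the chef run is simulated in memory.
require 'chefspec'

describe 'mycookbook::default' do
  let(:chef_run) { ChefSpec::Runner.new.converge('mycookbook::default') }

  it 'installs the web server package' do
    expect(chef_run).to install_package('nginx')
  end

  it 'enables and starts the service' do
    expect(chef_run).to enable_service('nginx')
    expect(chef_run).to start_service('nginx')
  end
end
```

Because the run is simulated, anything a recipe reads from the outside world (search results, data bags, ohai attributes) has to be stubbed or mocked, which is exactly where we hit friction.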
As we experimented with the different testing and workflow tools, we were, maybe without even realizing it, looking for The Right Way. Indeed, our initial bias towards chefspec was rooted in the idea that we would follow BDD/TDD philosophy closely and write the tests before the cookbooks, which is theoretically easier with unit tests, since they are roughly simulated chef runs and typically more abstract than integration tests. After coming to know and love minitest-chef-handler so easily, along with the minitest-handler cookbook, it became apparent to us that the way to get the end results we want (in early 2013, at least) is to accept that Chef and its periphery are in a period of rapid development, particularly when it comes to workflow and testing tools. As a result, we decided to stop worrying and learn to love transient infrastructure tooling.
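By contrast, a minitest-chef-handler test runs at the end of a real chef-client run and asserts against the actual state of the node. A sketch of what one looks like, again with illustrative cookbook and resource names (the minitest-handler cookbook conventionally picks these tests up from the cookbook's files directory):

```ruby
# files/default/tests/minitest/default_test.rb -- a hypothetical
# minitest-chef-handler spec. These assertions inspect the converged
# node itself, not a simulation of the run.
require 'minitest/spec'

describe_recipe 'mycookbook::default' do
  it 'installs the web server package' do
    package('nginx').must_be_installed
  end

  it 'keeps the service running' do
    service('nginx').must_be_running
  end

  it 'ships the expected config file' do
    file('/etc/nginx/nginx.conf').must_exist
  end
end
```

Because these tests describe observable end states rather than resource declarations, they double as a checklist for the monitoring alerts we mentioned earlier.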
Currently, we have a workflow in which the master branch of each cookbook's repo can always be deployed, and we run integration tests on a staging node before pushing out to production. We have a few safeguards in place to prevent bad cookbooks from reaching production, but we certainly don't have fully automated infrastructure at this point. We do, however, have a workflow that supports multiple developers and gives them somewhere other than production to test their cookbooks, and they can do so without undue psychological overhead. Moreover, to get to where we are now, we had to get specific about how we want to interact with Chef and what we want it to do, which necessarily led us to determine how to build out Chef comfortably while keeping confidence in our infrastructure.
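In broad strokes, the staging-before-production loop looks something like the following. This is a hypothetical sketch, not our exact pipeline; the cookbook, environment, and host names are made up, and it assumes Chef environments are used to pin cookbook versions:

```shell
# Push the new cookbook version to the Chef server.
knife cookbook upload mycookbook

# Pin the staging environment to the new version (edit the version
# constraint for mycookbook in the editor that opens).
knife environment edit staging

# Converge a staging node; the minitest-handler report handler runs the
# integration tests at the end of the run and fails loudly if they fail.
ssh staging-node.example.com 'sudo chef-client'

# Only once the tests pass on staging, promote the pin to production.
knife environment edit production
```

The environment pins are the main safeguard here: a freshly uploaded cookbook version can't reach production nodes until someone deliberately moves the production constraint forward.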
In case it seems coy to describe our environment and development pipeline only obliquely, it's specifically to avoid muddying the somewhat confusing landscape of Chef testing software. Indeed, "Chef" as a metaphor holds up well, because what we have learned so far is that it helps to know how to cook with the various ingredients before deciding which ones you want in your entrées. While we were evaluating tools, we doubtless benefited from many discussions within the Chef user community, and even today we have our eyes on adopting other testing tools in the near-ish future (Test Kitchen is coming together and integrates with Berkshelf). We wanted to share our story, though, so that we might give hope to other Chefs by saying that we have implemented an imperfect system and are embracing the experience as real-world iterative development.