Chapter 6. Monitoring

Patrick Debois

Story: "The Start of a Journey"

As a sys admin, I became fascinated by the powers of the Internet, and with it the power of the Web. Very soon, I started running my own website on one of the university computers. It was a platform for sharing links. I devoured every single computer book and magazine, went to all the user groups and conferences I could, and constantly surfed the Web for tips on how to structure my website. The site became popular, and some students started to rely on it for their daily work.

After I had been running the site for some time, a major power outage hit the campus, and the web server didn't come back up. I got flooded with emails from students asking me to fix the problem. I quickly restarted the server and went back to tuning it and adding content. This was the first downtime the website experienced.

One morning, the website was down again and I found my mailbox full of emails. Dutifully, I restarted the server, but a few hours later the website was down again. I knew the problem wasn't due to a power outage, as everything else was still online. And it wasn't due to the network, as my pings were showing no problems.

To automate checking whether the website was down, I wrote a little script that made a simple HTTP request and sent me an email if the website didn't respond. I was no longer relying on users to detect the problem. Most of the time I would be notified and was able to restart the website before any users ...
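A check of the kind described above (one HTTP request, one email on failure) might look roughly like the following. This is a minimal sketch, not the original script: the function names, the timeout, the "2xx or 3xx means up" rule, and the assumption of a mail server on localhost are all illustrative choices.

```python
import http.client
import smtplib
import urllib.request
from email.message import EmailMessage


def site_is_up(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if one HTTP GET to `url` yields a 2xx or 3xx response.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException, ValueError):
        # OSError covers URLError/HTTPError (4xx, 5xx), refused connections,
        # DNS failures, and timeouts; ValueError covers malformed URLs.
        return False


def send_alert(url, sender, recipient, smtp_host="localhost"):
    """Email a short 'site down' notice (assumes an SMTP server on smtp_host)."""
    msg = EmailMessage()
    msg["Subject"] = f"Website down: {url}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"No healthy HTTP response from {url}.")
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


def check_and_alert(url, notify, probe=site_is_up):
    """Probe the site once and call notify(url) only if it looks down."""
    up = probe(url)
    if not up:
        notify(url)
    return up


# Example wiring (hypothetical addresses), typically run from cron:
# check_and_alert(
#     "http://www.example.org/",
#     lambda u: send_alert(u, "monitor@example.org", "admin@example.org"),
# )
```

Scheduling is left to an external tool such as cron. A script like this only reports that the site stopped answering; it says nothing about why.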
