Improvements Since Then

We still have the same basic system in place today. However, we have made a few enhancements.

One key enhancement is that an event-driven system now populates the read-only database. It uses Gearman, a distributed job queue system, in place of scheduled scripts run via cron. This allows content to be pushed to the frontend as soon as it is ready. At first, we feared this would create an unpredictable load on our primary database, but the opposite turned out to be true. The scheduled data migration had been causing surges in database work: the server would sit idle for minutes and then be flooded with work to perform all at once. The servers that moved the data saw similar peaks and valleys in their usage. If anything unexpected happened during one of these peaks, the database server would become unresponsive. Now the load is spread out, tracking the pace at which our content team creates new content.
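The event-driven pattern can be illustrated with a minimal sketch. This is not our production code and does not use Gearman itself: Python's standard-library `queue.Queue` stands in for the job queue, two dictionaries stand in for the primary and read-only databases, and the names `save_content`, `worker`, and `run_demo` are hypothetical. The point is only that each save enqueues one small job, so the work arrives at the pace content is created instead of in scheduled bursts.

```python
import queue
import threading

# Hypothetical in-memory stand-ins for the primary and read-only databases.
primary_db = {}
read_only_db = {}

# Stand-in for the job queue (Gearman in the text): one job per saved item.
publish_jobs = queue.Queue()


def save_content(content_id, body):
    """Write to the primary store, then enqueue a publish job for that item."""
    primary_db[content_id] = body
    publish_jobs.put(content_id)


def worker():
    """Copy each queued item to the read-only store as soon as its job arrives."""
    while True:
        content_id = publish_jobs.get()
        if content_id is None:  # sentinel value: shut the worker down
            publish_jobs.task_done()
            break
        read_only_db[content_id] = primary_db[content_id]
        publish_jobs.task_done()


def run_demo():
    """Start a worker, save two items, and return what reached the read-only store."""
    thread = threading.Thread(target=worker)
    thread.start()
    save_content("story-1", "First draft")
    save_content("story-2", "Second story")
    publish_jobs.join()     # block until every queued job has been processed
    publish_jobs.put(None)  # ask the worker to exit
    thread.join()
    return dict(read_only_db)
```

Calling `run_demo()` returns a copy of the read-only store containing both stories. In a cron-based design, the same two items would instead wait for the next scheduled run and be copied together with everything else that had changed.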

Another enhancement is that we now operate two datacenters. The second datacenter has a full set of frontend proxy, application, and read-only database servers, and its database is replicated over VPN from our primary datacenter. We use geo-IP-based DNS to send users to their closest datacenter. We can also shift all requests to one datacenter after a short DNS timeout, which lets us conduct major maintenance at one location and still serve content to our readers. ...
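The routing decision described here can be sketched in a few lines. This is a simplified illustration, not a DNS server: the region names, the datacenter labels, and the addresses (taken from the reserved documentation ranges) are all assumptions, and the `resolve` function is hypothetical. In a real deployment the DNS service makes this choice, and a shift takes effect once cached records expire.

```python
# Hypothetical datacenter addresses, using reserved documentation IP ranges.
DATACENTERS = {
    "primary": "192.0.2.10",
    "secondary": "198.51.100.10",
}

# Hypothetical mapping from a user's geo-IP region to the closest datacenter.
REGION_TO_DATACENTER = {
    "us-east": "primary",
    "us-west": "secondary",
}


def resolve(region, offline=frozenset()):
    """Return the address for a region, skipping datacenters marked offline."""
    preferred = REGION_TO_DATACENTER.get(region, "primary")
    if preferred not in offline:
        return DATACENTERS[preferred]
    # The preferred datacenter is under maintenance: use any remaining one.
    for name, address in DATACENTERS.items():
        if name not in offline:
            return address
    raise RuntimeError("no datacenter available")
```

With both datacenters online, `resolve("us-west")` returns the secondary address. Passing `offline={"secondary"}` sends the same users to the primary datacenter, which is the shift that makes maintenance on one location possible.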
