Web Scraping with Python by Ryan Mitchell

Chapter 13. Testing Your Website with Scrapers

When working with web projects that have a large development stack, it’s often only the “back” of the stack that ever gets tested regularly. Most programming languages today (including Python) have some type of test framework, but website front ends are often left out of these automated tests, even though they might be the only customer-facing part of the project.

Part of the problem is that websites are often a mishmash of many markup languages and programming languages. You can write unit tests for sections of your JavaScript, but those tests are useless if the HTML the JavaScript interacts with has changed in such a way that the JavaScript no longer has the intended effect on the page, even though the code itself is working correctly.

The problem of front-end website testing has often been left as an afterthought, or delegated to lower-level programmers armed with, at most, a checklist and a bug tracker. However, with just a little more up-front effort, we can replace this checklist with a series of unit tests, and replace human eyes with a web scraper. 

Imagine: test-driven development for web development. Daily tests to make sure all parts of the web interface are functioning as expected. A suite of tests run every time someone adds a new website feature, or changes the position of an element. In this chapter, we’ll cover the basics of testing and how to test all sorts of websites, from simple to complicated, with Python-based web scrapers.
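
To make the idea of swapping a manual checklist for unit tests concrete, here is a minimal sketch rather than a definitive implementation. It assumes a hypothetical page (the `SAMPLE_PAGE` markup, the `signup` form id, and the `PageSummary` and `summarize` helpers are all invented for illustration) and uses only Python's standard library `unittest` and `html.parser` modules. The markup is inlined so the tests run without a network connection; a real test would fetch the page first.

```python
import unittest
from html.parser import HTMLParser

# Hypothetical page markup, inlined so the sketch runs offline.
# In practice this string would come from fetching a live page.
SAMPLE_PAGE = """
<html>
  <head><title>Example Store</title></head>
  <body>
    <h1>Welcome</h1>
    <form id="signup" action="/signup" method="post">
      <input type="email" name="email">
      <button type="submit">Sign up</button>
    </form>
  </body>
</html>
"""


class PageSummary(HTMLParser):
    """Collects the title text, element ids, and tag names seen on a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.ids = set()
        self.tags = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "title":
            self._in_title = True
        element_id = dict(attrs).get("id")
        if element_id:
            self.ids.add(element_id)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is recorded.
        if self._in_title:
            self.title += data


def summarize(html):
    """Parse an HTML string and return the populated PageSummary."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser


class TestFrontPage(unittest.TestCase):
    """Each test stands in for one line of a manual QA checklist."""

    def setUp(self):
        self.page = summarize(SAMPLE_PAGE)

    def test_title(self):
        self.assertEqual(self.page.title.strip(), "Example Store")

    def test_signup_form_exists(self):
        self.assertIn("signup", self.page.ids)

    def test_single_h1(self):
        self.assertEqual(self.page.tags.count("h1"), 1)
```

Saved to a file, the tests can be run with `python -m unittest` followed by the file name. If someone renames the form's id or removes the heading, the corresponding test fails, which is the kind of regression a human checklist is meant to catch.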

An Introduction to ...