Regular Expressions of Particular Interest to Webbot Developers

Now that you have a basic understanding of how regular expressions are used, let’s look at how they can make your job as a webbot developer a little easier.

Parsing Phone Numbers

Let’s assume that you need to write a parsing program that retrieves all the phone numbers on a web page. The first step is to think about the formats that phone numbers may take. This sounds easy, but you might want to consider these questions:

  • Do you want to include toll-free phone numbers?

  • Are you targeting phone numbers from a particular country?

  • What do you do with multiple copies of the same phone number?

  • Do you want to include country codes?

  • How do you want to deal with alpha characters in phone ...
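To show how answers to these questions shape a regular expression, here is a minimal Python sketch. It rests on assumptions that are not part of the original text: the target is North American numbers (three-digit area code, seven-digit local number), a leading 1 or +1 country code is optional and is discarded, duplicates are collapsed by comparing digits only, and toll-free numbers are recognized by area code. The names PHONE_RE, TOLL_FREE_AREA_CODES, and extract_phone_numbers are hypothetical, and numbers written with letters are not handled.

```python
import re

# Assumption: North American formats such as 612-555-0100, (612) 555-0100,
# and 612.555.0100, optionally preceded by a 1 or +1 country code.
PHONE_RE = re.compile(
    r"""
    (?<!\d)                   # do not start in the middle of a longer digit run
    (?:\+?1[\s.-]?)?          # optional country code, which is discarded
    (?:\((\d{3})\)|(\d{3}))   # area code, with or without parentheses
    [\s.-]?                   # optional separator
    (\d{3})                   # exchange
    [\s.-]?                   # optional separator
    (\d{4})                   # line number
    (?!\d)                    # do not stop in the middle of a longer digit run
    """,
    re.VERBOSE,
)

# Area codes reserved for toll-free service in North America.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}


def extract_phone_numbers(text, include_toll_free=True):
    """Return unique ten-digit numbers in order of first appearance."""
    seen = set()
    numbers = []
    for match in PHONE_RE.finditer(text):
        area = match.group(1) or match.group(2)
        digits = area + match.group(3) + match.group(4)
        if not include_toll_free and area in TOLL_FREE_AREA_CODES:
            continue
        if digits in seen:
            continue  # same number in a different format
        seen.add(digits)
        numbers.append(digits)
    return numbers


if __name__ == "__main__":
    sample = (
        "Call (612) 555-0100 or 612.555.0100. "
        "Toll free: 1-800-555-0199. Fax +1 612 555 0142."
    )
    print(extract_phone_numbers(sample))
    print(extract_phone_numbers(sample, include_toll_free=False))
```

Reducing each match to its digits before comparing is what lets the two differently formatted copies of the first number count as one. A different country, or a requirement to keep the country code, would call for a different pattern.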
