Create a comma-delimited file from a list of phone numbers returned by Google.
Just because Google's API doesn't support the phonebook: [Hack #17] syntax doesn't mean that you can't make use of Google phonebook data.
This simple Perl script takes a page of Google phonebook: results and produces a comma-delimited text file suitable for import into Excel or your average database application. The script doesn't use the Google API, though, because the API doesn't yet support phonebook lookups. Instead, you'll need to run the search in your trusty web browser and save the results to your computer's hard drive as an HTML file. Point the script at the HTML file and it'll do its thing.
Which results should you save? You have two choices, depending on which syntax you're using:
If you're using the phonebook: syntax, save the second page of results, reached by clicking the "More business listings..." or "More residential listings..." links on the initial results page.
If you're using the bphonebook: or rphonebook: syntax, simply save the first page of results.
Depending on how many pages of results you have, you might have to run the program several times.
Because this program is so simple, you might be tempted to plug this code into a program that uses LWP::Simple to automatically grab result pages from Google, automating the entire process. You should know that accessing Google with automated queries outside of the Google API is against their Terms of Service.
#!/usr/bin/perl
# phonebook2csv
# Google Phonebook results in CSV suitable for import into Excel
# Usage: perl phonebook2csv.pl < results.html > results.csv

# CSV header
print qq{"name","phone number","address"\n};

# Slurp the whole page and split it into chunks at each horizontal rule
my @listings = split /<hr size=1>/, join '', <>;

# Skip the first and last chunks (page header and footer)
foreach (@listings[1..($#listings-1)]) {
    s!\n!!g;       # drop spurious newlines
    s!<.+?>!!g;    # drop all HTML tags
    s!"!""!g;      # double escape " marks
    print '"' . join('","', (split /\s+-\s+/)[0..2]) . "\"\n";
}
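To see what the loop body does to a single listing, here is a minimal sketch that applies the same three substitutions and the same split to one sample string. The HTML fragment and the `listing_to_csv` helper name are hypothetical, made up for illustration; the markup Google actually returned may well have differed. Unlike the script above, the sketch also swaps any missing field for an empty string so that it runs cleanly under `use warnings`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical fragment standing in for one listing from a saved page.
my $listing = qq{<font size=-1><b>John "JD" Doe</b>\n}
            . qq{ - (555) 555-5555 - 1 Doe Street, Doe, OR 99999</font>};

# Hypothetical helper: same steps as the loop body in phonebook2csv.
sub listing_to_csv {
    my ($text) = @_;
    $text =~ s!\n!!g;      # drop spurious newlines
    $text =~ s!<.+?>!!g;   # drop all HTML tags
    $text =~ s!"!""!g;     # double any " marks, as CSV expects
    # Fields are separated by a dash with whitespace on both sides, so the
    # dash inside the phone number is left alone.
    my @fields = (split /\s+-\s+/, $text)[0..2];
    # Assumption beyond the original script: blank out missing fields.
    return '"' . join('","', map { defined $_ ? $_ : '' } @fields) . '"';
}

print listing_to_csv($listing), "\n";
# prints: "John ""JD"" Doe","(555) 555-5555","1 Doe Street, Doe, OR 99999"
```

Because the split pattern requires whitespace around the dash, a name or address that itself contains " - " would be cut into extra fields; the `[0..2]` slice then keeps only the first three.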
Run the script from the command line, specifying the phonebook results HTML filename and the name of the CSV file you wish to create or to which you wish to append additional results. For example, using results.html as our input and results.csv as our output:
$ perl phonebook2csv.pl < results.html > results.csv
Leaving off the > and CSV filename sends the results to the screen for your perusal:
$ perl phonebook2csv.pl < results.html
"name","phone number","address"
"John Doe","(555) 555-5555","Wandering, TX 98765"
"Jane Doe","(555) 555-5555","Horsing Around, MT 90909"
"John and Jane Doe","(555) 555-5555","Somewhere, CA 92929"
"John Q. Doe","(555) 555-5555","Freezing, NE 91919"
"Jane J. Doe","(555) 555-5555","1 Sunnyside Street, Tanning, FL 90210"
"John Doe, Jr.","(555) 555-5555","Beverly Hills, CA 90210"
"John Doe","(555) 555-5555","1 Lost St., Yonkers, NY 91234"
"John Doe","(555) 555-5555","1 Doe Street, Doe, OR 99999"
"John Doe","(555) 555-5555","Beverly Hills, CA 90210"
Using a double >> before the CSV filename appends the current set of results to the CSV file, creating it if it doesn't already exist. This is useful for combining more than one set of results, represented by more than one saved results page:
$ perl phonebook2csv.pl < results_1.html > results.csv
$ perl phonebook2csv.pl < results_2.html >> results.csv
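One side effect of appending is worth knowing: the script prints its header line on every run, so a file built up with >> ends up with one header row per saved page. If that gets in the way of an import, one possible variation is to hand all the saved pages to a single run and print the header once. The sketch below is an assumption-laden rewrite, not part of the original hack: the script name `phonebook2csv_multi.pl` and the `page_to_rows` helper are hypothetical, and each file is processed separately so that the header and footer chunks of every page are still discarded.

```perl
#!/usr/bin/perl
# Sketch: convert several saved result pages in one run, one CSV header.
# Usage (hypothetical script name):
#   perl phonebook2csv_multi.pl results_1.html results_2.html > results.csv
use strict;
use warnings;

# Hypothetical helper: turn one page of HTML into a list of CSV rows,
# using the same steps as phonebook2csv.
sub page_to_rows {
    my ($html) = @_;
    my @listings = split /<hr size=1>/, $html;
    my @rows;
    return @rows if @listings < 3;   # nothing between header and footer
    foreach my $listing (@listings[1 .. $#listings - 1]) {
        $listing =~ s!\n!!g;         # drop spurious newlines
        $listing =~ s!<.+?>!!g;      # drop all HTML tags
        $listing =~ s!"!""!g;        # double any " marks
        my @fields = (split /\s+-\s+/, $listing)[0..2];
        push @rows,
            '"' . join('","', map { defined $_ ? $_ : '' } @fields) . '"';
    }
    return @rows;
}

# CSV header, printed once for the whole run
print qq{"name","phone number","address"\n};

foreach my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    local $/;                        # slurp the whole file at once
    my $html = <$fh>;
    close $fh;
    print "$_\n" for page_to_rows($html);
}
```

The pages still have to be saved by hand from the browser; the sketch only changes how the saved files are combined.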