A Single-Search Results Tool
To give you an idea of how the WWW::Search module works, we’ll start with a simple script that runs from the command line. This script will let us use arguments to specify which search engine to search, what query to submit to it, and so on. The script, called search_rank.plx, is in Example 18-1.
Example 18-1. Querying search engines from the command line with WWW::Search
#!/usr/bin/perl -w

# search_rank.plx

# using the WWW::Search module, compute the rank of the highest-ranked
# page for a particular site when searching a particular search
# engine for a particular query string.

use strict;
use WWW::Search;
use Getopt::Std;

my %opt;
getopts('s:u:q:m:', \%opt);

unless ($opt{s} and $opt{u} and $opt{q}) {
    die <<"EOF";

Usage: $0 [options]

Required options:

    -s search_engine
    -u base_url
    -q 'search query'

Optional options:

    -m max_#_to_retrieve (defaults to 50)

EOF
}

my $max = $opt{m} || 50;

my $search = new WWW::Search($opt{s});
$search->maximum_to_retrieve($max);
my $base_url = quotemeta($opt{u});
my $rank     = 0;
my $count    = 1;

$search->native_query(WWW::Search::escape_query($opt{q}));
while (my $result = $search->next_result( )) {
    if (not $rank and $result->url =~ /$base_url/o) {
        $rank = $count;
    }
    print "$count: ", $result->title || $result->url, ', ', $result->url, "\n";
    ++$count;
}

print "Rank: $rank\n";
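The ranking step in Example 18-1 can be tried without a network connection or the WWW::Search module. The following is a minimal sketch, not part of the original script: the first_rank subroutine and the list of URLs are hypothetical stand-ins for the loop over $search->next_result. It shows how quotemeta makes the dots in the base URL match literally, and how the rank stays at 0 when no result matches.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# first_rank is a hypothetical helper (not in Example 18-1).
# It returns the 1-based position of the first URL that contains
# $site, or 0 if none does, like the script's "Rank: 0" case.
sub first_rank {
    my ($site, @urls) = @_;
    my $pattern = quotemeta($site);    # "." now matches only a literal dot
    my $count   = 1;
    for my $url (@urls) {
        return $count if $url =~ /$pattern/;
        ++$count;
    }
    return 0;
}

# Made-up result URLs standing in for what a search engine might return.
my @urls = (
    'http://www.example.org/index.html',
    'http://www.example.com/perl/',
    'http://www.example.com/perl/faq.html',
);

print "Rank: ", first_rank('www.example.com', @urls), "\n";    # Rank: 2
print "Rank: ", first_rank('www.example.net', @urls), "\n";    # Rank: 0
```

The sketch leaves out the /o modifier that Example 18-1 uses. With /o the pattern is interpolated only once, which suits a script that checks a single base URL but would make a reusable subroutine keep matching against the first site it was given.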
Using the Getopt::Std Module
As you scan through this script, the first interesting thing you’ll notice is the use of the Getopt::Std module. This is a standard ...
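As a rough illustration of how the getopts call in Example 18-1 fills the %opt hash, here is a minimal sketch that sets @ARGV by hand instead of reading a real command line. The engine name, URL, and query are placeholder values chosen for this example. In the option string 's:u:q:m:', each colon means that the preceding switch takes an argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# Simulate a command line; normally the shell fills in @ARGV.
# All three values below are placeholders, not taken from the book.
@ARGV = ('-s', 'SomeEngine', '-u', 'www.example.com', '-q', 'perl web management');

my %opt;
getopts('s:u:q:m:', \%opt) or die "Unrecognized option\n";

# -m was not supplied, so $opt{m} is undefined and the default applies.
my $max = $opt{m} || 50;

print "engine: $opt{s}\n";
print "url:    $opt{u}\n";
print "query:  $opt{q}\n";
print "max:    $max\n";
```

Because getopts removes the switches it recognizes from @ARGV, anything left in the array afterward is a non-option argument.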