Converting IP Addresses

Before we jump into log-file analysis, let’s return briefly to the problem of doing hostname lookups on the IP addresses that most likely make up the “host” entries in our web access logs. Example 8-1 gives a script, clf_lookup.plx, that does just that. (Like all the examples in this book, it is available for download from the book’s web site, at http://www.elanus.net/book/.)

Example 8-1. A script to do hostname lookups on IP addresses in web access logs

#!/usr/bin/perl -w

# clf_lookup.plx

# given common or extended-format web logs on STDIN, outputs 
# them with numeric IP addresses in the first (host) field converted 
# to hostnames (where possible).

use strict;
use Socket;

my %hostname;

while (<>) {
    my $line = $_;
    my($host, $rest) = split / /, $line, 2;
    if ($host =~ /^\d+\.\d+\.\d+\.\d+$/) {
        # looks vaguely like an IP address
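        unless (exists $hostname{$host}) {
            # no key, so haven't processed this IP before
            # (inet_aton returns undef for things like 999.1.1.1)
            my $packed = inet_aton($host);
            $hostname{$host} = defined $packed
                ? gethostbyaddr($packed, AF_INET) : undef;
        }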
        if ($hostname{$host}) {
            # only processes IPs with successful lookups
            $line = "$hostname{$host} $rest";
        }
    }
    print $line;
}
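
To see how the loop picks apart each log line, here is a minimal sketch that applies the same split and pattern match to a few made-up entries, without doing any lookups. The sample log lines and the helper name looks_like_ip are hypothetical, invented for illustration; only the split and the regular expression come from Example 8-1.

```perl
#!/usr/bin/perl -w

# Sketch only: the sample log lines and looks_like_ip() are made up
# for illustration; the split and the regex mirror Example 8-1.

use strict;

# Same test the script uses: four dot-separated runs of digits.
sub looks_like_ip {
    my $host = shift;
    return $host =~ /^\d+\.\d+\.\d+\.\d+$/ ? 1 : 0;
}

my @sample = (
    qq{192.0.2.10 - - [01/Jan/2001:00:00:00 -0800] "GET / HTTP/1.0" 200 1024\n},
    qq{www.example.com - - [01/Jan/2001:00:00:01 -0800] "GET / HTTP/1.0" 200 1024\n},
    qq{999.999.999.999 - - [01/Jan/2001:00:00:02 -0800] "GET / HTTP/1.0" 200 1024\n},
);

foreach my $line (@sample) {
    # A limit of 2 splits at the first space only, so $rest keeps
    # the remainder of the line (including its newline) intact.
    my($host, $rest) = split / /, $line, 2;
    printf "%-16s %s\n", $host,
        looks_like_ip($host) ? 'treat as IP' : 'leave alone';
}
```

The third entry passes the pattern even though it isn't a valid address, which is why the comment in the script says "vaguely". The lookup for such a host should fail, leaving the line to be printed unchanged.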

The script itself is pretty simple, but it introduces some new concepts that are definitely worth learning about. The first new thing is this line:

use Socket;

Here we are importing a module called Socket.pm. Just as we did earlier, when we pulled in the CGI.pm module, we’re doing this in order to let some more experienced programmers do our dirty work for us. Specifically, ...
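
As a rough illustration of what the names imported from Socket do in the script, the following sketch exercises inet_aton and AF_INET on their own. The address 127.0.0.1 is an arbitrary example, and inet_ntoa (another Socket function, not used in Example 8-1) appears only to show the round trip. Whether the lookup returns a name depends on how the local resolver is configured.

```perl
#!/usr/bin/perl -w

# Sketch only: 127.0.0.1 is an arbitrary example address, and
# inet_ntoa is not part of Example 8-1.

use strict;
use Socket;    # exports inet_aton, inet_ntoa, and AF_INET by default

my $ip = '127.0.0.1';

# inet_aton packs a dotted-quad string into the 4-byte form that
# gethostbyaddr expects; it returns undef if it can't make sense
# of its argument.
my $packed = inet_aton($ip);
die "couldn't pack $ip\n" unless defined $packed;

printf "%s packs into %d bytes\n", $ip, length $packed;
printf "and unpacks back to %s\n", inet_ntoa($packed);

# In scalar context gethostbyaddr returns the hostname, or undef
# when the lookup fails.
my $name = gethostbyaddr($packed, AF_INET);
print defined($name) ? "$ip resolves to $name\n"
                     : "no hostname found for $ip\n";
```

Without the use Socket line, none of these names would be defined (apart from the built-in gethostbyaddr), and under use strict the script would fail to compile.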
