
Monitoring Web Performance Using Perl

We can easily expand on the Perl example shown earlier to create a useful monitoring system. This section shows how I set up an automated system to monitor web performance using Perl and gnuplot.

There are commercial tools that can drive a browser, which are useful in some cases, but they have many drawbacks. They usually require you to learn a proprietary scripting language. They are usually Windows-only programs, so they are hard to run from a command line. This also means you generally cannot run them through a firewall, or from a Unix cron job. They are hard to scale up into load tests because they drive individual browsers, meaning you have to load a whole browser on a PC for each test client. Most do not display their results on the Web. Finally, they are very expensive. A Perl and gnuplot solution overcomes all these problems.

Perl was chosen over Java partly because of its superior string-handling abilities and partly because of the nifty LWP library, but mostly because free SSL implementations exist for Perl. When I started monitoring, there were no free SSL libraries for Java, though at least one free Java implementation is now available.

Plotting Results with Gnuplot

gnuplot , from http://www.gnuplot.org/ (no relation to the GNU project), was chosen for plotting because you can generate Portable Network Graphics (PNG) images from its command line. The availability of the http://www.gnuplot.org/ site has been poor recently, but I keep a copy of gnuplot for Linux on my web site http://patrick.net/software/. There is a mirror of the gnuplot web site at http://www.ucc.ie/gnuplot/.

At first I used Tom Boutell’s GIF library linked to gnuplot to generate GIF images, but Tom has withdrawn the GIF library from public circulation, presumably because of an intellectual property dispute with Unisys, which has a patent on the compression scheme used in the GIF format. PNG format works just as well as GIF and has no such problems, though older browsers may not understand the PNG format. The gd program, also from Tom Boutell, and its Perl adaptation by Lincoln Stein, are probably just as suitable for generating graphs on the fly as is gnuplot, but I haven’t tried them.

gnuplot takes commands from standard input, or from a configuration file, and plots in many formats. I show an example gnuplot configuration file with the Perl example below. You can start gnuplot and just type help for a pretty good explanation of its functions, or read the gnuplot web site at http://www.gnuplot.org/. I've also made multiple GIF images into animations using the free gifsicle tool. Another easy way to make animations is with the X-based animate command. I'm still looking for a portable, open source, open-standard way to pop up graph coordinates, select parts of images, zoom, flip, stretch, and edit images directly on a web page; if you hear of such a thing, please write.
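For example, here is a minimal sketch of driving gnuplot directly from Perl through a pipe, without a configuration file. It assumes gnuplot is installed at /usr/local/bin/gnuplot and just plots a sine curve as a sanity check:

 # Open a pipe to gnuplot and feed it commands on standard input.
 open(GP, "| /usr/local/bin/gnuplot") || die "Could not start gnuplot\n";
 print GP "set term png color\n";
 print GP "set output \"sanity.png\"\n";
 print GP "plot sin(x) title \"gnuplot sanity check\"\n";
 close(GP);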

An Example Monitoring Script in Perl

It is easy to grab a web page in Perl using the LWP library. The harder parts are dealing with proxies, handling cookies, handling SSL, and handling login forms. The following script handles cookies, SSL, and a login form, and adding proxy support takes only a couple more lines (see the sketch after the listing). Here's the basic code for getting the home page, logging in, logging out, and graphing all the times. I try to run my monitoring and load testing from a machine that sits on the same LAN as the web server. This way, I know that network latency is not the bottleneck and that I have plenty of network capacity to run big load tests.

#!/usr/local/bin/perl -w

use LWP::UserAgent;
use Crypt::SSLeay;
use HTTP::Cookies;
use HTTP::Headers;
use HTTP::Request;
use HTTP::Response;
use Time::HiRes 'time','sleep';

# constants:
$DEBUG = 0;
$browser = 'Mozilla/4.04 [en] (X11; I; Patrix 0.0.0 i586)';
$rooturl = 'https://patrick.net';
$user = "pk";
$password = "pw";
$gnuplot = "/usr/local/bin/gnuplot";

# global objects:
$cookie_jar = HTTP::Cookies->new;
$ua = LWP::UserAgent->new;

MAIN: {
 $ua->agent($browser); # This sets browser for all uses of $ua.
 # home page
 $latency = &get("/home.html");
 # Verify that we got the page we expected.
 $latency = -1 unless index($response->content, "<title>login page</title>") > -1;
 &log("home.log", $latency);
 sleep 2;

 $content = "user=$user&passwd=$password";

 # log in
 $latency = &post("/login.cgi", $content);
 $latency = -1 unless $response->content =~ m|<title>welcome</title>|;
 &log("login.log", $latency);
 sleep 2;

 # content page
 $latency = &get("/content.html");
 $latency = -1 unless $response->content =~ m|<title>the goodies</title>|;
 &log("content.log", $latency);
 sleep 2;

 # logout
 $latency = &get("/logout.cgi");
 $latency = -1 unless $response->content =~ m|<title>bye</title>|;
 &log("logout.log", $latency);

 # plot it all
 `$gnuplot /home/httpd/public_html/demo.gp`;
}

sub get {
 local ($path) = @_;

 $request = new HTTP::Request('GET', "$rooturl$path");

 # If we have a previous response, put its cookies in the new request.
 if ($response) {
     $cookie_jar->extract_cookies($response);
     $cookie_jar->add_cookie_header($request);
 }

 if ($DEBUG) {
     print $request->as_string(  );
 }

 # Do it.
 $start = time(  );
 $response = $ua->request($request);
 $end = time(  );
 $latency = $end - $start;

 if (!$response->is_success) {
     print $request->as_string(  ), " failed: ", $response->error_as_HTML;
 }

 if ($DEBUG) {
     print "\n################################ Got $path and result was:\n";
     print $response->content;
     print "################################ $path took $latency seconds.\n";
 }

 $latency;
}

sub post {
 local ($path, $content) = @_;

 $header = new HTTP::Headers;
 $header->content_type('application/x-www-form-urlencoded');
 $header->content_length(length($content));

 $request = new HTTP::Request('POST',
                              "$rooturl$path",
                              $header,
                              $content);

 # If we have a previous response, put its cookies in the new request.
 if ($response) {
     $cookie_jar->extract_cookies($response);
     $cookie_jar->add_cookie_header($request);
 }

 if ($DEBUG) {
     print $request->as_string(  );
 }

 # Do it.
 $start = time(  );
 $response = $ua->request($request);
 $end = time(  );
 $latency = $end - $start;

 if (!$response->is_success) {
     print $request->as_string(  ), " failed: ", $response->error_as_HTML;
 }

 if ($DEBUG) {
     print "\n################################## Got $path and result was:\n";
     print $response->content;
     print "################################## $path took $latency seconds.\n";
 }

 $latency;
}

# Write log entry in format that gnuplot can use to create an image.
sub log {
 local ($file, $latency) = @_;
 $date = `date +'%Y %m %d %H %M %S'`;
 chomp $date;
 # Corresponds to the gnuplot command: set timefmt "%Y %m %d %H %M %S"

 open(FH, ">>$file") || die "Could not open $file\n";
 
 # Format printing so that we get only 4 decimal places.
 printf FH "%s %2.4f\n", $date, $latency;

 close(FH);
}
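As written, the script assumes a direct connection to the web server. If your monitoring machine has to reach the server through a proxy, LWP::UserAgent can pick up the standard proxy environment variables or take an explicit proxy setting; a request timeout is also worth setting, so that a hung server doesn't stall the cron job. Here is a minimal sketch of lines you could add just after $ua is created (the proxy host and port are made-up examples):

 # Honor the http_proxy/https_proxy environment variables, if set.
 $ua->env_proxy;

 # Or name a proxy explicitly (hypothetical host and port):
 # $ua->proxy(['http', 'https'], 'http://proxy.example.com:8080/');

 # Give up on any single request after 30 seconds.
 $ua->timeout(30);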

This gives us a set of log files with timestamps and latency readings. To generate a graph from that, we need a gnuplot configuration file. Here’s the gnuplot configuration file for plotting the home page times:

set term png color
set output "/home/httpd/public_html/demo.png"
set xdata time
set ylabel "latency in seconds"
set bmargin 3
set logscale y
set timefmt "%Y %m %d %H %M %S"
plot "demo.log" using 1:7 title "time to retrieve home page"

Note that I set the output to write a PNG image directly into my web server's public_html directory. This way, I merely click on a bookmark in my browser to see the output. In the plot command, the six fields of the timestamp count as columns 1 through 6, so the latency reading is column 7; that is why the plot line says using 1:7. Now I just set up a cron job to run my script every minute, and I have a log of my web page's performance and a constantly updated graph.

Use crontab -e to modify your crontab file. Here's an example entry from my crontab file. (If you're not familiar with Unix cron jobs, enter man crontab for more information.)

# MIN   HOUR   DOM    MOY    DOW   Commands
#(0-59) (0-23) (1-31) (1-12) (0-6) (Note: 0=Sun)
*       *      *      *      *     cd /home/httpd/public_html; ./monitor.pl
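cron mails any output the job produces to the crontab's owner, so once the script is debugged you may want to discard its output instead; for example:

*       *      *      *      *     cd /home/httpd/public_html; ./monitor.pl > /dev/null 2>&1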

Figure 4-1 shows an example output image from a real site that I monitored for over a year.

Figure 4-1. Graph of monitored site

One small problem with this approach is clear if you repeatedly get the same page and look closely at the timings. The first time you get a page it takes about 200 milliseconds longer than each subsequent access using the same Perl process. I attribute this to Perl’s need to create the appropriate objects to hold the request and response. Once it has done that, it doesn’t need to do it again.
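If you want the logged numbers to reflect steady-state performance rather than this one-time setup cost, one option is to make an untimed throwaway request before the measurements start. A minimal sketch, reusing the script's own subroutines:

 # Prime Perl's objects with a throwaway request whose
 # latency we deliberately do not log.
 &get("/home.html");
 sleep 2;

 # Subsequent, logged measurements now avoid the ~200 ms setup cost.
 $latency = &get("/home.html");
 &log("home.log", $latency);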

Instead of running from cron, you can turn your monitoring script into a functional test by popping up each page in a Netscape browser as you get it. That way you can watch the monitoring as it happens and visually verify that each page is correct, in addition to checking for a particular string on the page in Perl. For example, from within Perl, you can pop up the http://patrick.net/ page in Netscape like this:

system "netscape -remote 'openURL(http://patrick.net)'";

You can redirect the browser display to any Unix machine running the X Window System, or to any Microsoft Windows machine running an X server emulator such as Exceed. Controlling Netscape from a script is described at http://home.netscape.com/newsref/std/x-remote.html.
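One caveat: netscape -remote makes the browser fetch the URL again, so what you see may not be byte-for-byte what Perl received. To display exactly the content your script got, you can write it to a temporary file and have Netscape open that instead; here is a sketch (the filename /tmp/monitor.html is an arbitrary choice):

 # Save the content Perl actually received, then display that file,
 # rather than having Netscape re-fetch the URL over the network.
 open(TMP, ">/tmp/monitor.html") || die "Could not write /tmp/monitor.html\n";
 print TMP $response->content;
 close(TMP);
 system "netscape -remote 'openURL(file:/tmp/monitor.html)'";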

The Pieces

Here is a listing of all the pieces you need to use Perl to monitor your web site. It takes work to get and compile each piece, but once you have them, you have enormous power to write many kinds of monitoring and load testing scripts. I know that compiling in the following order works, but some other orders might work as well. Except for gcc and Perl, these pieces are all available on my web site at http://patrick.net/software/. Perl is available from http://www.perl.com/ and gcc is available from ftp://prep.ai.mit.edu/ as well as many other sites around the world.

gcc 
perl 5.004_04 or better 
openssl-0.9.4
Crypt-SSLeay-0.15
Time-HiRes-01.20
MIME-Base64-2.11
URI-1.03
HTML-Parser-2.23
libnet-1.0606 
Digest-MD5-2.07 
libwww-perl-5.44 
gnuplot
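Once everything is built, a quick way to confirm that Perl can find the modules it needs is a one-liner that loads them and prints a message; if anything is missing or miscompiled, Perl names the module that failed to load:

perl -MLWP::UserAgent -MCrypt::SSLeay -MHTTP::Cookies -MTime::HiRes -e 'print "all modules found\n"'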
