Web Performance Tuning, 2nd Edition by Patrick Killelea

Monitoring Machine Utilization with rstat

rstat is an RPC client program I wrote to get and print statistics from any machine running the rpc.rstatd daemon, its server-side counterpart. The rpc.rstatd daemon has been used for many years by tools such as Sun’s perfmeter and the rup command. The rstat program is simply a new client for an old daemon. The fact that the rpc.rstatd daemon is already installed and running on most Solaris and Linux machines is a huge advantage over other tools that require the installation of custom agents.

My rstat client compiles and runs on Solaris and Linux, and it can get statistics from any machine running a current rpc.rstatd daemon, including machines running Solaris, Linux, AIX, and OpenBSD. On Solaris, the rpc.rstatd daemon is started from /etc/inetd.conf. I will probably also port the rstat client to other platforms. rstat is similar to vmstat, but has some advantages over it:

  • You can get statistics without logging in to the remote machine, including over the Internet.

  • It includes a timestamp.

  • The output can be plotted directly by gnuplot.

The fact that it runs remotely means that you can use a single central machine to monitor the performance of many remote machines. rstat also has a disadvantage: it does not report the scan rate, the sr column in vmstat, which is a useful measure of memory shortage. In addition, rstat will not work across most firewalls because it relies on port 111, the RPC portmapper port, which firewalls usually block.

You can download rstat from http://patrick.net/software/rstat/rstat.html. As mentioned earlier, Sun’s perfmeter program is also an rpc.rstatd client and can also log remote server statistics to a file. However, I haven’t managed to run perfmeter without its GUI, though it could perhaps be done using Xvfb, the X virtual frame buffer.

To use rstat, simply give it the name or IP address of the machine you wish to monitor. Remember that rpc.rstatd must be running on that machine. The rup command is extremely useful here because with no arguments, it simply prints out a list of all machines on the local network that are running the rstatd daemon. If a machine is not listed, you may have to start rstatd manually. To start rpc.rstatd under Red Hat Linux, run /etc/rc.d/init.d/rstatd start as root. On Solaris, first try running the rstat client because inetd is often already configured to automatically start rpc.rstatd on request. If the client fails with the error “RPC: Program not registered,” make sure you have the following line in your /etc/inet/inetd.conf, then kill -HUP your inetd process to get it to re-read inetd.conf:

rstatd/2-4 tli rpc/datagram_v wait root /usr/lib/netsvc/rstat/rpc.rstatd rpc.rstatd

Then you can monitor that machine like this:

% rstat enkidu 
2001 07 10 10 36 08  0   0   0 100    0    27   54     1     0    0   12  0.1

This command will give you a one-second average and then it will exit. If you want to continuously monitor, give an interval in seconds on the command line. Here’s an example of one line of output every two seconds:

% rstat enkidu 2 
2001 07 10 10 36 28  0   0   1  98    0     0    7     2     0    0   61  0.0 
2001 07 10 10 36 30  0   0   0 100    0     0    0     2     0    0   15  0.0 
2001 07 10 10 36 32  0   0   0 100    0     0    0     2     0    0   15  0.0 
2001 07 10 10 36 34  0   0   0 100    0     5   10     2     0    0   19  0.0 
2001 07 10 10 36 36  0   0   0 100    0     0   46     2     0    0  108  0.0 
^C

To get a usage message, the output format, the version number, and where to go for updates, just type rstat with no parameters:

% rstat
usage: rstat machine [interval]
output:
yyyy mm dd hh mm ss usr wio sys idl pgin pgout intr ipkts opkts coll  cs load
docs and src at http://patrick.net/software/rstat/rstat.html

Notice that the column headings line up with the output data.
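Because the headings and the data line up, each output line can be split into named fields. The following is only a minimal sketch: parse_rstat_line and the names used for the six date and time fields are my own inventions for illustration, not part of rstat.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column names in the order shown by the rstat usage message. The date and
# time fields are renamed here (year, mon, day, hour, min, sec) because the
# usage message uses "mm" for both month and minute.
my @FIELDS = qw(year mon day hour min sec usr wio sys idl
                pgin pgout intr ipkts opkts coll cs load);

# parse_rstat_line is a hypothetical helper: it returns a hash reference
# keyed by field name, or undef if the line does not have 18 fields.
sub parse_rstat_line {
    my ($line) = @_;
    my @values = split ' ', $line;
    return undef unless @values == @FIELDS;
    my %sample;
    @sample{@FIELDS} = @values;
    return \%sample;
}

my $sample = parse_rstat_line(
    "2001 07 10 10 36 08  0   0   0 100    0    27   54     1     0    0   12  0.1");
print "idle=$sample->{idl}% cs=$sample->{cs} load=$sample->{load}\n";
```

Run against the sample line from the enkidu example, this prints idle=100% cs=12 load=0.1.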

The output may look meaningless to the uninitiated, but it is quite useful, and the format was chosen to be easily plotted by the gnuplot graphing program. You may download gnuplot from http://www.gnuplot.org/ or http://patrick.net/software/. You can tell gnuplot to plot any of the fields. To create a graph of your rstat data, redirect or save the rstat output in a file, which I’ve named rstat.out here. Then create the following gnuplot configuration file, which I’ve named enkidu.gp. Finally, run gnuplot enkidu.gp, and gnuplot will create a PNG file called enkidu.png that is suitable for display on a web site:

set term png color 
set output "enkidu.png" 
set xdata time 
set timefmt "%Y %m %d %H %M %S" 
set bmargin 3 
set y2label "load" 
set ylabel "context switching" 
set ytics nomirror 
set y2tics nomirror 
plot "rstat.out" using 1:17 axes x1y1 title "context switching", \
     "rstat.out" using 1:18 axes x1y2 title "load"

Figure 4-2 shows an example graph depicting context switching and load (the 17th and 18th fields) that I created with rstat and gnuplot.

Figure 4-2. Graph of rstat data
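Since the six timestamp fields always come first, the gnuplot column number for any statistic can be worked out from its name. Here is a rough sketch that generates a one-field gnuplot configuration. The helper names gnuplot_column and gnuplot_script and the output filename cs.png are made up for this example, and the generated file is a trimmed-down variant of enkidu.gp rather than a replacement for it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Statistic names in the order shown by the rstat usage message.
my @STATS = qw(usr wio sys idl pgin pgout intr ipkts opkts coll cs load);

# gnuplot_column is a hypothetical helper. gnuplot numbers columns from 1,
# and the six timestamp fields come first, so usr is column 7.
sub gnuplot_column {
    my ($name) = @_;
    for my $i (0 .. $#STATS) {
        return $i + 7 if $STATS[$i] eq $name;
    }
    die "unknown rstat field: $name\n";
}

# Build a gnuplot script that plots one field from a saved rstat file.
sub gnuplot_script {
    my ($datafile, $field, $pngfile) = @_;
    my $col = gnuplot_column($field);
    return <<"EOT";
set term png
set output "$pngfile"
set xdata time
set timefmt "%Y %m %d %H %M %S"
set ylabel "$field"
plot "$datafile" using 1:$col title "$field" with lines
EOT
}

print gnuplot_script("rstat.out", "cs", "cs.png");
```

Redirect the output to a file and pass that file to gnuplot, just as with enkidu.gp.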

Storing rstat Data in a Relational Database

As with latency data, it is good to store rstat data in a database for later retrieval and correlation with problems. Here’s a SQL command that can be used to create a table for rstat data in Oracle:

create table rstat (
    machine     varchar2(20),
    timestamp   date not null,
    usr         number(3),
    wio         number(3),
    sys         number(3),
    idl         number(3),
    pgin        number(6),
    pgout       number(6),
    intr        number(6),
    ipkts       number(6),
    opkts       number(6),
    coll        number(6),
    cs          number(8),
    load        number(3,1)
    )
/

And here’s some example Perl code to run rstat, parse out the fields, and perform the database insertion:

#!/usr/local/bin/perl
use strict;
use warnings;
use DBI;

my $machine  = "vatche";
my $interval = 60;
my $dbh      = DBI->connect("dbi:Oracle:perf", "patrick", "passwd")
                   or die "Can't connect to Oracle: $DBI::errstr\n";

# Prepare the insert once; the placeholders are bound on every execute.
my $sth = $dbh->prepare("insert into rstat values
              (?, to_date(?, 'YYYY MM DD HH24 MI SS'),
               ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)");

open(RSTAT, "rstat $machine $interval |") || die "could not start rstat: $!";

while (<RSTAT>) {
    # split(' ') skips leading whitespace and the trailing newline.
    my ($yyyy, $mon, $dd, $hh, $mm, $ss, $usr, $wio, $sys, $idl, $pgin, $pgout,
        $intr, $ipkts, $opkts, $coll, $cs, $load) = split(' ');
    next unless defined $load;    # skip short or blank lines
    $sth->execute($machine, "$yyyy $mon $dd $hh $mm $ss",
                  $usr, $wio, $sys, $idl, $pgin, $pgout,
                  $intr, $ipkts, $opkts, $coll, $cs, $load)
        or warn "insert failed: $DBI::errstr\n";
}

# If rstat dies, at least we should try to disconnect nicely.
$dbh->disconnect or warn "Disconnect failed: $DBI::errstr\n";

Using rstat Data

Now that you have system data in your database, how do you use it? The answer is any way that you use any other kind of relational data. Let’s say you want to get the average of the system and user CPU usage for October 8, 2001, between the hours of 9 a.m. and 4 p.m. Here is a query that does this:

select avg(sys + usr) from rstat where timestamp between
to_date('2001 10 08 09', 'YYYY MM DD HH24') and
to_date('2001 10 08 16', 'YYYY MM DD HH24') and
machine='mars';
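As a cross-check on what the query computes, the same average can be worked out from saved rstat output without a database. This is only a sketch under assumptions: avg_cpu is a hypothetical helper, and the three sample lines are invented for illustration, not real measurements.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# avg_cpu is a hypothetical helper: it averages usr + sys over the samples
# whose timestamp falls between $from and $to inclusive. Timestamps are
# compared as zero-padded "YYYY MM DD HH MM SS" strings, which sort in
# chronological order.
sub avg_cpu {
    my ($from, $to, @lines) = @_;
    my ($sum, $count) = (0, 0);
    for my $line (@lines) {
        my @f = split ' ', $line;
        next unless @f == 18;
        my $stamp = join ' ', @f[0 .. 5];
        next if $stamp lt $from or $stamp gt $to;
        $sum += $f[6] + $f[8];    # usr is field 7, sys is field 9
        $count++;
    }
    return $count ? $sum / $count : undef;
}

# Made-up sample lines in rstat's output format.
my @lines = (
    "2001 10 08 08 59 00  9   0   9  82    0     0    7     2     0    0   61  0.0",
    "2001 10 08 09 30 00 10   0   4  86    0     0    7     2     0    0   61  0.0",
    "2001 10 08 15 30 00 20   0   6  74    0     0    7     2     0    0   61  0.0",
);
print avg_cpu("2001 10 08 09 00 00", "2001 10 08 16 00 00", @lines), "\n";
```

The first sample falls before 9 a.m. and is skipped, so the script prints 20, the average of 10 + 4 and 20 + 6.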

Getting Data from the Database to Standard Output

Often it is useful to be able to grep, sort, or otherwise process database data on the Unix command line, but most SQL querying tools are “captive user interfaces” that do not play nicely with Unix standard in and standard out. Here is a simple Perl script that will let you do that if you have the DBI module installed and are using Oracle. It’s called sql.pl and is available at http://patrick.net/software/.

#!/usr/local/bin/perl

use DBI;

$ENV{ORACLE_HOME} = "/path/to/ORACLE/product";

$dbh = DBI->connect("dbi:Oracle:myinstance", "mylogin", "mypassword")
or die "Can't connect to Oracle: $DBI::errstr\n";

$sql = $ARGV[0];
$sth = $dbh->prepare($sql);
$sth->execute(  );

while(@row = $sth->fetchrow_array) {
   print "@row\n";
}

$sth->finish(  );
$dbh->disconnect or warn "Disconnect failed: $DBI::errstr\n";

Generating Graphs Directly from an rstat Database

Getting a single number is interesting, but it is more interesting to see how data varies over time. The following is the Perl source for a CGI script that will allow anyone to view graphs of your collected rstat data. It graphs a single parameter over time. You will need to alter it to reflect your ORACLE_HOME, your Oracle user and password, and your machine names, but aside from that, it should be ready to run. You can download this script from http://patrick.net/software/graph.cgi.

#!/usr/local/bin/perl
#Author: Patrick Killelea
#Date:  12 April 2001

# You will need to replace "myinstance", "mylogin", and "mypassword" below with
# values for your own environment.

use DBI;

$ENV{ORACLE_HOME} = "/opt/ORACLE/product";

print qq|Content-type: text/html\n\n|;

print qq|<HTML><HEAD><TITLE>generate a graph</TITLE>
<meta http-equiv = "Pragma" Content = "no-cache">
<meta http-equiv = "Expires" Content = "Thu, Jan 1 1970 12:00:00 GMT">
</HEAD><BODY><H1>generate a graph</H1>|;

if ($ENV{'REQUEST_METHOD'} eq 'POST') {
   read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
   @pairs = split(/&/, $buffer);
   foreach $pair (@pairs) {
       ($name, $value) = split(/=/, $pair);

       $value  =~ tr/+/ /;
       $value  =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
       $contents{$name} = $value;
   }
}

$machine  = $contents{"machine"};
$parameter = $contents{"parameter"};
$daterange = $contents{"daterange"};

if ($machine && $parameter && $daterange) {

   `/bin/rm tmp/*.gif`;

   $dbh = DBI->connect("dbi:Oracle:myinstance", "mylogin", "mypassword")
        or die "Can't connect to Oracle: $DBI::errstr\n";

   $sql  = "select to_char(timestamp, 'YYYY MM DD HH24 MI'), $parameter from rstat ";

   if ($daterange eq "today") {
       $sql .= "where timestamp between trunc(sysdate) and sysdate " .
               "and machine='$machine' ";
   }

   if ($daterange eq "yesterday") {
       $sql .= "where timestamp between trunc(sysdate) - 1 and trunc(sysdate) " .
               "and machine='$machine'";
   }

   if ($daterange eq "t-7") {
       $sql .= "where timestamp between trunc(sysdate) - 7 and sysdate " .
               "and machine='$machine'";
   }

   if ($daterange eq "t-30") {
       $sql .= "where timestamp between trunc(sysdate) - 30 and sysdate " .
               "and machine='$machine'";
   }

   if ($daterange eq "t-365") {
       $sql .= "where timestamp between trunc(sysdate) - 365 and sysdate " .
               "and machine='$machine'";
   }

   $sth = $dbh->prepare($sql);
   $sth->execute(  ) || print $dbh->errstr;

   # Get one sample row to be sure we have data.
   ($timestamp, $item) = $sth->fetchrow_array;

   if ($timestamp) {

       $date = `date`;
       chop $date;

       open(GP, "|/usr/local/bin/gnuplot") || die "could not start gnuplot: $!";
       print GP qq|
           set xdata time
           set timefmt "%Y %m %d %H %M"
           set term gif
           set xlabel "graph made on $date"
           set bmargin 4
           set ylabel "$parameter"
           set output "tmp/$$.gif"
           plot '-' using 1:6 title "$parameter on $machine" with lines lt 2
       |;

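       # The sample row fetched above is the first data point, so plot it too.
       print GP "$timestamp $item\n";

       while(($timestamp, $item) = $sth->fetchrow_array) {
           print GP "$timestamp $item\n";
       }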

       print GP "e\n";
       close(GP);

       $sth->finish(  );
       $dbh->disconnect or warn "Disconnect failed: $DBI::errstr\n";

       print "<p><img src=\"tmp/$$.gif\"><p>";
       print "graph was generated from this query:<p> $sql\n";
   }
   else {
       print "Sorry, I do not have the requested data for that time range for $machine.";
   }
}

print qq|
<FORM
METHOD="POST" ENCTYPE="application/x-www-form-urlencoded">
<P>select a machine
<SELECT NAME="machine">
<OPTION SELECTED VALUE="venus">venus database
<OPTION VALUE="mars">mars backup database
<OPTION VALUE="pluto">pluto middleware
<OPTION VALUE="saturn">saturn nfs
<OPTION VALUE="earth">earth middleware
</SELECT>
</P>
<P>select a parameter
<SELECT NAME="parameter">
<OPTION SELECTED VALUE="usr">user cpu
<OPTION VALUE="wio">wait io cpu
<OPTION VALUE="sys">system cpu
<OPTION VALUE="idl">idle cpu
<OPTION VALUE="pgin">pgs in per second
<OPTION VALUE="pgout">pgs out per second
<OPTION VALUE="intr">interrupts per second
<OPTION VALUE="ipkts">network in pkts per second
<OPTION VALUE="opkts">network out pkts per second
<OPTION VALUE="coll">collisions per second
<OPTION VALUE="cs">context switches per second
<OPTION VALUE="load">load: procs waiting to run
</SELECT>
</P>
<P>select a date range
<SELECT NAME="daterange">
<OPTION SELECTED VALUE="today">today
<OPTION VALUE="yesterday">yesterday
<OPTION VALUE="t-7">last 7 days
<OPTION VALUE="t-30">last 30 days
<OPTION VALUE="t-365">last 365 days
</SELECT>
</P>
<P>
<INPUT TYPE="submit" NAME="graph" VALUE="graph">
</P><HR><P>|;

print qq|</FORM>Questions? Write
<A HREF="mailto:p\@patrick.net">p\@patrick.net</A>
</BODY></HTML>|;
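Note that the script pastes the machine and parameter form values straight into the SQL text, so a hand-crafted POST could submit values the form never offers. One way to guard against that, sketched here as an assumption rather than as part of the downloadable graph.cgi, is to check each value against the OPTION values listed in the form before building the query. The helper name valid_request is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Allow-lists taken from the OPTION values in the form above.
my %OK = (
    machine   => { map { $_ => 1 } qw(venus mars pluto saturn earth) },
    parameter => { map { $_ => 1 } qw(usr wio sys idl pgin pgout
                                      intr ipkts opkts coll cs load) },
    daterange => { map { $_ => 1 } qw(today yesterday t-7 t-30 t-365) },
);

# valid_request is a hypothetical helper: true only if every form value
# is one of the values the form itself offers.
sub valid_request {
    my (%contents) = @_;
    for my $name (keys %OK) {
        my $value = $contents{$name};
        return 0 unless defined $value && $OK{$name}{$value};
    }
    return 1;
}

for my $param ("cs", "cs from dual --") {
    my $ok = valid_request(machine   => "mars",
                           parameter => $param,
                           daterange => "today");
    print "$param: ", ($ok ? "accepted" : "rejected"), "\n";
}
```

In graph.cgi, a check like this would sit just before the test on $machine, $parameter, and $daterange, with the machine list edited to match your own form.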
