Apache logs: simple log analyzer in Perl

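To have Apache log the bandwidth used by each virtual host, put the following in httpd.conf (or apache2.conf). The %I and %O fields (bytes received and bytes sent) are only available when mod_logio is enabled: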

LogFormat "%V %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %I %O" combined_io
CustomLog /var/log/httpd/access_log.users combined_io
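
Before running anything against the live log, it can help to check that a single line in this format splits into the fields you expect. The following is only a minimal sketch: the parse_line helper and the sample log line are made up for illustration, and the pattern assumes the exact combined_io format defined above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one combined_io line into a hash of named fields.
# Returns nothing (undef in scalar context) when the line does not match.
# parse_line is a hypothetical helper, not part of any module.
sub parse_line {
    my ($line) = @_;
    my ($vhost, $ip, $time, $request, $status, $bytes, $referer, $agent, $in, $out) =
        $line =~ /^(\S+) (\S+) \S+ \S+ \[([^\]]+)\] "(.*?)" (\d{3}) (\S+) "(.*?)" "(.*?)" (\d+) (\d+)$/
        or return;
    return {
        vhost   => $vhost,   ip     => $ip,     time  => $time,
        request => $request, status => $status, bytes => $bytes,
        referer => $referer, agent  => $agent,
        in      => $in,      # %I: bytes received, including headers
        out     => $out,     # %O: bytes sent, including headers
    };
}

# Only runs when the file is executed directly.
unless (caller) {
    # Invented sample line, for illustration only.
    my $sample = 'www.example.com 192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] '
               . '"GET /index.html?x=1 HTTP/1.1" 200 2326 "-" "Mozilla/5.0" 512 2750';
    my $f = parse_line($sample) or die "sample line did not match\n";
    print "$f->{vhost}: in=$f->{in} out=$f->{out}\n";
}
```

If a line from your own log does not match, the LogFormat directive in use probably differs from the one shown above.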

Then run the following Perl script:


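#!/usr/bin/perl
use strict;
use warnings;

my $logfile  = '/var/log/httpd/access_log.users';
my $report   = '/tmp/traffic.log';
my $interval = 5 * 60;    # rewrite the report every 5 minutes

# get a handle to the logfile by following it through tail
open(my $log, '-|', 'tail', '-f', $logfile)
    or die "Cannot run tail on $logfile: $!";

# start the main loop
my %users;                      # bytes transferred per virtual host
my $cp = time() + $interval;    # time of the next checkpoint

while (my $line = <$log>) {
    chomp $line;

    # vhost, client IP, timestamp, request, status, referer, user agent,
    # bytes in (%I), bytes out (%O); skip lines in any other format
    my ($dom, $ip, $date, $req, $code, $ref, $client, $in, $out) =
        $line =~ /^(\S+) (\S+) .*?\[(.*?)\] "(.*?)" (\d+) \S+ "(.*?)" "(.*?)" (\d+) (\d+)$/
        or next;

    # split the request and separate the query string from the path
    # (not needed for the totals, but handy for per-URI statistics)
    my ($method, $uri, $proto) = split ' ', $req, 3;
    my $get = '';
    if (defined $uri && $uri =~ /^(.*?)\?(.*)$/) {
        ($uri, $get) = ($1, $2);
    }

    # account request + response bytes to the virtual host
    $users{$dom} += $in + $out;

    # checkpoint: rewrite the report once the interval has elapsed
    if (time() > $cp) {
        $cp = time() + $interval;
        if (open(my $fh, '>', $report)) {
            # largest consumer first
            for my $url (sort { $users{$b} <=> $users{$a} } keys %users) {
                printf {$fh} "%.2f kb - %s\n", $users{$url} / 1024, $url;
            }
            close $fh;
        } else {
            warn "Cannot write $report: $!";
        }
    }
}

# never get here:
close $log;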

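Viewing /tmp/traffic.log then shows which virtual hosts are using the most traffic, largest first. Note that the file is only rewritten when a new request is logged after the 5-minute interval has passed.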
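
The same totals can also be computed in one pass over an existing or rotated log file instead of following it with tail. This is a rough sketch under the assumption that every line uses the combined_io format above, so the first field is the virtual host and the last two are %I and %O. The tally and report helpers and the file name tally.pl are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum %I + %O per virtual host over a list of log lines.
# tally and report are hypothetical helper names.
sub tally {
    my (@lines) = @_;
    my %bytes;
    for my $line (@lines) {
        # First field is the vhost (%V); the last two are %I and %O.
        my ($vhost, $in, $out) = $line =~ /^(\S+) .* (\d+) (\d+)\s*$/
            or next;    # ignore lines in any other format
        $bytes{$vhost} += $in + $out;
    }
    return \%bytes;
}

# Render the totals as "<KB> kb - <vhost>" lines, largest first.
sub report {
    my ($bytes) = @_;
    return map  { sprintf("%.1f kb - %s\n", $bytes->{$_} / 1024, $_) }
           sort { $bytes->{$b} <=> $bytes->{$a} || $a cmp $b }
           keys %$bytes;
}

# Only runs when executed directly with at least one file argument, e.g.
#   perl tally.pl /var/log/httpd/access_log.users
if (!caller && @ARGV) {
    print report(tally(<>));
}
```

Reading the whole file into a list keeps the sketch short; for very large logs a line-by-line loop would use less memory.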
