Apache Blockbox Stats Generator
Contents
Overview
Apache has its native mod_status status page that many people use to pull stats into tools such as Cacti and other RRDTool based stats packages. This works well but does not always provide enough details, questions such as these remain unanswered:
- How many of my requests are GET and how many are POST?
How many 404 errors and 5xx errors do I get on my site as a whole and for script.php specifically?
What is the average response time for the whole server, and for script.php?
How many Closed, Keep Alive and Aborted connections do I have?
To answer this I wrote a script that keeps a running track of your Apache process, it has many fine grained controls that let you fine tune exactly what to keep stats on. I got the initial idea from an old ONLamp article titled Profiling LAMP Applications with Apache's Blackbox Logs.
The article proposes a custom log format that provides the equivelant to an airplanes blackbox, a flight recorder that records more detail per request than the usual common log formats do. I suggest you read the article for background information. The article though stops short of a full data parser so I wrote one for a client who kindly agreed that I can opensource it.
Using this and some glue in my Cacti I now have graphs showing a profile of the requests I receive for the whole site, but as you'll see you can apply fine grained controls to select what exactly you'll see:
The script creates on a regular interval a file that contains the performance data, the data is presented in variable=value data pairs, I will soon provide a Cacti and Nagios plugin to parse this output to ease integration into these tools.
The performance data includes values such as amount of requests, total size of requests separated by in and out bytes, average response time, total response time. Counts of connections in Close, Keep Alive and Aborted states. Counts for each valid HTTP Status code, and aggregates for 1xx, 2xx, 3xx, 4xx and 5xx. The amount of GET and POST requests. And detail for each and every unique request the server serves. See the Sample Stats for a good example, variables are pretty self explanatory. To keep the data set small and manageable 2 selectors exist, one to choose which requests to keep details for and which to keep stats for. These can be combined with standard Apache directives such as Location to provide very fine grained stats for all or a subset of your site.
The script has been tested on some pretty busy sites, using the Pipe method it happily processes 100 requests / second while showing 0% CPU usage in top.
The script can be downloaded here. Please leave any comments and suggestions on my blog
Requirements
For all execution methods you need:
- Apache or a webserver that can output the correct log format, tested with Apache 2.2.3
- A recent Perl 5, tested on Perl 5.8.8
If you want to parse a log that exist on your file system you need:
If you want to see the included help page by passing --help to the script you will need Pod::Usage, alternatively just run perldoc apacheblackbox.pl or view the perldoc output here
Command Line Arguments
--output |
The output file to create |
--detail |
Enable the storing of full details for each unique request |
--skip |
Regular expression for requests that will not count to any stats |
--show |
Regular expression for requests that will have full detail |
--fileonly |
Strip parameters from requests for storing detail |
--logfile |
Use a logfile instead of STDIN |
--help |
The full perldoc output |
--output
The file to write the stats too, ownership of this file will be the UID of the main Apache process
--detail
When enabled a detailed block of stats will be kept for each url accessed, 2 other options modify the behaviour of this see --show and --fileonly
--skip
This takes a regular expression of pages that stats should not be kept for. You can use this to keep stats for everything except for example images by using something like '\.jpg|\.gif$'. Requests matching this regular expression will not count towards any stats, not detailed stats and not the totals for the server.
--show
If --detail is enabled all unique requests will get a block of stats individual to that request, this will often result in a large list of unexpected statistics being kept for no good reason, you can restrict it to only keeping statistics for files ending in .php by using a regular expression like '\.php$'. With this in place you will only get performance stats for your PHP scripts.
--fileonly
If --detail is enabled statistics for each unique request will be kept, determining uniqueness will include the parameters passed to the script as part of the URL, often this is not desirable using this option the parameters gets discarded and all requests - regardless of parameters - will be considered as one requests and thus count only to one detailed stats block.
--logfile
By default the intention is to use CustomLog in Apache and pipe the log lines to this script, this is the most robust option as Apache will start the parser should it exit etc, in some cases it might be desirable to just parse a log file instead. Supplying the B<--logfile> option with a path to a file will enable this.
--help
Full perldoc output
Installation
Installing the script can be complex especially if you run it as a pipe since Apache will not always give you details of what goes wrong, it will simply try and restart the script. While debugging check that the PID of the script as started by Apache does not change rapidly, if it does, check the Apache error_log for some helpful information.
To get information about bytes in and out you need to have mod_logio.c loaded into your Apache, this can be done by including the following into your main Apache config.
LoadModule logio_module modules/mod_logio.so
With this enabled you can now configure a custom log format to write compatible log entries to this script, you can do this inside a VirtualHost, various other places where CustomLog is valid in an apache config file:
<IfModule mod_logio.c>
CustomLog "| /path/to/apacheblackbox.pl --output /var/www/blackbox.txt --detail" "%a %X %t \"%r\" %s/%>s %{pid}P/%{tid}P %T/%D %I/%O/%B"
</IfModule>
As you can see any of the above options simply get passed to the script from the Apache CustomLog lines, you can further select only certain requests to be passed to this log using the normal apache methods such as B<Location> blocks etc:
<Location /cgi-bin>
SetEnv blackboxlog 1
</Location>
<IfModule mod_logio.c>
CustomLog "| /path/to/apacheblackbox.pl --output /var/www/blackbox.txt --detail --show \\'script.pl\\'" "%a %X %t \"%r\" %s/%>s %{pid}P/%{tid}P %T/%D %I/%O/%B" env=blackboxlog
</IfModule>
Combining the Apache control blocks with the options to this script such as --skip and --show gives you very fine grained control over what stats to record and can be used using the same basic script on many virtual hosts and sets of files on the same machine concurrently.
This example also shows the correct escaping for the regular expression options which requires quotes around them, especially when you use | characters in your regular expressions.
If you need to consider logs from several virtual hosts as one or simply wish to also store the log on your disk you can instruct the script to read the logfile from disk using the --logfile option. You would then make a CustomLog that save to a file and just run in the parser, when called with the --logfile option it will daemonize to the background. While we've not had stability problems with the script you may want to monitor it is running either through process list, via the parser.generatedtime variable or the age of the output file.
The first thing the script does when it starts up is print out to the output file a bit of debugging information, it should look something like this:
parser.uptime=1320 parser.generatedtime=1204724920 parser.settings.skip= parser.settings.show=robots.txt|test-rss2.0.xml|atom.xml|rss2.0.xml|favicon.ico parser.settings.fileonly=1 parser.settings.output=/www/www.site.com/html/blackbox.txt parser.settings.fequency=60 parser.settings.detail=1
This is useful for verifying especially that your regex passed to --show and --skip has made it through the various layers correctly, the escaping can be weird some times, see the example above for a working one.
References
Changes
2008-03-05 |
Initial Release |
