Apache2 and Visitor

From SaruWiki
Revision as of 09:27, 29 December 2009 by Saruman! (talk | contribs) (added location note)

Visitors is an Apache2 log analyzer that shows who visited your website. I found a simple cron-based setup here. The setup for our Debian system goes like this:

First install ip2host and visitors using the familiar command:

sudo apt-get install ip2host visitors

It will install the two requested packages, and probably also some extra packages like graphviz and ttf-liberation.

Next we create a cache directory for ip2host

sudo mkdir /var/cache/ip2host

Then we create the following script, which we save as /etc/cron.daily/visitors (the file must be executable, e.g. chmod 755, or cron will skip it): <source lang="bash">

#!/bin/bash

MYIP="192.168" # i want to exclude my home network from the logs
SERVERIP="222.222.222.222" # my server's ip
REPORTDIR="/var/www/webstats" # folder where to store reports, this folder must exist
ALOGDIR="/var/log/apache2" # folder containing the logs
VISITORS="/usr/bin/visitors -A --exclude wp-cron.php --exclude robots.txt" # i exclude some files from the reports
IP2H="ip2host --cache=/var/cache/ip2host/cache.db"
GREPOPTIONS="-hv -e ^$MYIP -e ^$SERVERIP" # exclude my home network and my server's ip from the logs

# we create a tmp file that will hold the logs
TMPFILE=$(mktemp)
if [ ! -f "$TMPFILE" ]; then
    echo "tmpfile doesn't exist."
    exit 1
fi

# if you only have one site, or you want all the logs in a single report
# /bin/grep $GREPOPTIONS $ALOGDIR/access*.log{.1,} 2>/dev/null > $TMPFILE # get all the logs into the tmpfile, notice the GREPOPTIONS variable.
# resolve all the ips and generate the reports, note that "--trails --prefix http://www.domain.com" is optional it's only needed for generating trails stats
# ($IP2H < $TMPFILE ) | $VISITORS --trails --prefix http://www.domain.com - > $REPORTDIR/stats.html
# -OR-
# if you have multiple vhosts/prefixes and want separate reports, you can use this:
# replace all the "saruman.biz mediawikifarm.nl" by your own prefixes (as they appear in the apache.conf for each vhost)

for name in saruman.biz mediawikifarm.nl; do
    /bin/grep $GREPOPTIONS $ALOGDIR/$name-access.log{.1,} 2>/dev/null > $TMPFILE
    ($IP2H < $TMPFILE ) | $VISITORS --trails --prefix http://$name - > $REPORTDIR/stats-$name.html
done

rm -f "$TMPFILE"
</source>
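To see what the GREPOPTIONS filter in the script does before trusting it with real logs, it can be tried on a few lines by hand. The following is only a sketch: the log lines are made up, and filter_logs is a hypothetical helper name that does not appear in the script above. Note that the dots in the patterns are unescaped, so they match any character, which is harmless for IP prefixes like these.

```bash
#!/bin/bash
# Same values and option string as in the cron script (placeholders from the article).
MYIP="192.168"
SERVERIP="222.222.222.222"
GREPOPTIONS="-hv -e ^$MYIP -e ^$SERVERIP"

# Hypothetical helper: reads log lines on stdin and drops every line
# that starts with one of the excluded prefixes (-v inverts the match).
filter_logs() {
    # $GREPOPTIONS is left unquoted on purpose so it splits into separate options
    grep $GREPOPTIONS
}

# Made-up sample lines, roughly in Apache access log format
printf '%s\n' \
    '192.168.1.10 - - [29/Dec/2009:09:00:00 +0100] "GET / HTTP/1.1" 200 512' \
    '222.222.222.222 - - [29/Dec/2009:09:00:01 +0100] "GET /wp-cron.php HTTP/1.1" 200 0' \
    '203.0.113.7 - - [29/Dec/2009:09:00:02 +0100] "GET /index.php HTTP/1.1" 200 2048' \
    | filter_logs
```

Only the line starting with 203.0.113.7 should come out; the home network and server lines are dropped.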

Note that the resulting files are stored in /var/www/webstats and are world-readable. You can therefore create a symlink somewhere in the directory tree of your website that points to these statistics (provided your Apache configuration follows symlinks). You may want to protect your statistics in some way, for example with .htaccess.
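As a rough sketch of the symlink and .htaccess ideas: the two helper functions below, the document root /var/www/example-site and the password file /etc/apache2/webstats.htpasswd are all assumptions for illustration, not part of the original setup. The .htaccess only takes effect if the Apache configuration allows it (AllowOverride AuthConfig), and the password file would have to be created separately, for instance with htpasswd.

```bash
#!/bin/bash
# Sketch only: function names and example paths are hypothetical.

# Link the report directory into a site's document root as "webstats".
link_stats() {
    local reportdir="$1" docroot="$2"
    ln -sfn "$reportdir" "$docroot/webstats"
}

# Write a .htaccess into the report directory that asks for a login
# (HTTP basic authentication) before the reports are served.
protect_stats() {
    local reportdir="$1" htpasswd="$2"
    cat > "$reportdir/.htaccess" <<EOF
AuthType Basic
AuthName "Web statistics"
AuthUserFile $htpasswd
Require valid-user
EOF
}

# Example calls (run as root, adjust the assumed paths first):
#   link_stats /var/www/webstats /var/www/example-site
#   protect_stats /var/www/webstats /etc/apache2/webstats.htpasswd
```

With these assumed paths the reports would be reachable under /webstats/ on that site, for example /webstats/stats-saruman.biz.html.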