John Bokma's Hacking & Hiking

Generating an HTML report with GoAccess

August 28, 2016

This month I started blogging again after a hiatus of over four years. Curious whether my site was already getting more traffic, and to which pages, I wanted to generate a report from my site's access log.

I already had some experience with AWStats, a free log file analyzer, but was looking for something simpler. After some searching with Google I came upon GoAccess - Visual Web Log Analyzer, which I installed on my virtual Ubuntu 15.10 machine as follows:

sudo apt-get install goaccess

I had already skimmed the "HTML Reports" section of "How To Install and Use GoAccess Web Log Analyzer with Apache on Debian 7" on DigitalOcean, so I thought the following would generate the desired HTML report:

goaccess -f johnbokma-com-access.log -a > report.html

But this reported the following fatal error:

GoAccess - version 0.8.3 - Oct 24 2014 17:27:51

Fatal error has occurred
Error occured at: parser.c - parse_log - 1053
No date format was found on your conf file.

Getting past this took some struggling. First I used find to look for a default configuration file, without luck. Next I downloaded the latest version of the GoAccess source and copied its configuration file, which failed because the newer version supports more options than the installed 0.8.3. In the end I created a goaccess.conf file with the following options:

date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"

Running the following command resulted in the desired HTML report:

goaccess -p goaccess.conf -f johnbokma-com-access.log -a > report.html
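As a rough sketch, the configuration step above could be scripted so it is easy to repeat. The helper name write_goaccess_conf is made up for illustration and is not part of GoAccess; the two options are exactly the ones listed above, and the comments give the meaning of each format specifier as an aid to reading the log-format line.

```shell
#!/bin/sh
# Sketch: write a minimal GoAccess configuration for a combined-format
# access log. write_goaccess_conf is a hypothetical helper name.
#
# Specifiers used in log-format:
#   %h  client host or IP       %^  field to skip
#   %d  date (see date-format)  %t  time
#   %r  request line            %s  status code
#   %b  response size in bytes  %R  referrer
#   %u  user agent
write_goaccess_conf() {
    # The quoted 'EOF' keeps the shell from touching the format strings.
    cat > "$1" <<'EOF'
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF
}

write_goaccess_conf goaccess.conf
```

With goaccess.conf written this way, the goaccess command shown above can be run unchanged.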

Upon viewing the HTML report I noticed two issues. First, two IP addresses had been hitting my site very often; after some research I decided to use iptables to block the whole range both belong to.
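The busiest clients can also be read straight from the log, which is a handy cross-check on the report. The following is a minimal sketch, assuming each log line starts with the client address as in the log-format above; top_ips is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the N most frequent client addresses in an access log,
# busiest first. top_ips is a hypothetical helper name.
# Assumes the client address is the first whitespace-separated field.
top_ips() {
    # $1 = log file, $2 = how many addresses to show (default 10)
    awk '{ print $1 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

For example, `top_ips johnbokma-com-access.log 5` prints five lines, each a hit count followed by an address. Blocking a range could then look like `sudo iptables -A INPUT -s 203.0.113.0/24 -j DROP`, where 203.0.113.0/24 is only a documentation placeholder and not the range from this post.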

The second issue was that the program I wrote yesterday to generate the RSS feed for this site used incorrect links; requests for those links showed up in the "HTTP 404 Not Found URLs" pane of the HTML report. A small modification to the Perl program I use to generate this blog fixed the broken links in the RSS feed.
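The same 404 information can be pulled from the log without GoAccess, for example to confirm that a fix has taken effect. This is a minimal sketch, assuming well-formed combined-format lines in which the request path is the 7th and the status code the 9th whitespace-separated field; not_found_urls is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list requested paths that returned 404, most frequent first.
# not_found_urls is a hypothetical helper name.
# Assumes field 7 is the request path and field 9 the status code,
# which holds for well-formed combined-format lines.
not_found_urls() {
    awk '$9 == 404 { print $7 }' "$1" | sort | uniq -c | sort -rn
}
```

Running `not_found_urls johnbokma-com-access.log` prints one line per missing path, each prefixed with the number of requests for it.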