Linux Loginfo - Apache basic log analysis software

22 October 2009

This Apache log analysis tool is no longer being maintained. This is included here for future reference or to allow anyone to pick up and continue with the project.

About

This is simple Log Analysis software designed to give the webmaster useful information about
who is visiting their website. The software has only a few pre-requisites that should be met
by most websites already. Once installed it allows viewing of the reports through a web browser.

Whilst designed and tested for a Linux system, it should work on any Apache / Perl / PHP webserver, although
it will need to be installed in different directories on a Windows system.

The software is released under the GPL as free open source software.

Manual / Example

Full details can be seen by viewing the:

Manual
Sample Report (Version 0.1.0)

Download

Download the file in tgz (tar gzipped) format:

loginfo-0.1.2.tgz (beta)

Updates in latest version

Version 0.1.2 beta

Fix bug in hourly display

Version 0.1.1 beta

Better browser and OS detection
More Webbots identified
Reporting of Status Codes
Summaries ordered by relevance
Improved Error Reporting
Bug fixes / additional testing

Patches / Config Updates

The latest Robots Entry can be used to update the line in your loginfo configuration file (loginfo.cfg).

The entry should be copied as a single line.

1st July 2005 (as included in version 0.1.1)

our @robots = ('Googlebot', 'Yahoo! Slurp', 'Netcraft Web Server Survey', 'Ask Jeeves/Teoma', 'grub', 'msnbot', 'Wget', 'Feedster Crawler', 'BlogSearch', 'Syndic8', 'Cerberian', 'WISEnutbot', 'BlogPulse', 'Technoratibot', 'A2B Location-Based Search Engine', 'BlogsNowBot', 'Blogslive', 'Blogshares', 'UniversalFeedParser', 'ping.blo.gs', 'PageBitesHyperBot', 'PubSub-RSS', 'SurveyBot', 'walhello', 'Mirar', 'OmniExplorer', 'W3C_Validator', 'IconSurf', 'TurnitinBot', 'psbot', 'aipbot', 'StumbleUpon', 'Gigabot', 'LinkWalker', 'rojo.com', 'ConveraCrawler', 'DiamondBot', 'HenryTheMiragoRobot', 'Baiduspider', 'WebFilter Robot', 'SURF', 'topicblogs', 'BecomeBot' );

17th June 2005

9th June 2005 (as included in version 0.1.0)

Old Versions

The following old versions are no longer being developed. You should move to the latest version.

loginfo-0.1.0.tgz (alpha)

Loginfo Manual

Simple Web Log Analysis

User Manual

About

LogInfo provides a way to analyse the Apache web logs. It focuses on the information that is useful to a webmaster trying to improve the appeal of their website, identifying the most popular areas of the site as well as the way visitors are referred to the site.

The problem with much of the log analysis software is that it is either too complex, or doesn't include the features needed to analyse script based websites, or in many cases both. My primary aim was that the program should be quick to get up and running and simple to use. Another important feature is the ability to handle scripts that use session information without getting an entry for every single url variation. It was this feature that made me write this software rather than using httpdstats, which was one of the tools I tried before writing my own.

This program was created to provide me with some useful information from my own webserver logs. It has been made public through the GPL and time permitting I will continue to develop this. If you would rather develop your own program based on my original code then the GPL will allow you to do this as long as the resulting software is also released through the GPL. If you do follow that route, please rename the program and let me know, to avoid confusion between the original and new software.

Alternatively if you would like to contribute to the development of this program please send an email with any suggestions. For example if you have some code that could provide better browser detection then please provide me with details. All submitted code should be provided either free of any copyright or using the GPL / LGPL licenses. If included in the software it will be issued with the GPL license. Please email loginfo@watkissonline.co.uk.

Features

Easy to install (few pre-requisites)
Easily automated
Easy to navigate reports
Filters out the useful information
Works with standard apache log format (combined)
Handles Virtual Sites (with modification to Apache Logging)
Removes the session information from specified scripts
Can be run as often or as infrequently as you want (depending upon apache log settings)
Reports viewable from the hosting webpages (optional)
Simple Authentication (using Apache)
Text Version also available (with limited formatting)

History

08/06/2005 Version 0.1 Alpha (Limited testing)
01/07/2005 Version 0.1.1 (Beta)

License

The software is licensed under the GPL. Full details are provided in the text file gpl.txt which should be distributed with this software.


LogInfo Apachelog Log Analysis Tool

Web: http://www.watkissonline.co.uk
Copyright (C) 2005  Stewart Watkiss

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.

Future Developments

Future versions may include:

XML based output
Template based formatting

There is however no expected date for an updated version, which at the time of writing has not been started on.

Bugs & Limitations

The program must be run to completion before log rotation starts
Report output based on time since last log rotation (recommend monthly log rotation)
Text Only version uses basic formatting
Some browsers not supported

Installation

Please read these installation instructions fully before installing the software. Ensure that you have read and understood the security implications of adding the software to cron before you install the software and before creating a scheduled job.

These instructions are based on a webserver running GNU/Linux. The files may need to be installed manually on other systems.

Pre-Requisites

The following pre-requisites are needed. They will be installed by default on most systems.

Apache
Perl (Tested with Version 5)
Time::ParseDate Module (download from www.cpan.org if required)
PHP (for web menu only)
Linux / UNIX system (may work on other systems, but installation instructions may not work)

Time::ParseDate Module

If the Time::ParseDate module is not installed on your system it can be installed as follows:

perl -MCPAN -e shell

install Time::ParseDate

quit

If you get the following error when you try and run the program then you may need to follow these instructions:

Can't locate Time/ParseDate.pm in @INC

Upgrading from an Earlier Version

Upgrading to version 0.1.1 is achieved by extracting the files into the same directory as the previous install (e.g. /usr/local/loginfo). The PHP file has not been updated so there is no need to copy that across. Existing .cfg files can be used but it is recommended that new configuration files are created from the new sample.cfg which includes new features. You may need to run the chown / chmod commands to set the permissions correctly.

Downloading the File

The latest version of the code is available at: www.watkissonline.co.uk. Other sites may allow you to download the software, but you should check it is the most recent version.

Installing the files

There is no automated installer at present. The installation is just a few simple steps that can be tailored to your own needs.

Extract the files to a convenient directory using the command:
```
tar -xvzf loginfo-x.x.x.tgz
```
(replace x.x.x with the version number of the software).

Create a directory to store the program code e.g. /usr/local/loginfo (superuser access may be needed)
```
mkdir /usr/local/loginfo
```
If you change the directory name then you will have to edit the apachelog.pl file and put the directory name in the first line of the program, e.g.
```
#!/usr/bin/perl -w -I/usr/local/loginfo
```

Copy the program directory (and sub-directories) into the previous directory
```
cp -R loginfo-x.x.x/program/* /usr/local/loginfo
```
If the directory is not /usr/local/loginfo edit the apachelog.pl file and change the directory name on the first line.

Create a directory to store the output files in (within website if desired)
```
mkdir /var/www/html/webstats
```

Copy the php file into the webstats directory

cp loginfo-x.x.x/php/index.php /var/www/html/webstats

Create a config file (see the following section for changes that may be needed)
```
cp /usr/local/loginfo/sample.cfg /usr/local/loginfo/loginfo.cfg
```

Set the appropriate file permissions. This assumes that the program is to run as root. If you want to run it as a different user then set the permissions accordingly. See the security information for more details.
```
chown -R root:root /usr/local/loginfo/*
chmod 500 /usr/local/loginfo/apachelog.pl
chmod 600 /usr/local/loginfo/*.cfg
chmod 400 /usr/local/loginfo/Modules/*
```

If using virtual hosts then make the recommended changes to the apache configuration files. (See later section)

Create .htaccess / .htpasswd files to restrict access to the log

cd /var/www/html/webstats
vi .htaccess

(create the following entries, setting AuthUserFile to the full filesystem path of the .htpasswd file created below)

AuthUserFile /.htpasswd
AuthGroupFile /dev/null
AuthName "Authorised Users Only"
AuthType Basic
require valid-user

htpasswd -c .htpasswd <username>

(you will be prompted for the user's password)

Updating the LogInfo Configuration Files

The main configuration file is normally stored in the same directory as the program file, although it can be stored anywhere on the system (e.g. in an etc directory). If there is only one configuration file on the system then it would normally be called loginfo.cfg. If using virtual hosts it may be better to have multiple configuration files, one for each virtual server, in which case the filename would normally include an element of the virtual server name. See Virtual Hosts for more details.

The configuration file is written in Perl syntax. A syntax or other error in the file may stop the program from running, so care is required when editing it. All entries must end with a semi-colon ; and anything after a hash # character is a comment. All entries are prefixed with the keyword our to define them as publicly available.

The following entries are used:

$title='website'

The value is used as the title of the report file. This should normally be set to the name of the website, e.g. www.watkissonline.co.uk. This is particularly important when using virtual hosts to distinguish between the different files. If this is not changed then it will still work, but the report title will not be customised.

$websitedomain

The $websitedomain value is used to filter out your own domain from the referer list. This should be set either to the domain or the hostname of the website being analyzed. For example www.watkissonline.co.uk could be added with or without the www part. This will behave slightly differently if you have multiple webservers within your domain. The parameter can be left out, but the main benefit of including it is that the percentage values for the referers will be more accurate.
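
The filtering described above can be illustrated with a short sketch. LogInfo itself is written in Perl and its actual matching rule is not documented here, so the Python below is only an assumption of how such a filter might behave: the function name external_referer is hypothetical, and a simple "hostname ends with the configured domain" test is used.

```python
from urllib.parse import urlsplit


def external_referer(referer, website_domain):
    """Return the referring hostname, or None if it should not be counted.

    Assumed rule (not taken from LogInfo): a referer is treated as internal
    when its hostname ends with the configured domain.
    """
    host = urlsplit(referer).hostname  # None for "-" or an empty referer
    if host is None:
        return None
    if website_domain and host.endswith(website_domain.lower()):
        return None
    return host


print(external_referer('http://www.google.co.uk/search?q=linux', 'watkissonline.co.uk'))
print(external_referer('http://www.watkissonline.co.uk/unix.html', 'watkissonline.co.uk'))
```

With the domain given without the www part, every host inside the domain is filtered. With the full hostname, only that one webserver is filtered, which is one way the behaviour could differ when there are multiple webservers within the domain.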

$accesslog

This is the directory and filename of the apache access_log file. The default value is the directory used by Mandriva Linux and some other Linux distributions. The log file must be in the combined log format, which is the default on many systems. Refer to the Apache documentation for more details. The access_log file should be rotated on a monthly basis (at the end of each month).
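
For readers unfamiliar with the combined log format, the following sketch shows the fields it contains. This is not LogInfo's own parser (which is written in Perl); it is a minimal Python illustration, and the sample log line and the name parse_combined are made up for the example. It does not handle escaped quotes inside the request or user agent.

```python
import re

# Combined format: host ident user [time] "request" status size "referer" "user agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of the fields in one log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# Hypothetical log entry used only for illustration.
sample = ('192.168.1.10 - - [09/Jun/2005:15:31:30 +0100] '
          '"GET /cygwin.html HTTP/1.1" 200 5120 '
          '"http://www.google.com/search?q=cygwin" "Mozilla/5.0 (X11; Linux i686)"')

entry = parse_combined(sample)
print(entry['host'], entry['status'], entry['request'])
```

The referer and user agent fields at the end of the line are what distinguish the combined format from the common format, and they are the source of the Referers, Browsers and Operating Systems sections of the report.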

$errorlog

This refers to the apache error_log file. The default value is used by Mandriva Linux and some other Linux distributions. Unlike the accesslog file the program will run if the file doesn't exist, but obviously will be unable to report on error messages. File Not Found (404) messages will be taken from the accesslog file rather than the errorlog file.

$outputfile='/var/www/html/webstats/website'

This is the filename for the report. It should include the path of the directory. The year, month and extension (.html or .txt) will be added to the filename by the program. The filename part should reflect the name of the website, especially if using virtual servers, as it will be listed in the menu page. This value also needs to be added to index.php if using the index page.

@ignoreaddress=('192.168.1.*', '10.0.0.1')

The ignoreaddress list includes any IP addresses that are to be excluded from the report. Typically this should include any addresses that the webmaster uses to test the site, and any servers that may be running automated tests to check that the server is still running. Values should be contained within single quotes and separated with commas.
The wildcard * can be used to match any address range, e.g. 192.168.1.* matches all addresses from 192.168.1.0 to 192.168.1.255, and 192.168.* would match from 192.168.0.0 to 192.168.255.255.
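
The wildcard behaviour described above can be sketched as follows. This is a Python illustration rather than the Perl code used by LogInfo, and the function name is_ignored is hypothetical. It uses shell-style pattern matching from the standard library, which treats * the way the examples above describe.

```python
from fnmatch import fnmatchcase

# Same values as the sample @ignoreaddress entry.
IGNORE_ADDRESS = ('192.168.1.*', '10.0.0.1')


def is_ignored(address, patterns=IGNORE_ADDRESS):
    """True if the address matches any pattern; * matches any run of characters."""
    return any(fnmatchcase(address, pattern) for pattern in patterns)


for address in ('192.168.1.255', '192.168.10.5', '10.0.0.1', '10.0.0.10'):
    print(address, is_ignored(address))
```

Note that the pattern must match the whole address, so 10.0.0.1 does not also exclude 10.0.0.10, and 192.168.1.* does not exclude 192.168.10.5.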

@ignorefiles=('jpg', 'png', 'gif', 'ico', 'css')

Any files with extensions listed in the ignorefiles list will not be included in the statistics. The report will provide a count of the number of hits against each extension, but not down to the individual file level. This is to make the report more relevant. The extensions are not case sensitive, but must appear on the end of a filename prefixed with a dot. The entries must be quoted and comma separated.
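
As an illustration of the matching rule described above (case insensitive, at the end of the filename, prefixed with a dot), here is a minimal Python sketch. It is not LogInfo's Perl code; the function name ignored_extension and the sample paths are assumptions made for the example, and query strings are not considered.

```python
from collections import Counter

# Same values as the sample @ignorefiles entry.
IGNORE_FILES = ('jpg', 'png', 'gif', 'ico', 'css')


def ignored_extension(path, extensions=IGNORE_FILES):
    """Return the matching extension if the path ends with '.<ext>', else None."""
    lowered = path.lower()
    for ext in extensions:
        if lowered.endswith('.' + ext.lower()):
            return ext
    return None


# Hypothetical request paths; only a per-extension total is kept.
paths = ['/images/Logo.JPG', '/style.css', '/index.html', '/favicon.ico', '/photo.jpg']
totals = Counter(ext for ext in map(ignored_extension, paths) if ext)
print(dict(totals))
```

Keeping only a total per extension mirrors the Excluded Files section of the sample report, where each extension appears once with its hit count.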

@ignoreerrors=('robots.txt', 'favicon.ico')

Any lines in the error log matching the ignoreerrors list will not be listed in the error report. This will not affect the rest of the report, only the error section. This will match on both file not found errors and any other kind of error. It is recommended to include just those files that are not on the server but that browsers or search engines look for. Therefore robots.txt and favicon.ico are useful entries. If you do have a favicon.ico and robots.txt file, you may want to remove them from the list to ensure that you see any problems with these.

$html=1

The html variable can be either 1 or 0. A 1 will create html formatted files with the extension .html, whereas zero will create plain text files with extension .txt. Note the format of the report may change in future versions (or even become xml / xhtml).

our $filechmod='775'

If you are using your own webserver to view the report then you may need to set the filechmod value to the required permissions. This should be the octal permissions value used by chmod. A value of 775 should be suitable for most users, although, depending upon the user under which you run the program, the more restrictive 755 may be suitable.
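
To show what the two octal values mentioned above mean in practice, this small Python sketch converts a filechmod string to a mode and prints it in the familiar ls -l style. It only illustrates the octal notation; the function name describe_filechmod is hypothetical and nothing here is taken from the LogInfo code.

```python
import stat


def describe_filechmod(filechmod):
    """Convert an octal string such as '775' into an ls-style permission string."""
    mode = int(filechmod, 8)              # '775' -> 0o775
    return stat.filemode(stat.S_IFREG | mode)


print(describe_filechmod('775'))  # owner and group can write
print(describe_filechmod('755'))  # only the owner can write
```

The only difference between the two is the group write bit, which matters if the report files are created by one user and need to be replaced by another user in the same group.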

@sessionscripts=('/wordpress', '/cgi-bin/customscripts')

The sessionscripts list allows you to specify scripts that contain session information. Any scripts listed in this will have the details after the question mark ? stripped off. Instead of getting a single line for every single url used (typically one per page, per user) you will get a count of the number of times the script has been called. In the wordpress example, instead of each different page being listed separately, the requests are counted together against the script. The safest way to use this is to include the full path (as it appears in the URL), which will prevent it matching other directories with similar names. You can enter the value as a directory to apply to all scripts in that directory, or as an individual script file.
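
The effect of the sessionscripts setting can be sketched as below. This is a Python illustration under an assumed matching rule (the URL path starts with one of the listed values); the real Perl implementation may match differently, and the function name normalise_url and the sample URLs are made up for the example.

```python
# Same values as the sample @sessionscripts entry.
SESSION_SCRIPTS = ('/wordpress', '/cgi-bin/customscripts')


def normalise_url(url, session_scripts=SESSION_SCRIPTS):
    """Strip everything from the ? onwards for URLs under a listed script or directory."""
    path, question_mark, _query = url.partition('?')
    if question_mark and any(path.startswith(prefix) for prefix in session_scripts):
        return path
    return url


print(normalise_url('/wordpress/index.php?p=12&PHPSESSID=abc123'))
print(normalise_url('/search.php?q=linux'))
```

With a simple prefix rule like this one, an entry of /wordpress would also match a directory such as /wordpress-old, which is why giving the full path is the safer choice.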

Advanced Settings

The following are advanced settings. Please ensure you understand the implications before making any manual changes.

$debuglog=''

By specifying a filename a log will be created with any debug messages. Typically this will include details of any user agent strings not recognised, which can be used to improve the robots listing. It should normally be left commented out or blank so that no log file is created.

@robots=('Googlebot', 'Yahoo! Slurp')

Web Robots (also known as webbots) can affect the log results. To report on these separately they should be listed in the @robots list. This is done by using the user_agent value given by the web robot. To add additional web robots enter a string that will match against the robot, but not against a normal web browser. The word netscape, for example, would not be a good entry: whilst there is a netscape search engine (although I believe it uses other engines' bots), the word may also match the user agent of a browser that is being used.

Entries should be enclosed in quotes and be comma separated. The default list should work for most websites, although you may want to add some country specific entries. As the list is updated details will be posted on www.watkissonline.co.uk.
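
The idea of matching a robot by a string in its user agent can be sketched as follows. This is a Python illustration, not the Perl code from LogInfo: it assumes a plain case sensitive substring test, and the function name identify_robot is hypothetical.

```python
# A short extract of the @robots list.
ROBOTS = ('Googlebot', 'Yahoo! Slurp', 'msnbot')


def identify_robot(user_agent, robots=ROBOTS):
    """Return the first robot entry found inside the user agent, or None."""
    for name in robots:
        if name in user_agent:
            return name
    return None


print(identify_robot('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'))
print(identify_robot('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'))
```

Because the test is a substring match, an entry that also appears in ordinary browser user agents would cause real visitors to be counted as robots, which is the risk described above for a word such as netscape.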

Updating the index.php file

After updating the loginfo config file you should also edit the index.php file if you want to use the menu to the reports. This file must be in the same directory in which the output files are created. The file will need to be changed if the $outputfile variable has been changed. This is a PHP file so it has a different format to the Perl syntax of the standard configuration file. The most significant difference is that instead of a hash character to signify a comment the PHP file uses two slashes // . There are only two entries that need to be changed, which are:

$webservers = array('website')

Set this to the name of the $outputfile used in the earlier config file. This should be the filename after the last slash / but without the date and .html / .txt extension. If you have multiple hosts then this is comma separated with quotes around the individual names.
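
As a small illustration of the rule above, the name to use is simply the last component of $outputfile. The Python sketch below shows this for the sample value and for a second, purely hypothetical, virtual host named blog; the function name webserver_name is also made up for the example.

```python
import posixpath


def webserver_name(outputfile):
    """Return the part of $outputfile after the last slash."""
    return posixpath.basename(outputfile)


# The second path is a hypothetical extra virtual host.
outputfiles = ('/var/www/html/webstats/website', '/var/www/html/webstats/blog')
names = [webserver_name(path) for path in outputfiles]
print(names)
```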

$html=1

This needs the same value as used in the loginfo config file. This is used to decide whether to look for files ending with .html or .txt. If you wish to have both text based and html based reports then you will need an extra config file for each and an extra copy of the index.php file.

Testing the Configuration

You can test that the configuration files are correct by running the program manually. On the command line enter the apachelog.pl command followed by the configuration file (full path names may be required). e.g.

/usr/local/loginfo/apachelog.pl /usr/local/loginfo/loginfo.cfg

You should then be able to view the report using your web browser. E.g. to view the reports on the same server use:
http://localhost/webstats/index.php

Scheduling

The program can be automated by adding it to the crontab file. To work correctly with log rotation scripts it needs to run to completion before the log rotation scripts start, which means before (but close to) the 1st day of the month. At a minimum it should therefore be run on the last day of each month, shortly before log rotation occurs. You may prefer to run it more frequently than that, perhaps on a daily basis so that you can see a partial report for the current month. The sample entry will create a scheduled task to run every day shortly before midnight. It can be run as root, but read the security implications and ensure that you understand how to secure the scripts if using root.

There is a sample file called crontab.sample which can be edited and then loaded into cron. As the user you would like the program to run as, enter the following commands:

crontab -l >> crontab.sample

crontab crontab.sample

The first line will copy the current crontab entries into the sample file to ensure that these are retained when the new crontab file is loaded. Use man 5 crontab to see the syntax of the crontab file.
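
Standard crontab syntax has no direct way of saying "the last day of the month", so if a month-end-only run is preferred to a daily run, one possible approach is a daily job that checks the date before doing anything. The Python sketch below shows only the date check. It is an illustration rather than part of LogInfo, and the function name is_last_day_of_month is hypothetical.

```python
import calendar
import datetime


def is_last_day_of_month(day):
    """True if the given date is the final day of its month."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day == last_day


print(is_last_day_of_month(datetime.date(2005, 6, 30)))
print(is_last_day_of_month(datetime.date(2005, 6, 29)))
```

Running the report daily, as the sample entry does, avoids the need for any such check and also gives a partial report during the month.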

Security Implications

To run, the program must have read access to the apache logs. The apache logs are often restricted to root only. To overcome this either the apachelog.pl program needs to be run as root, or the permissions on the log files need to be changed so that another user can read them. The second option may be complicated by the fact that the permission change would need to be included in the log rotation scripts, which differ across platforms / distributions. For this reason the installation instructions have been written assuming that root will be accessing the file. If you have a good understanding of how the logs and log rotation work on your particular system you could overcome some of the security implications by running the program as a normal user instead of root.

There are some important implications if this program is being run automatically from cron, particularly if running as root. As the program is written in perl anyone that has write access to the program file, the module Date.pm or the configuration file can add a command that will be run under the username of the cron task. It is therefore strongly recommended that only root should have write access to any of these files. This is the reason for the chown / chmod commands needed during the installation.

Whilst some people are happy to make their webstats publicly available you may need to ensure that no personal information is released. In particular if you have cgi scripts then in the event of them issuing an error message it may include information such as the user, their IP address and even their password. For this reason it is strongly recommended that access to the reports is restricted. This can be achieved using .htaccess / .htpasswd, or could be achieved by setting it up in your httpd config file.

If you do choose to make the statistics publicly available then removing the $errorlog entry should prevent any sensitive information from being published.

Virtual Hosts

If you are running virtual hosts on your system then it may be beneficial to split the logs into separate files. If you aren't configured for virtual hosts or don't know what virtual hosts are, and only have one website running on your server, then you can ignore this.

If using virtual hosts then the logs for each of the different virtual hosts need to be sent to separate files. The easiest way to do this is to add the following lines to each virtual server in the Vhost.conf file.




CustomLog /var/log/httpd/websitename.access_log combined

ErrorLog /var/log/httpd/websitename.error_log

Ensure that websitename is unique for each server. Then create a separate loginfo config file for each virtual server, ensuring that $outputfile is unique across each different virtual host, and that each config file points at the relevant log file. You will also need to update index.php to have multiple entries and update your log rotation scripts accordingly. Additional crontab entries will be needed to call apachelog.pl against each of the config files. These should be entered so that each one completes before the next starts and all complete before log rotation occurs. A fifteen minute interval between each entry should be sufficient.

Sample Report

Some entries have been removed to make this easier to view.

Log Info - ApacheLog Report - www.watkissonline.co.uk

Version 0.1 alpha

Report Compiled: Thu Jun 9 15:31:30 2005

LogInfo Apachelog Log Analysis Tool

Web: http://www.watkissonline.co.uk  


Loginfo comes with ABSOLUTELY NO WARRANTY

This is free software, and you are welcome to redistribute it under certain conditions

See the User Manual and gpl.txt for more information.

Summary Information

Robots

Googlebot	105
Yahoo! Slurp	274
Netcraft Web Server Survey	8
grub	4
msnbot	210
Feedster Crawler	29

Excluded Addresses

192.168.1.*	463

Excluded Files

jpg	178
png	240
gif	4
ico	162
css	295

Page Views

Page	Number of Hits	Percentage
/linuxbluenet.html	89	29.3 %
/wordpress/	66	21.7 %
/	44	14.5 %
/cygwin.html	33	10.9 %
/linuxcommands.html	11	3.6 %
/wordpress/index.php	7	2.3 %
/ebooks/motivation.pdf	6	2.0 %
/ebooks.html	5	1.6 %
/wordpress	5	1.6 %
/info/pdf.html	4	1.3 %
/unix.html	3	1.0 %
/cv.html	2	0.7 %
/accessibility.html	1	0.3 %
/blog	1	0.3 %
/blog/	1	0.3 %

Referers

Referer	Number of Hits	Percentage
www.google.com	42	21.2 %
www.google.co.uk	25	12.6 %
ubuntuforums.org	23	11.6 %
freespace.virgin.net	21	10.6 %
search.yahoo.com	8	4.0 %
www.google.de	7	3.5 %
www.google.fr	7	3.5 %
search.msn.co.uk	5	2.5 %

Browsers

Browser	Number of Hits	Percentage
MSIE 6.0	140	46.1 %
Firefox 1	107	35.2 %
unknown	13	4.3 %
Mozilla/5.0	8	2.6 %
MSIE 5.01	7	2.3 %
MSIE 5.5	4	1.3 %
Konqueror 3.4	3	1.0 %
Opera 8.0	3	1.0 %
MSIE 6.0b	2	0.7 %
Firefox 0.9.3	2	0.7 %
Opera 7.54 [en]	2	0.7 %
RPT-HTTPClient/0.3-3	2	0.7 %
MSIE 4.01	1	0.3 %
MSIE 5.0	1	0.3 %
Firefox 0.8	1	0.3 %
Firefox 0.9.1	1	0.3 %
IntranetExploder 08.15	1	0.3 %
Konqueror 3.0-rc2	1	0.3 %
Konqueror 3.0-rc5	1	0.3 %
Konqueror 3.3	1	0.3 %
Netscape 7.1	1	0.3 %
Safari 312	1	0.3 %
Safari 412	1	0.3 %

Operating Systems

OS	Number of Hits	Percentage
Windows NT 5.1	139	45.7 %
Linux i686	68	22.4 %
Windows NT 5.0	39	12.8 %
unknown	15	4.9 %
Windows 98	12	3.9 %
Linux i686 (x86_64	5	1.6 %
Linux	4	1.3 %
Win98	4	1.3 %
Windows NT 4.0	3	1.0 %
Linux ppc	2	0.7 %
PPC Mac OS X	2	0.7 %
Windows NT 5.2	2	0.7 %
i686 Linux	2	0.7 %
FreeBSD i386	1	0.3 %
Linux i586	1	0.3 %
Linux x86_64	1	0.3 %
PPC Mac OS X Mach-O	1	0.3 %
Windows 95	1	0.3 %
Windows CE	1	0.3 %
en	1	0.3 %

Daily Page Hits

Date & Time	Number of Hits	Percentage
01/06/2005	35	11.5%
02/06/2005	31	10.2%
03/06/2005	25	8.2%
04/06/2005	30	9.9%
05/06/2005	33	10.9%
06/06/2005	43	14.1%
07/06/2005	50	16.4%
08/06/2005	39	12.8%
09/06/2005	18	5.9%

Pages per Hour

Time	Number of Hits	Percentage
0:00 to 1:00	2	0.7%
1:00 to 2:00	1	0.3%
2:00 to 3:00	1	0.3%
3:00 to 4:00	1	0.3%
4:00 to 5:00	1	0.3%
5:00 to 6:00	2	0.7%
6:00 to 7:00	2	0.7%
7:00 to 8:00	1	0.3%
8:00 to 9:00	1	0.3%
9:00 to 10:00	3	1.0%
10:00 to 11:00	1	0.3%
11:00 to 12:00	2	0.7%
12:00 to 13:00	3	1.0%
13:00 to 14:00	2	0.7%
14:00 to 15:00	2	0.7%
15:00 to 16:00	2	0.7%
16:00 to 17:00	7	2.3%
17:00 to 18:00	4	1.3%
18:00 to 19:00	3	1.0%
19:00 to 20:00	1	0.3%
20:00 to 21:00	2	0.7%
21:00 to 22:00	2	0.7%
22:00 to 23:00	3	1.0%
23:00 to 24:00	2	0.7%

Hourly Page Hits

Date & Time	Number of Hits	Percentage
01/06/2005 11:00 to 12:00	3	1.0%
01/06/2005 12:00 to 13:00	7	2.3%
01/06/2005 13:00 to 14:00	5	1.6%
01/06/2005 15:00 to 16:00	1	0.3%
01/06/2005 16:00 to 17:00	5	1.6%
01/06/2005 17:00 to 18:00	2	0.7%
01/06/2005 18:00 to 19:00	2	0.7%
01/06/2005 19:00 to 20:00	2	0.7%
01/06/2005 20:00 to 21:00	1	0.3%
01/06/2005 21:00 to 22:00	4	1.3%
01/06/2005 22:00 to 23:00	2	0.7%
01/06/2005 23:00 to 24:00	1	0.3%
02/06/2005 0:00 to 1:00	1	0.3%
02/06/2005 1:00 to 2:00	1	0.3%
02/06/2005 2:00 to 3:00	1	0.3%
02/06/2005 3:00 to 4:00	2	0.7%
02/06/2005 7:00 to 8:00	1	0.3%
02/06/2005 8:00 to 9:00	2	0.7%
02/06/2005 10:00 to 11:00	3	1.0%
02/06/2005 11:00 to 12:00	2	0.7%
02/06/2005 12:00 to 13:00	1	0.3%
02/06/2005 13:00 to 14:00	3	1.0%
02/06/2005 14:00 to 15:00	2	0.7%
02/06/2005 15:00 to 16:00	3	1.0%
02/06/2005 17:00 to 18:00	1	0.3%
02/06/2005 18:00 to 19:00	1	0.3%
02/06/2005 19:00 to 20:00	1	0.3%
02/06/2005 20:00 to 21:00	3	1.0%
02/06/2005 21:00 to 22:00	3	1.0%
03/06/2005 2:00 to 3:00	1	0.3%
03/06/2005 4:00 to 5:00	1	0.3%
03/06/2005 5:00 to 6:00	1	0.3%
03/06/2005 7:00 to 8:00	2	0.7%
09/06/2005 11:00 to 12:00	2	0.7%
09/06/2005 12:00 to 13:00	3	1.0%
09/06/2005 13:00 to 14:00	2	0.7%
09/06/2005 14:00 to 15:00	2	0.7%
09/06/2005 15:00 to 16:00	2	0.7%

Error Reports

Missing Files

Filename	Number of Errors
test.html	3

Other Errors

Error	Number of Errors
[client 200.185.234.145] script not found or unable to stat	2
script not found or unable to stat: /data/www/cgi-bin/openwebmail	2

