Beaver is a forthcoming application for interactively analysing web server logs. Version 0.2, a preliminary test version, is now available for download.
The basic idea behind Beaver is this. By reading the files which the web server writes for every web request, it's possible to glean all sorts of useful information about your web site, such as:
So far, so good. All this is fairly standard and is already done by many useful tools such as webalizer, analog, visitors and other free and not-so-free tools. The activityworkshop currently uses webalizer for this, and filelight to visualise the visit statistics and see which areas of the site are most popular (see the about page for example plots from filelight).
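As a rough illustration of the log reading that all of these tools start from, here is a minimal Python sketch. It assumes the logs are in the "combined" format that Apache commonly writes; the pattern, the field names and the sample lines are made up for this example and are not taken from Beaver itself.

```python
import re
from collections import Counter

# Assumption: Apache-style "combined" log format. Other servers or
# custom formats would need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample lines, for illustration only
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2009:13:55:36 +0200] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0"',
    '192.0.2.7 - - [10/Oct/2009:13:56:01 +0200] "GET /about.html HTTP/1.1" '
    '200 1045 "http://example.org/" "Mozilla/5.0"',
    '192.0.2.1 - - [10/Oct/2009:13:57:12 +0200] "GET /missing.html HTTP/1.1" '
    '404 209 "-" "Mozilla/5.0"',
    'not a log line',
]

# Skip anything that doesn't look like a request
records = [rec for rec in map(parse_line, SAMPLE_LOG) if rec is not None]

# The kind of totals a static report would show
hits_per_path = Counter(rec["path"] for rec in records)
hits_per_status = Counter(rec["status"] for rec in records)
```

Everything a reporting tool shows is some kind of count over records like these; the interesting part is what you can do with them afterwards.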
There's a big disadvantage to all of the approaches I've seen so far - they're reporting tools, not analysis tools. They generate daily, weekly or monthly reports, often in HTML format, with very pretty graphs and charts and tables but they only show what you thought in advance you'd want to extract from the logs. There's a limit to how much information you can present on such static reports and how deep you can go.
That's why this new application aims not to produce reports but really to dig into the data. It will hopefully be able to answer more detailed, spontaneous questions, like:
Enumerating all the possible alternatives of what you might be interested in just isn't possible with a static report. You need to be able to drill down, to filter, to correlate. You need to be able to click on things and narrow down what's going on, interactively. And that's what Beaver should do.
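To give a feel for what drilling down means in practice, here is a small Python sketch, not Beaver's actual code. The records, the field names and the helper functions `narrow` and `count_by` are all hypothetical; the point is only that each step narrows the result of the previous one, which a static report can't do.

```python
from collections import Counter

# Hypothetical, already-parsed requests; the field names are assumptions
REQUESTS = [
    {"host": "192.0.2.1", "path": "/index.html", "status": 200,
     "referrer": "-"},
    {"host": "192.0.2.7", "path": "/about.html", "status": 200,
     "referrer": "http://example.org/"},
    {"host": "192.0.2.1", "path": "/missing.html", "status": 404,
     "referrer": "/index.html"},
    {"host": "192.0.2.9", "path": "/missing.html", "status": 404,
     "referrer": "http://example.org/"},
]


def narrow(records, **criteria):
    """Keep only the records whose fields equal every given criterion."""
    return [
        rec for rec in records
        if all(rec.get(field) == value for field, value in criteria.items())
    ]


def count_by(records, field):
    """Count how often each value of the given field occurs."""
    return Counter(rec[field] for rec in records)


# First question: which requests failed with "not found"?
not_found = narrow(REQUESTS, status=404)

# Follow-up question on that result: where did those visitors come from?
referrers_of_404 = count_by(not_found, "referrer")

# Narrow further: only the failed requests arriving from one referrer
from_example = narrow(not_found, referrer="http://example.org/")
```

In an interactive tool each of these steps would be a click rather than a function call, but the chain of questions is the same.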
It should also be cross-platform, but shouldn't have to run on the web server. It would be nice to be able to download the logs from the server and work on them anywhere, without restrictions on what can be installed on the web server.
[Image: Beaver gnawing a logpile]
In the initial stages of development, some prototypes were built to demonstrate ideas for visualisation and presentation, including diagrams with matplotlib. As part of this effort, some PyQt notes were written to illustrate aspects of Python, Qt, PyQt and QtDesigner integration.
Following the first release, version 0.1, in October 2009, the second release, version 0.2, is now available from the download page. The current status of development is shown on the development page, including the latest screenshots and progress notes.
Throwing these ideas around will hopefully generate some feedback or help. The program is released under the GPL, so any contributions will be gratefully received, acknowledged, and redistributed to the community. Maybe you have some ideas, maybe you want to suggest a feature, maybe you're a Python expert and want to help!
Some of the basic questions arising at the beginning have now been answered: yes, it will use Python and Qt, and yes, it will be multilingual (as long as translators can be found). Matplotlib will probably be used for the pie charts, histograms, and hierarchical plots like those of filelight/baobab. However, no output options have been decided so far. Some of these issues are discussed on the development page.
It's called Beaver because, like a beaver, it gnaws your logs. Not just reading, or reporting, but really getting its teeth in there. This terrible pun will be utilised extensively, be warned.