lala

Analyze and generate reports of web logs (NGINX)
.. image:: :width: 200 px :alt: alternate text :align: center
.. image:: :target: :alt: Travis CI build status
.. image:: :target:
Lala is a Python library for access log analysis. It provides a set of methods to retrieve, parse and analyze access logs (only from NGINX for now), and makes it easy to plot geo-localization or time-series data. Think of it as a simpler, Python-automatable version of Google Analytics, to make reports like this:
.. image:: :width: 550 px :alt: alternate text :align: center
.. code:: python

    from lala import WebLogs
    weblogs, errored_lines = WebLogs.from_nginx_weblogs('access_logs.txt')
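
To give an idea of what such parsing involves, here is a minimal standard-library sketch. It is not lala's actual implementation: it assumes the logs use NGINX's default "combined" format, the names ``COMBINED_LOG_PATTERN`` and ``parse_lines`` are hypothetical, and the sample lines are invented.

.. code:: python

    import re
    from datetime import datetime

    # Pattern for NGINX's default "combined" log format (an assumption:
    # a custom log_format directive would need a different pattern).
    COMBINED_LOG_PATTERN = re.compile(
        r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<date>[^\]]+)\] '
        r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
        r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
    )

    def parse_lines(lines):
        """Return (parsed_entries, errored_lines) for an iterable of log lines."""
        entries, errored = [], []
        for line in lines:
            match = COMBINED_LOG_PATTERN.match(line.strip())
            if match is None:
                # Keep unparsable lines aside instead of failing on them
                errored.append(line)
                continue
            entry = match.groupdict()
            entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
            entry["status"] = int(entry["status"])
            entries.append(entry)
        return entries, errored

    # Invented sample data: one valid line and one malformed line
    sample_lines = [
        '203.0.113.7 - - [10/Oct/2018:13:55:36 +0000] "GET / HTTP/1.1" 200 612 '
        '"-" "Mozilla/5.0"',
        'this line is not a valid log entry',
    ]
    entries, errored = parse_lines(sample_lines)

Returning the malformed lines separately mirrors the two return values shown above, so that a few bad lines do not abort the whole import.
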
Similarly, to fetch logs from a remote server (for which you have access keys), you would write:

.. code:: python

    from lala import get_remote_file_content, WebLogs
    logs = get_remote_file_content(
        host="", user='root',
    )
    weblogs, errors = WebLogs.from_nginx_weblogs(logs.split('\n'))
Now ``weblogs`` is a special kind of Pandas dataframe where each row is one server access, with fields such as ``IP``, ``date``, ``referrer``, ``country_name``, etc.
.. image:: :width: 800 px :alt: alternate text :align: center
The web logs can therefore be analyzed using any of Pandas' built-in filtering and plotting functions. The ``WebLogs`` class also provides additional methods which are particularly useful for analysing web logs, for instance to plot pie charts:

.. code:: python

    ax, country_values = weblogs.plot_piechart('country_name')
.. image:: :width: 300 px :alt: alternate text :align: center
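
The numbers behind such a pie chart can also be obtained with plain Pandas. The sketch below uses an ordinary ``DataFrame`` with invented values as a stand-in for ``weblogs``; it assumes only the column names mentioned above and says nothing about how ``plot_piechart`` works internally.

.. code:: python

    import pandas as pd

    # Stand-in for ``weblogs``: invented rows with a few of the fields named above
    records = pd.DataFrame({
        "IP": ["203.0.113.7", "198.51.100.23", "203.0.113.7", "192.0.2.44"],
        "referrer": ["-", "https://example.com/", "-", "-"],
        "country_name": ["United Kingdom", "France", "United Kingdom", "Germany"],
    })

    # Number of accesses per country, most frequent first
    country_counts = records["country_name"].value_counts()

    # Same counts expressed as fractions of all accesses (the pie-chart shares)
    country_shares = records["country_name"].value_counts(normalize=True)

    # Standard boolean-mask filtering keeps only the rows of one country
    uk_records = records[records["country_name"] == "United Kingdom"]
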
Next we plot the locations (cities) providing the most connections:

.. code:: python

    ax = weblogs.plot_geo_positions()
.. image:: :width: 700 px :alt: alternate text :align: center
We can also restrict the entries to the UK, and plot a timeline of connections:

.. code:: python

    uk_entries = weblogs[weblogs.country_name == 'United Kingdom']
    ax = uk_entries.plot_timeline(bins_per_day=2)
.. image:: :width: 700 px :alt: alternate text :align: center
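
As an illustration of the binning idea, here is a small standard-library sketch that counts accesses in equal-width time bins. It assumes that ``bins_per_day=2`` means two 12-hour bins per day, which is a guess at the semantics rather than a description of lala's code; ``count_per_bin`` is a hypothetical helper and the timestamps are invented.

.. code:: python

    from collections import Counter
    from datetime import datetime, timedelta

    def count_per_bin(timestamps, bins_per_day=2):
        """Count timestamps in equal-width bins, ``bins_per_day`` bins per day."""
        bin_width = timedelta(days=1) / bins_per_day
        counts = Counter()
        for stamp in timestamps:
            midnight = stamp.replace(hour=0, minute=0, second=0, microsecond=0)
            # Index of the bin within the day (0 = first bin after midnight)
            bin_index = (stamp - midnight) // bin_width
            counts[midnight + bin_index * bin_width] += 1
        # Sorted list of (bin_start, number_of_accesses) pairs
        return sorted(counts.items())

    # Invented access times
    access_times = [
        datetime(2018, 10, 10, 9, 15),
        datetime(2018, 10, 10, 11, 40),
        datetime(2018, 10, 10, 18, 5),
        datetime(2018, 10, 11, 0, 30),
    ]
    timeline = count_per_bin(access_times, bins_per_day=2)
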
Here is how to get a list of visitors and visits, sort out the most frequent visitors, find their locations, and plot it all:

.. code:: python

    visitors = weblogs.visitors_and_visits()
    visitors_locations = weblogs.visitors_locations()
    frequent_visitors = weblogs.most_frequent_visitors(n_visitors=5)
    ax = weblogs.plot_most_frequent_visitors(n_visitors=5)
.. image:: :width: 450 px :alt: alternate text :align: center
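
The ranking step can be pictured with a short standard-library sketch. It simply counts raw accesses per IP address, which may differ from how lala defines a "visit"; ``top_visitors_by_access_count`` is a hypothetical helper and the addresses are invented.

.. code:: python

    from collections import Counter

    def top_visitors_by_access_count(ips, n_visitors=5):
        """Return the ``n_visitors`` most frequent IPs as (ip, count) pairs."""
        return Counter(ips).most_common(n_visitors)

    # Invented list of the IP address of each access
    access_ips = [
        "203.0.113.7", "198.51.100.23", "203.0.113.7",
        "192.0.2.44", "198.51.100.23", "203.0.113.7",
    ]
    top_two = top_visitors_by_access_count(access_ips, n_visitors=2)
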
Lala can do more, such as identifying the domain name of the visitors, which can be used to filter out the robots of search engines:
.. code:: python

    filtered_entries = weblogs.filter_by_text_search(
        terms=['googlebot', '', 'baidu', 'msnbot'],
    )
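
The idea of excluding entries by text search can be sketched in a few lines of plain Python. This is an assumption about the general technique, not the behaviour of ``filter_by_text_search``: the helper ``exclude_matching``, the ``domain`` field name and the sample host names are all hypothetical.

.. code:: python

    def exclude_matching(entries, terms, field="domain"):
        """Keep only the entries whose ``field`` contains none of the terms."""
        lowered_terms = [term.lower() for term in terms]
        return [
            entry for entry in entries
            if not any(term in entry[field].lower() for term in lowered_terms)
        ]

    # Invented entries, each with a resolved domain name
    sample_entries = [
        {"IP": "192.0.2.10", "domain": "crawl-1.googlebot.com"},
        {"IP": "203.0.113.7", "domain": "host.example.net"},
        {"IP": "192.0.2.20", "domain": "spider-1.crawl.baidu.com"},
    ]
    human_entries = exclude_matching(
        sample_entries, terms=["googlebot", "baidu", "msnbot"]
    )

Note that an empty string is a substring of every text, so an empty term in such a list would exclude every entry.
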
Lala also plays nicely with the PDF Reports library to let you define report templates (written in Pug) and then generate PDF reports with the following code:
.. code:: python
You can install lala through PIP:

.. code:: bash

    sudo pip install python-lala
Alternatively, you can unzip the sources in a folder and type:

.. code:: bash

    sudo python setup.py install
For plotting maps you will need Cartopy, which is not always easy to install, as it may depend on your system. If you are on Ubuntu 16+, first install the dependencies with:

.. code:: bash

    sudo apt-get install libproj-dev proj-bin proj-data libgeos-dev
    sudo pip install cython
License = MIT
lala is open-source software originally written at the Edinburgh Genome Foundry by Zulko and released on GitHub under the MIT licence (Copyright 2018 Edinburgh Genome Foundry).
Everyone is welcome to contribute!