AWStats
Robot detection based on hits?
I have noticed an increasing number of bad robots that don't identify themselves as robots. Typically they fetch a site's root/home page HTML file and nothing else. This shows up in the Hosts (IP) report, where a Host/IP has 1 page and 1 hit, or 2 pages and 2 hits, and so on. Checking the raw log files confirms this, and also that the user-agent contains no bot identification. Unfortunately these bad bots are added to the unique visitors count when they should in fact be added to the unidentified robots count. I appreciate this is tricky to catch in AWStats, especially since it could be a returning visitor with most of the files already in the user's browser cache. However, it's pretty obvious these visits are not real visitors, and the volume and regularity of them is very large: something like 30% of visitors on one site I look after, and similar on a couple of others. This completely distorts the stats, giving the impression of far more real visitors than there actually are.
Is there any way to modify AWStats to add a configurable conf file option saying that a page must have x hits on it to be considered a real visitor, and otherwise it's an unidentified robot? Most pages these days generate at least half a dozen file hits, so the data is already available to the AWStats program. How easy that is to implement may be another matter; I don't know.
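To illustrate the rule I have in mind, here is a rough sketch in Python that works directly on the raw log rather than inside AWStats. It assumes Combined Log Format, treats a request as a "page" purely by its file suffix, and uses a made-up threshold name (`MIN_HITS_PER_PAGE`) that is not an existing AWStats directive. It is only meant to show the idea, not how AWStats would implement it.

```python
import re
from collections import defaultdict

# Assumes Combined Log Format: host ident user [time] "METHOD path PROTO" ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"')

# Hypothetical threshold; NOT an existing AWStats option.
MIN_HITS_PER_PAGE = 3

# Assumption: these suffixes count as "pages", everything else is a plain hit.
PAGE_SUFFIXES = ("/", ".html", ".htm", ".php")


def is_page(path):
    """Return True if the request path looks like a page rather than an asset."""
    return path.split("?", 1)[0].endswith(PAGE_SUFFIXES)


def classify_hosts(lines, min_hits_per_page=MIN_HITS_PER_PAGE):
    """Label each host 'visitor' or 'suspected_robot' by its hits-per-page ratio.

    Hits include the page requests themselves, so a host with 1 page and
    1 hit has a ratio of 1. Hosts that requested no pages are skipped.
    """
    pages = defaultdict(int)
    hits = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # ignore lines that are not in the expected format
        host, path = match.groups()
        hits[host] += 1
        if is_page(path):
            pages[host] += 1

    labels = {}
    for host, page_count in pages.items():
        if hits[host] < min_hits_per_page * page_count:
            labels[host] = "suspected_robot"
        else:
            labels[host] = "visitor"
    return labels
```

Calling `classify_hosts(open("access.log"))` would give a dict keyed by host. The obvious weakness is the one mentioned above: a returning visitor with a warm browser cache can look exactly like one of these bots, so the threshold would have to be configurable per site.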
Thank you! I appreciate referencing the same issue from my side.