apache-ultimate-bad-bot-blocker
Apache mod_rewrite rules didn't catch some bots
Hi @mitchellkrogza
I think there is a small issue with the mod_rewrite version. A RewriteCond pattern that starts with ^ is anchored to the start of the string, so it only matches when the user agent begins with that word. When it does, the rule works very well.
For example, this rule: RewriteCond %{HTTP_USER_AGENT} ^ADmantX.* [NC,OR] catches admantx-adform/2.4 (+http://www.admantx.com/service-fetcher.html)
If the user agent doesn't start with that word, the RewriteCond doesn't catch anything. For example, this rule: RewriteCond %{HTTP_USER_AGENT} ^Trendiction.* [NC,OR] never catches Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; please let us know of any problems; web at trendiction.com) Gecko/20071127 Firefox/3.0.0.11
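The anchoring behaviour described above can be reproduced outside Apache. This is a minimal sketch, assuming Python's re module behaves like the PCRE engine behind mod_rewrite for these simple patterns (it does for ^ and .*, but it is not the engine Apache actually runs). The helper name matches is made up for illustration; the two user agents are the ones quoted in this issue.

```python
import re

# User agents quoted in this issue.
ADMANTX_UA = "admantx-adform/2.4 (+http://www.admantx.com/service-fetcher.html)"
TRENDICTION_UA = (
    "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; "
    "trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; "
    "please let us know of any problems; web at trendiction.com) "
    "Gecko/20071127 Firefox/3.0.0.11"
)


def matches(pattern, user_agent):
    """Hypothetical helper: search anywhere in the string, ignoring case like [NC]."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None


# ^ pins the match to the first character of the user agent.
anchored_admantx = matches(r"^ADmantX.*", ADMANTX_UA)              # True
anchored_trendiction = matches(r"^Trendiction.*", TRENDICTION_UA)  # False

# Without ^ the pattern may match at any position in the string.
unanchored_trendiction = matches(r"Trendictionbot", TRENDICTION_UA)  # True

print(anchored_admantx, anchored_trendiction, unanchored_trendiction)
```

A RewriteCond pattern is matched anywhere in the test string unless it is anchored, so a leading or trailing .* does not change whether the condition matches; dropping the ^ is what matters.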
My suggestion is simple: some RewriteCond patterns should start with .* instead of ^
This rule: RewriteCond %{HTTP_USER_AGENT} ^Trendiction.* [NC,OR]
Should become: RewriteCond %{HTTP_USER_AGENT} .*Trendictionbot.* [NC,OR]
Note: I tested this suggestion and it worked, but with some issues: fdm / fq / disco caught many wrong user agents.
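The false positives are what you would expect from short tokens matched as plain substrings. A rough sketch, again using Python's re as a stand-in for mod_rewrite's matching; the two user agents below are invented for illustration and are not taken from real traffic, and blocked_by is a hypothetical helper.

```python
import re


def blocked_by(token, user_agent):
    """Hypothetical helper: True if token occurs anywhere in user_agent, ignoring case."""
    return re.search(re.escape(token), user_agent, re.IGNORECASE) is not None


# Made-up user agents that are NOT the bots the short tokens were meant to catch.
legit_agents = [
    "Mozilla/5.0 (compatible; DiscoveryClient/1.0)",
    "ExampleApp/2.0 (build-fq7)",
]

# Collect every (token, user agent) pair where an unanchored token matches.
false_hits = [
    (token, ua)
    for token in ("fdm", "fq", "disco")
    for ua in legit_agents
    if blocked_by(token, ua)
]

for token, ua in false_hits:
    print(f"{token!r} wrongly matches {ua!r}")
```

Here "disco" hits the agent containing "DiscoveryClient" and "fq" hits the one containing "build-fq7", even though neither agent is one of the intended bots.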
Hi @yakusha, thanks for your feedback on this. I think a word-boundary regex will work better. Can you try this for me and let me know how it works?
\bTrendictionBot.*
or this
\bTrendictionBot\b
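The two variants do not behave the same on the user agent quoted earlier in this issue, so it may be worth checking both before choosing one. A small sketch, assuming Python's re treats \b the same way PCRE does for ASCII input (a boundary between a word character and a non-word character); it is not a test against Apache itself.

```python
import re

# The Trendiction user agent quoted earlier in this issue.
UA = (
    "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; "
    "trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; "
    "please let us know of any problems; web at trendiction.com) "
    "Gecko/20071127 Firefox/3.0.0.11"
)

# Boundary only on the left: the token must start a word, anything may follow.
open_ended = re.search(r"\bTrendictionBot.*", UA, re.IGNORECASE)

# Boundary on both sides: the token must also end at a word boundary.
closed = re.search(r"\bTrendictionBot\b", UA, re.IGNORECASE)

print(open_ended is not None)  # True
print(closed is not None)      # False: "trendictionbot" is followed directly by "0"
```

In this user agent the version number follows the name with no separator ("trendictionbot0.5.0"), and digits count as word characters, so the trailing \b finds no boundary and the second pattern misses this particular string.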
@yakusha please have a look at this issue: https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker/issues/50 We think better / correct naming of the bots is the best way forward to catch them and will lead to fewer false positives.
FYI: This was updated and fixed in #81 👍