apache-ultimate-bad-bot-blocker

Apache Mod Rewrite rules didn't catch some bots

Open ghost opened this issue 8 years ago • 3 comments

Hi @mitchellkrogza

I think there is a small issue with the mod_rewrite version. A RewriteCond pattern starting with ^ only matches when the word is at the start of the string. If the user agent starts with that word, the rule works very well.

For example, this rule: RewriteCond %{HTTP_USER_AGENT} ^ADmantX.* [NC,OR] catches admantx-adform/2.4 (+http://www.admantx.com/service-fetcher.html)

If the user agent doesn't start with that word, the RewriteCond doesn't catch anything. For example, this rule: RewriteCond %{HTTP_USER_AGENT} ^Trendiction.* [NC,OR] never catches Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; please let us know of any problems; web at trendiction.com) Gecko/20071127 Firefox/3.0.0.11
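The difference between the two cases can be reproduced outside Apache. The following is only a rough sketch in Python, assuming that `re.IGNORECASE` is a fair stand-in for the `[NC]` flag and that `^` and `.*` behave the same way as in Apache's regex engine; the helper name `is_blocked` is made up for illustration.

```python
import re

# Patterns copied from the two RewriteCond lines; re.IGNORECASE stands in for [NC].
ADMANTX = re.compile(r"^ADmantX.*", re.IGNORECASE)
TRENDICTION = re.compile(r"^Trendiction.*", re.IGNORECASE)

admantx_ua = "admantx-adform/2.4 (+http://www.admantx.com/service-fetcher.html)"
trendiction_ua = (
    "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; "
    "trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; "
    "please let us know of any problems; web at trendiction.com) "
    "Gecko/20071127 Firefox/3.0.0.11"
)


def is_blocked(pattern, user_agent):
    """Hypothetical helper: True if the pattern matches the user agent."""
    return pattern.search(user_agent) is not None


print(is_blocked(ADMANTX, admantx_ua))          # bot name is at the start
print(is_blocked(TRENDICTION, trendiction_ua))  # bot name is in the middle
```

The first user agent begins with the bot name, so the anchored pattern matches it. The second begins with `Mozilla/5.0`, so `^Trendiction` never gets a chance to match.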

My suggestion is very simple: some RewriteCond patterns should start with .*

This rule: RewriteCond %{HTTP_USER_AGENT} ^Trendiction.* [NC,OR] would become: RewriteCond %{HTTP_USER_AGENT} .*Trendictionbot.* [NC,OR]

Note: I tested my suggestion and it worked, but with some issues. The fdm / fq / disco rules caught many wrong user agents.
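Why short names over-match once the pattern is unanchored can be sketched as follows. The exact rules that were tested are not shown in this thread, so the `.*disco.*` form is an assumption, and the user agent strings are hypothetical examples made up for illustration.

```python
import re

# Assumed unanchored form of a short-name rule; re.IGNORECASE stands in for [NC].
DISCO = re.compile(r".*disco.*", re.IGNORECASE)

# Hypothetical user agents, invented for this example only.
user_agents = [
    "disco/1.0",
    "ExampleDiscoveryClient/2.1",
    "Mozilla/5.0 (compatible; ExampleBrowser/1.0)",
]

# Any user agent that merely contains the letters "disco" is caught.
matches = [ua for ua in user_agents if DISCO.search(ua)]
print(matches)
```

The second entry is caught only because "Discovery" contains "disco", which is the kind of false positive described above.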

ghost avatar Nov 08 '17 21:11 ghost

Hi @yakusha, thanks for your feedback on this. I think a word boundary regex will work better. Can you try this for me and let me know how it works?

\bTrendictionBot.*

or this

\bTrendictionBot\b
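As a quick check of both suggestions against the user agent quoted earlier in this issue, here is a small Python sketch. It assumes Python's `\b` behaves like the one in Apache's regex engine (digits count as word characters in both) and that `re.IGNORECASE` mirrors `[NC]`; the variable names are illustrative only.

```python
import re

trendiction_ua = (
    "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.0; "
    "trendictionbot0.5.0; trendiction search; http://www.trendiction.de/bot; "
    "please let us know of any problems; web at trendiction.com) "
    "Gecko/20071127 Firefox/3.0.0.11"
)

# Boundary before the name only, then anything.
PREFIX_BOUNDARY = re.compile(r"\bTrendictionBot.*", re.IGNORECASE)
# Boundary on both sides of the name.
BOTH_BOUNDARIES = re.compile(r"\bTrendictionBot\b", re.IGNORECASE)

prefix_hit = PREFIX_BOUNDARY.search(trendiction_ua) is not None
both_hit = BOTH_BOUNDARIES.search(trendiction_ua) is not None

print(prefix_hit, both_hit)
```

In this sketch the first pattern matches, because the name is preceded by a space. The second does not, because the token in this particular user agent is `trendictionbot0.5.0` and there is no word boundary between the final `t` and the digit `0`.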

mitchellkrogza avatar Nov 12 '17 08:11 mitchellkrogza

@yakusha please have a look at this issue: https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker/issues/50. We think better, more correct naming of the bots is the best way forward to catch them and will lead to fewer false positives.

mitchellkrogza avatar Nov 13 '17 08:11 mitchellkrogza

FYI: This was updated and fixed in #81 👍

davcpas1234 avatar Aug 26 '18 14:08 davcpas1234