nginx-ultimate-bad-bot-blocker

[Referrer-Domain] Microsoft Search Engine Spiders are blocked!

Open HKPhysicist opened this issue 2 years ago • 11 comments

My sites joined MS Clarity, and the Microsoft search spiders began to crawl them frequently. The nginxrepeatoffender jail then began to ban their IPs. Their hostnames follow the general pattern: msnbot-xxx-xxx-xxx-xxx.search.msn.com

How do I whitelist them? Here xxx-xxx-xxx-xxx stands for an IPv4 address written with hyphens instead of dots.

HKPhysicist avatar Sep 30 '23 19:09 HKPhysicist

Please post some log line examples

mitchellkrogza avatar Mar 30 '24 04:03 mitchellkrogza

Hello, here are some IPs from .search.msn.com that I saw today in the fail2ban log files. They are not the same every time.

2024-03-30 21:48:43,595 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 40.77.167.41 - 2024-03-30 21:48:43
2024-03-30 21:53:24,688 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 52.167.144.20 - 2024-03-30 21:53:24
2024-03-30 23:12:03,360 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 52.167.144.20 - 2024-03-30 23:12:02
2024-03-30 23:12:51,766 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 52.167.144.20 - 2024-03-30 23:12:51
2024-03-30 23:12:52,299 fail2ban.actions [743]: NOTICE [nginxrepeatoffender] Ban 52.167.144.20
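
To see at a glance which addresses the jail is counting and which it finally bans, the log excerpt can be tallied with a few lines of Python. This is only a rough sketch: the `summarize` helper and the `EVENT` pattern are names made up for illustration, and the pattern assumes the log layout shown in the excerpt above.

```python
import re
from collections import Counter

# Log lines copied from the excerpt in this thread.
LOG = """\
2024-03-30 21:48:43,595 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 40.77.167.41 - 2024-03-30 21:48:43
2024-03-30 21:53:24,688 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 52.167.144.20 - 2024-03-30 21:53:24
2024-03-30 23:12:03,360 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 52.167.144.20 - 2024-03-30 23:12:02
2024-03-30 23:12:51,766 fail2ban.filter [743]: INFO [nginxrepeatoffender] Found 52.167.144.20 - 2024-03-30 23:12:51
2024-03-30 23:12:52,299 fail2ban.actions [743]: NOTICE [nginxrepeatoffender] Ban 52.167.144.20
"""

# Hypothetical pattern: picks out "Found <ip>" and "Ban <ip>" events
# for the nginxrepeatoffender jail, assuming the layout shown above.
EVENT = re.compile(
    r"\[nginxrepeatoffender\]\s+(Found|Ban)\s+(\d{1,3}(?:\.\d{1,3}){3})"
)

def summarize(log_text):
    """Return (Counter of 'Found' hits per IP, set of banned IPs)."""
    found = Counter()
    banned = set()
    for line in log_text.splitlines():
        match = EVENT.search(line)
        if match is None:
            continue
        kind, ip = match.groups()
        if kind == "Found":
            found[ip] += 1
        else:
            banned.add(ip)
    return found, banned

found, banned = summarize(LOG)
for ip, hits in found.most_common():
    status = "banned" if ip in banned else "not banned"
    print(ip, hits, status)
```

In this excerpt the ban follows the third "Found" entry for 52.167.144.20, while 40.77.167.41 appears only once and is not banned.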

HKPhysicist avatar Mar 30 '24 15:03 HKPhysicist

Anyone may try this regex for MS Clarity and MS Search, but I am not sure it is 100% correct.

"~*(?:\b)msnbot-(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)-){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.search\.msn\.com(?:\b)" 0;

HKPhysicist avatar Jul 19 '24 17:07 HKPhysicist
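
One way to sanity-check a pattern like this before putting it in a config file is to run it against sample hostnames in Python. The sketch below is an assumption-laden illustration, not the project's own test: it uses a form of the pattern with the dots escaped and exactly four hyphen-separated octets, `is_msnbot_host` is a made-up helper name, and the sample hostnames are constructed from the msnbot-xxx-xxx-xxx-xxx.search.msn.com pattern described in this thread rather than taken from real DNS lookups.

```python
import re

# One IPv4 octet, 0-255 (same alternation used in the thread).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three "octet-" groups, a final octet, then the literal domain suffix.
# re.IGNORECASE mirrors the case-insensitive "~*" prefix of the nginx entry.
MSNBOT = re.compile(
    r"(?:\b)msnbot-(?:" + OCTET + r"-){3}" + OCTET + r"\.search\.msn\.com(?:\b)",
    re.IGNORECASE,
)

def is_msnbot_host(name):
    """Hypothetical helper: True if the pattern is found anywhere in name."""
    return MSNBOT.search(name) is not None

# Sample names built from the pattern in this thread (not verified lookups).
samples = [
    "msnbot-40-77-167-41.search.msn.com",
    "MSNBOT-52-167-144-20.SEARCH.MSN.COM",
    "msnbot-40-77-167.search.msn.com",        # only three octets
    "msnbot-999-1-1-1.search.msn.com",        # octet out of range
    "msnbot-40-77-167-41xsearchxmsnxcom",     # dots replaced by other characters
]
for name in samples:
    print(name, is_msnbot_host(name))
```

Note that the pattern is not anchored, so it also matches a longer string that merely contains such a hostname, for example one with extra labels appended after `.com`.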

Another source gives this regex for checking an IPv4 address: ^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
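
In an IPv4 pattern of this shape the separator between octets has to be an escaped dot (`\.`), because a bare `.` matches any character. The short Python sketch below compares the two forms; the names `IPV4` and `IPV4_UNESCAPED` are made up for this illustration.

```python
import re

# One IPv4 octet, 0-255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Separator is a literal dot.
IPV4 = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")

# Same pattern with a bare "." as separator, which matches any character.
IPV4_UNESCAPED = re.compile(r"^(?:" + OCTET + r".){3}" + OCTET + r"$")

for text in ["52.167.144.20", "256.1.1.1", "52-167-144-20", "1234567"]:
    print(
        text,
        bool(IPV4.match(text)),
        bool(IPV4_UNESCAPED.match(text)),
    )
```

With the escaped dot only 52.167.144.20 is accepted. The unescaped form also accepts 52-167-144-20 and even 1234567, since every second character can serve as the "separator".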

HKPhysicist avatar Jul 19 '24 17:07 HKPhysicist