Some Google bots are not identified

Open anemone-clown opened this issue 10 months ago • 1 comments

Hi, it seems some Google bots coming from Google Cloud are not identified as bots. Example: IP = 104.199.13.48 | Referer = | Lang = | Host = 48.13.199.104.bc.googleusercontent.com | Nav = Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.85 Safari/537.36 | Translate = 0 | Bot = 0

Because I get a lot of bots, I first check whether $_SERVER['HTTP_ACCEPT_LANGUAGE'] is empty, and then I call gethostbyaddr($_SERVER['REMOTE_ADDR']). If "google" is present in the host name, I treat it as a Google Cloud bot.
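The two-step check described above could be sketched roughly as follows. This is only an illustration, not part of Crawler-Detect: the function names `isGoogleCloudHost` and `looksLikeGoogleCloudBot` are hypothetical, and the host test is narrowed here to the `googleusercontent.com` suffix seen in the example rather than any host containing "google".

```php
<?php
// Hypothetical helpers for illustration; not part of Crawler-Detect.

/**
 * True when a reverse-DNS host name ends in googleusercontent.com,
 * as in the example host 48.13.199.104.bc.googleusercontent.com.
 */
function isGoogleCloudHost(string $host): bool
{
    return (bool) preg_match('/(^|\.)googleusercontent\.com$/i', $host);
}

/**
 * Heuristic from the description: an empty Accept-Language header first,
 * then a reverse DNS lookup on the client IP.
 */
function looksLikeGoogleCloudBot(array $server): bool
{
    // Cheap check first, so the DNS lookup only runs for suspicious requests.
    if (!empty($server['HTTP_ACCEPT_LANGUAGE'])) {
        return false;
    }

    $ip = $server['REMOTE_ADDR'] ?? '';
    if ($ip === '') {
        return false;
    }

    // gethostbyaddr() returns the unmodified IP when the lookup fails
    // and false on malformed input.
    $host = gethostbyaddr($ip);

    return is_string($host) && isGoogleCloudHost($host);
}
```

It would be called as `looksLikeGoogleCloudBot($_SERVER)`. Keeping the host test in its own function makes it possible to check it without a network lookup.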

Is it possible to detect this? Jef (sorry for my poor English...)

anemone-clown avatar Sep 01 '23 10:09 anemone-clown

I guess you could add a rule matching googleusercontent. I haven't seen this bot yet.

But recently I'm seeing requests from GoogleOther:

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/117.0.5938.132 Safari/537.36
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.5938.132 Mobile Safari/537.36 (compatible; GoogleOther)
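Until a rule exists, both of these user agents could be caught with a plain token match on `GoogleOther`. A minimal sketch, assuming a standalone check in application code; the function name `isGoogleOther` is hypothetical and this is not how Crawler-Detect defines its own rules:

```php
<?php
// Hypothetical helper for illustration; not part of Crawler-Detect.

/**
 * True when the user agent string contains the GoogleOther token.
 */
function isGoogleOther(string $userAgent): bool
{
    return (bool) preg_match('/\bGoogleOther\b/', $userAgent);
}
```

It would be called as `isGoogleOther($_SERVER['HTTP_USER_AGENT'] ?? '')`.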

clementmas avatar Oct 19 '23 03:10 clementmas

Same here:

Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.6261.94 Mobile Safari/537.36 (compatible; GoogleOther)

BartVB avatar Apr 10 '24 07:04 BartVB

Is this package still maintained? Do you know of any similar alternatives?

sylvaindeloux avatar Apr 23 '24 10:04 sylvaindeloux

Yes, still maintained

PRs welcome 🙏🏻

JayBizzle avatar Apr 23 '24 13:04 JayBizzle

Amazing, thanks a lot, @JayBizzle !

BartVB avatar Apr 23 '24 18:04 BartVB