Crawler-Detect
Potential bots
- [x] Filestack
- [x] Google-Ads-Overview Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36
- [x] Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 6.0.1; generic) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Version/4.0 Mobile Safari/537.36
- [x] Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Version/4.0 Mobile Safari/537.36
- [x] Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Mobile Safari/537.36
- [ ] Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) splash Version/9.0 Safari/602.1
- [x] adreview/1.0
- [x] Mozilla/5.0 (compatible; RyowlEngine/1.0; +https://ryowl.org)
- [x] Mozilla/5.0 (compatible; RyowlEngine/1.0; +https://ryowl.com)
- [x] Google-speakr
- [x] Google-speakr,gzip(gfe)
- [x] FeedViewer/1.0 (+http://www.feedviewer.net/webmasters; license agreement: http://www.feedviewer.net/license)
- [x] acebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
- [x] WhoAPI/1.0 (whoapi.com)
- [x] Mozilla/5.0 (compatible; BackupLand/1.0; https://go.backupland.com/; Domain check for viruses;)
- [x] Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) WhatCMS/1.0
- [ ] Google-Ads-Overview Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36
- [ ] Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 6.0.1; generic) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Version/4.0 Mobile Safari/537.36
- [ ] Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Version/4.0 Mobile Safari/537.36
- [ ] Google-Ads-Overview Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Mobile Safari/537.36
- [ ] Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) DownloaderChrome/62.0.3202.75 Safari/537.36
- [x] iGooglePortal
- [x] Mozilla/5.0+(compatible; Cula/2.0; https://cula.io/)
- [ ] Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us; rv:1.9.2.3) Gecko/20100401 YFF35 Firefox/3.6.3
- [x] Owlin - http://www.owlin.com
- [x] Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)
- [ ] Bublup (+https://www.bublup.com/bublup.html)
- [ ] Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | Hexometer.com - HexAct Inc.
- [ ] Mozilla/5.0/Firefox/42.0 - nbertaupete95(at)gmail.com
- [ ] OpenGraphCheck/2.1 (+https://opengraphcheck.com)
- [ ] donwload_html/2.0 (Linux) [email protected]
- [ ] LinuxGetURL/2.0 [email protected] (Linux)
- [ ] Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Google-AMPHTML)
- [ ] Google-AMPHTML
- [ ] inactive-blog-skipper/1.0 ([email protected])
- [ ] AWS Network Health / Contact [email protected] with your website URL to stop
- [ ] AWS Network Health / Contact [email protected] with your website URL to stop
- [ ] Corax - [email protected]
- [ ] draw.io
- [ ] MindsMediaProxy/3.0 (+http://www.minds.com/)
- [ ] Mozilla/5.0 (w3dt header analysis for httprecon tools; http://w3dt.net/tools/httprecon)
- [ ] Google-Test
- [ ] Mozilla/5.0 (compatible; Google-Test;)
- [ ] Mozilla/5.0 (compatible; RSiteAuditor)
- [ ] Mozilla/5.0 (compatible; WPSec/1.3; +https://wpsec.com)
- [ ] Mozilla/5.0 (compatible; Go-KI; +https://www.gosign.de/)
- [ ] Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Google-AMPHTML)
- [ ] Google-AMPHTML
- [ ] Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome-prerendercloud/66.0.3359.139 Safari/537.36
- [ ] DIGMATO.com web tester
- [ ] Mozilla/5.0 (X11; Linux x86_64; Rigor) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36
- [ ] Mozilla/5.0 Windows NT 10.0; Win64; x64 AppleWebKit/537.36 KHTML, like Gecko Chrome/65.0.3286.0 Safari/537.36 Rigor
- [ ] Mozilla/5.0 (X11; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0 (Research project: Visit PrivacyScore.org for details)
- [ ] veu/1.0 (+http://www.veu.cat)
- [ ] Google-Cloud-ML-Vision
- [ ] FirmoGraph (+https://firmograph.io)
- [ ] Mozilla/5.0 (compatible; 2GDPR/1.2; https://2gdpr.com)
- [ ] CityGridMedia/1.0 (compatible; http://url-validation.citygrid.com/)
- [ ] Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.11 (KHTML, like Gecko)(compatible; http://url-validation.citygrid.com/) Chrome/23.0.1271.95 Safari/537.11
- [ ] https://gdnplus.com:Gather Analyze Provide.
- [ ] northcutt.com SEO tools
- [ ] Burf.co
- [ ] Mozilla/5.0 (compatible; WPSec/1.3; +https://wpsec.com)
- [ ] gensun.org
Is this merged?
The user agents marked with ✅ have been added; the others still need adding 👍🏻
This is the UserAgent of the Google-Weblight bot:
- [ ] Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) Chrome/38.0.1025.166 Mobile Safari/535.19

Should be detectable by "googleweblight".
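A quick way to sanity-check that suggestion is a case-insensitive search for the token. This is only a Python sketch for illustration, not the package's own code; the helper name `has_weblight_token` and the plain Chrome comparison string are made up here.

```python
import re

# Hypothetical helper, for illustration only; not part of Crawler-Detect.
WEBLIGHT = re.compile(r"googleweblight", re.IGNORECASE)


def has_weblight_token(user_agent: str) -> bool:
    """Return True if the user agent contains the googleweblight token."""
    return WEBLIGHT.search(user_agent) is not None


# The Google-Weblight user agent quoted above.
weblight_ua = (
    "Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) "
    "AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) "
    "Chrome/38.0.1025.166 Mobile Safari/535.19"
)

# An ordinary desktop Chrome user agent, used only as a negative example.
plain_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
)

print(has_weblight_token(weblight_ua))
print(has_weblight_token(plain_ua))
```

The token sits inside the parenthesised `(KHTML, like Gecko; googleweblight)` group, so a plain substring-style pattern is enough; no anchoring is needed.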
There's also:
Mozilla/5.0 AppleWebKit/537.36 Chrome/114.0.5735.179 Safari/537.36 Google-Ads-Conversions
Should these 2 existing rules be replaced:
- Google-Ads-Creatives-Assistant
- Google-Ads-Overview
with a simple "Google-Ads" detection?
Yeah, go for it 👍
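For anyone wanting to see what the consolidation would cover, here is a small Python sketch. It is an assumption about how a single broad pattern would behave, not the package's actual rule; `matches_google_ads` is a made-up name, and the third sample is just the bare rule name from the question above.

```python
import re

# Assumption: one case-insensitive "Google-Ads" pattern replaces the
# separate Google-Ads-Overview and Google-Ads-Creatives-Assistant rules.
GOOGLE_ADS = re.compile(r"Google-Ads", re.IGNORECASE)


def matches_google_ads(user_agent: str) -> bool:
    """Return True if the user agent contains any Google-Ads-* token."""
    return GOOGLE_ADS.search(user_agent) is not None


samples = [
    # From the checklist at the top of this issue.
    "Google-Ads-Overview Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36",
    # The newly reported user agent.
    "Mozilla/5.0 AppleWebKit/537.36 Chrome/114.0.5735.179 Safari/537.36 "
    "Google-Ads-Conversions",
    # Bare rule name, used here as a stand-in user agent.
    "Google-Ads-Creatives-Assistant",
]

for ua in samples:
    print(matches_google_ads(ua), ua)
```

The trade-off is that the broader pattern also catches any other `Google-Ads-*` agent without a new rule, at the cost of being less specific about which one was seen.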
There's probably no way to detect these, but these two visit my entirely Danish site every day: the first twice a day from the US, the second once a day from China. These are the complete User-Agent headers, and everything in them seems to be removed by the excludes.
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36
Yep, bots like this are pretty annoying. There's nothing this package can do about them 🤔
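To illustrate why: as far as I understand it, the package first strips known browser tokens (the excludes) from the user agent and then matches whatever is left against the crawler patterns. The Python sketch below mimics that flow under loud assumptions: the exclusion list and crawler patterns are made up for this example and are NOT the package's real data, and `is_crawler` is a hypothetical name.

```python
import re

# Illustrative exclusions only; NOT the package's real list.
EXCLUSIONS = re.compile(
    "|".join(
        [
            r"Mozilla/[\d.]+",
            r"AppleWebKit/[\d.]+",
            r"Chrome/[\d.]+",
            r"Safari/[\d.]+",
            r"\(KHTML, like Gecko\)",
            r"\(X11; Linux x86_64\)",
            r"\(Windows NT [\d.]+; Win64; x64\)",
        ]
    ),
    re.IGNORECASE,
)

# Illustrative crawler patterns only.
CRAWLERS = re.compile(r"bot|crawl|spider|Google-Ads", re.IGNORECASE)


def is_crawler(user_agent: str) -> bool:
    """Strip ordinary browser tokens, then look for crawler patterns."""
    remainder = EXCLUSIONS.sub("", user_agent).strip()
    if not remainder:
        # Nothing distinctive is left, so there is nothing to match on.
        return False
    return CRAWLERS.search(remainder) is not None


# The two user agents reported above.
chrome_linux = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
)
chrome_windows = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36"
)
# A user agent that keeps a distinctive token after stripping.
ads_ua = (
    "Mozilla/5.0 AppleWebKit/537.36 Chrome/114.0.5735.179 "
    "Safari/537.36 Google-Ads-Conversions"
)

print(is_crawler(chrome_linux))
print(is_crawler(chrome_windows))
print(is_crawler(ads_ua))
```

Both reported user agents reduce to an empty string once the browser tokens are removed, so a user-agent-only check has nothing to go on; telling them apart would need signals outside the header, such as IP or request behaviour.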
I found this list if anyone's interested in going through it ;-P https://user-agents.net/bots
Sadly, I don't have enough experience with regex to do it myself, as my original post showed (I hadn't noticed that the bot I mentioned would already get caught by the "bot" in the regex).