Do not download large files such as FLAC and MP3
Downloading these places a significant load on servers, and most are not going to contain URL metadata of use to the project.
This is probably true of image files too.
At the very least, please explicitly document a suitable robots.txt User-agent name so that site owners can stop the tool from scraping inappropriate sites/subtrees.
Rgds
Damon
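For illustration, here is a minimal sketch of how a documented User-agent token would let a site owner exclude a subtree, and how a crawler could honour it using Python's standard `urllib.robotparser`. The token `DomainsProject` and the `Disallow` path are assumptions made for this example; the thread does not state which token the crawler actually matches.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
# "DomainsProject" is an assumed token, not one confirmed by the project.
ROBOTS_TXT = """\
User-agent: DomainsProject
Disallow: /img/audio/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Excluded subtree for the assumed crawler token.
    print(is_allowed(ROBOTS_TXT, "DomainsProject", "https://example.com/img/audio/x.flac"))
    # Everything else on the site stays crawlable.
    print(is_allowed(ROBOTS_TXT, "DomainsProject", "https://example.com/index.html"))
```

A crawler would run a check like this before issuing any request for the URL, so excluded subtrees are never fetched at all.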
Hi,
Is it GET /some/large/file.mp3 or just HEAD?
Thanks
GET, e.g.:
"GET /img/audio/AudioMoth/20210402/20210402T1827Z-desk-ambient-AudioMoth-384ksps.flac HTTP/2.0" 200 18384032 "-" "Mozilla/5.0 (compatible; Domains Project/1.3.7; +https://domainsproject.org)"
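One way a crawler could avoid a full GET like the 18384032-byte response logged above is to decide from the URL extension, and from the `Content-Type` and `Content-Length` headers of a prior HEAD response, whether the body is worth fetching. The sketch below is an assumption about how that could look, not the project's actual logic; the function name, extension list, and size limit are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical policy values chosen for this example only.
SKIP_EXTENSIONS = {".flac", ".mp3", ".wav", ".ogg", ".mp4", ".jpg", ".jpeg", ".png", ".gif"}
SKIP_MEDIA_TYPES = {"audio", "video", "image"}
MAX_BYTES = 1_000_000  # assumed upper bound for a page worth parsing


def should_fetch_body(
    url: str,
    content_type: Optional[str] = None,
    content_length: Optional[int] = None,
) -> bool:
    """Decide whether to issue a full GET for url.

    content_type and content_length would come from a HEAD response;
    pass None when the server did not send them.
    """
    # Cheapest check first: the file extension in the URL path.
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    if suffix in SKIP_EXTENSIONS:
        return False
    # Top-level media type, e.g. "audio" from "audio/flac; q=1".
    if content_type:
        top_level = content_type.split("/")[0].strip().lower()
        if top_level in SKIP_MEDIA_TYPES:
            return False
    # Anything larger than the limit is unlikely to be a plain HTML page.
    if content_length is not None and content_length > MAX_BYTES:
        return False
    return True
```

With these assumed values, the FLAC URL from the log would be rejected on its extension alone, before any request is made.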
Got it, it's a duplicate of #28