domains Do not download large files such as FLAC and MP3

Open DamonHD opened this issue 1 year ago • 3 comments

Downloading these places a significant load on servers, and most are not going to contain URL metadata of use to the project.

This is probably true of image files too.

At the very least, please explicitly document a suitable robots.txt User-agent name so that site owners can stop the tool from scraping inappropriate sites/subtrees.
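To illustrate the kind of rule a site owner could write once a token is documented, here is a minimal sketch using Python's standard `urllib.robotparser`. The token `ExampleCrawler` is a placeholder, not the project's real User-agent name (which is exactly what this issue asks to have documented), and the `/img/audio/` prefix is simply borrowed from the kind of subtree that holds large audio files.

```python
from urllib.robotparser import RobotFileParser

# Placeholder token: the real robots.txt name for the crawler is not
# documented in this thread, so "ExampleCrawler" is purely hypothetical.
CRAWLER_TOKEN = "ExampleCrawler"

# A robots.txt that keeps the hypothetical crawler out of an audio subtree
# while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /img/audio/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, CRAWLER_TOKEN, "/img/audio/sample.flac"))
    print(is_allowed(ROBOTS_TXT, CRAWLER_TOKEN, "/index.html"))
```

A rule like this only helps if the crawler actually matches on a stable, published token, which is why documenting it matters.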

Rgds

Damon

Feb 15 '24 11:02 DamonHD

Hi,

Is it GET /some/large/file.mp3 or just HEAD?

Thanks

Feb 15 '24 15:02 tb0hdan

GET, e.g.:

"GET /img/audio/AudioMoth/20210402/20210402T1827Z-desk-ambient-AudioMoth-384ksps.flac HTTP/2.0" 200 18384032 "-" "Mozilla/5.0 (compatible; Domains Project/1.3.7; +https://domainsproject.org)"
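One way a crawler could avoid an 18 MB transfer like the one logged above is to send a HEAD request first and decide from the response headers whether a GET is worthwhile. The sketch below shows only that decision step as a pure function. The size limit, the list of parseable media types, and the `audio/flac` value in the usage lines are assumptions made for illustration, not values taken from the Domains Project code or from the log entry.

```python
from typing import Mapping

# Assumed policy values, chosen for illustration only.
MAX_BODY_BYTES = 1_000_000
PARSEABLE_TYPES = ("text/html", "application/xhtml+xml", "text/plain")


def worth_fetching(head_headers: Mapping[str, str],
                   max_bytes: int = MAX_BODY_BYTES) -> bool:
    """Decide from HEAD response headers whether a full GET is worthwhile."""
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in head_headers.items()}

    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type not in PARSEABLE_TYPES:
        return False

    # Reject bodies that declare a size over the limit. A missing or
    # malformed Content-Length is tolerated here.
    length = headers.get("content-length", "").strip()
    if length.isdigit() and int(length) > max_bytes:
        return False
    return True


if __name__ == "__main__":
    # Size taken from the log line above; the media type is assumed.
    print(worth_fetching({"Content-Type": "audio/flac",
                          "Content-Length": "18384032"}))
    print(worth_fetching({"Content-Type": "text/html; charset=utf-8",
                          "Content-Length": "5120"}))
```

Since a missing Content-Length passes this check, a real crawler would probably also want to cap the number of bytes it reads during the GET itself.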

Feb 15 '24 16:02 DamonHD

Got it. It's a duplicate of #28.

Feb 15 '24 16:02 tb0hdan
