UserAgent in parameters

Open cyber01 opened this issue 4 years ago • 5 comments

A possible fix for access-restriction errors (HTTP 403 and 502-504) caused by servers blocking the default User-Agent of most HTTP clients (curl, Python libraries, Ruby). With this parameter, the downloader can present itself as a browser and possibly bypass the restriction. 350 thousand pages of one site (its full history since 2008) were previously downloaded this way.

cyber01 • Sep 23 '20 13:09
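
For readers who want to see what overriding the User-Agent looks like in Ruby, here is a minimal sketch using the standard library's `Net::HTTP`. It is not the downloader's actual code: the helper names `build_request` and `fetch` and the constant `BROWSER_USER_AGENT` are hypothetical, and the browser string is only an illustrative value.

```ruby
require "net/http"
require "uri"

# Illustrative browser-like value (an assumption, not taken from the project).
BROWSER_USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0"

# Hypothetical helper: build a GET request that carries the given User-Agent
# instead of the default one that Net::HTTP would send.
def build_request(url, user_agent: BROWSER_USER_AGENT)
  request = Net::HTTP::Get.new(URI.parse(url))
  request["User-Agent"] = user_agent
  request
end

# Hypothetical helper: send the request and return the Net::HTTPResponse.
def fetch(url, user_agent: BROWSER_USER_AGENT)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(build_request(url, user_agent: user_agent))
  end
end
```

Calling `fetch("https://web.archive.org/")` would send the request with the browser-like header. Whether a given server then stops returning 403 or 5xx responses depends on that server's own rules.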

Sounds logical. Changed the default User-Agent to Firefox 80 on Windows 10.

cyber01 • Nov 05 '20 14:11

Maybe, another suggestion... How about adding 'DNT: 1' headers by default? Not sure if it's something that IA_ARCHIVER cares about thi

mathieu-aubin • Nov 06 '20 13:11

> Maybe, another suggestion... How about adding 'DNT: 1' headers by default? Not sure if it's something that IA_ARCHIVER cares about thi

A good suggestion, but I think it's better to do it in a separate MR, where you can make further adjustments for privacy or for bypassing blocks.

cyber01 • Nov 06 '20 14:11
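
As a rough illustration of the `DNT: 1` idea, the sketch below builds a header set that includes it by default and attaches it to a `Net::HTTP` request. The helpers `default_headers` and `build_get` are hypothetical names, not part of the project, and whether the archive honors the header remains as uncertain as it is in the discussion above.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: assemble request headers, adding "DNT: 1" unless
# the caller opts out.
def default_headers(user_agent:, dnt: true)
  headers = { "User-Agent" => user_agent }
  headers["DNT"] = "1" if dnt
  headers
end

# Hypothetical helper: build a GET request carrying those headers.
def build_get(url, headers)
  Net::HTTP::Get.new(URI.parse(url), headers)
end
```

Keeping the header set in one function would let a later change add or drop privacy-related headers without touching the request code.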

Would it make sense to add a command-line flag to set a user agent, while defaulting to something like Firefox or Chrome?

sww1235 • Nov 20 '23 04:11
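
One way such a flag could look, sketched with Ruby's standard `OptionParser`. The flag name `--user-agent`, the function `parse_options`, and the constant `DEFAULT_USER_AGENT` are all assumptions made for illustration; the project may name or wire these differently.

```ruby
require "optparse"

# Assumed browser-like default; the exact string is illustrative.
DEFAULT_USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0"

# Hypothetical option parser: returns a hash whose :user_agent is the
# browser-like default unless --user-agent is given on the command line.
def parse_options(argv)
  options = { user_agent: DEFAULT_USER_AGENT }
  parser = OptionParser.new do |opts|
    opts.on("--user-agent STRING", String, "User-Agent header to send") do |value|
      options[:user_agent] = value
    end
  end
  parser.parse(argv) # non-destructive: argv itself is left unchanged
  options
end
```

With this shape, running the tool with no flag keeps the browser-like default, while something like `--user-agent "curl/8.0"` overrides it for a single run.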

I believe it does.

mathieu-aubin • Nov 25 '23 22:11