wayback-machine-downloader
UserAgent in parameters
A possible solution to access-restriction errors (HTTP 403 and 502-504) caused by servers blocking the default user agents of most HTTP clients (curl, Python libraries, Ruby). With this parameter, the downloader can present itself as a browser and may be able to bypass the restriction. Using this approach, about 350 thousand pages of one site were downloaded (its full history from 2008).
Sounds logical. I changed the default user agent to Firefox 80 on Windows 10.
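For readers following along, here is a minimal sketch of what sending a browser-like user agent looks like with Ruby's standard `Net::HTTP`. It is not the code from this change: the `build_request` helper and the `DEFAULT_USER_AGENT` constant are hypothetical names, and the string itself is only the typical format of a Firefox 80 on Windows 10 user agent, assumed here for illustration.

```ruby
require 'net/http'
require 'uri'

# Assumption: a typical Firefox 80 on Windows 10 user agent string.
# The exact default used by the project is not shown in this thread.
DEFAULT_USER_AGENT =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0'

# Hypothetical helper: builds a GET request that carries the given
# User-Agent header instead of Ruby's default one.
def build_request(url, user_agent = DEFAULT_USER_AGENT)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = user_agent
  request
end

# Building the request does not touch the network.
request = build_request('https://web.archive.org/')
puts request['User-Agent']
```

The request can then be sent with something like `Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }`. Whether a given server accepts the header is up to that server, so this may reduce 403 or 5xx responses but does not guarantee it.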
Maybe, another suggestion... How about adding 'DNT: 1' headers by default? Not sure if it's something that IA_ARCHIVER cares about thi
A good suggestion, but I think it would be better done in a separate MR, where further adjustments for privacy or for bypassing blocks can be made.
Would it make sense to add a command-line flag to set the user agent, while defaulting to something like Firefox or Chrome?
i believe it does
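One way such a flag could look, sketched with Ruby's standard `OptionParser`. The flag name `--user-agent`, the `parse_options` helper, and the `FALLBACK_USER_AGENT` constant are assumptions made for this example, not the tool's actual interface.

```ruby
require 'optparse'

# Assumption: a browser-like fallback used when no flag is given.
FALLBACK_USER_AGENT =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0'

# Hypothetical option parsing: returns a hash whose :user_agent is either
# the value passed on the command line or the browser-like fallback.
def parse_options(argv)
  options = { user_agent: FALLBACK_USER_AGENT }
  parser = OptionParser.new do |opts|
    opts.on('--user-agent STRING', String, 'User-Agent header to send') do |value|
      options[:user_agent] = value
    end
  end
  parser.parse(argv) # non-destructive: argv itself is left unchanged
  options
end

p parse_options(['--user-agent', 'MyCrawler/1.0'])
p parse_options([])
```

Keeping the default in one constant means a user who is still blocked can try a different browser string without changing the code.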