ArchiveBot
ArchiveBot, an IRC bot for archiving websites
ArchiveBot does not retry `[Errno 111] Connection refused` errors. I think it should and consider this a bug. This simply requires adding the wpull option `--retry-connrefused`.
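A minimal sketch of the proposed change, assuming the pipeline builds its wpull command line as a plain list of strings. The helper name `with_retry_connrefused` and the sample argument list are hypothetical; only the `--retry-connrefused` option comes from the issue text.

```python
def with_retry_connrefused(args):
    """Return a copy of args that contains --retry-connrefused exactly once.

    `args` is a hypothetical wpull argument list; the real pipeline may
    build its command line differently.
    """
    flag = "--retry-connrefused"
    if flag in args:
        return list(args)
    return list(args) + [flag]


# Hypothetical base command line, for illustration only.
base_args = ["wpull", "https://example.com/"]
print(with_retry_connrefused(base_args))
```

The helper is idempotent, so calling it on an argument list that already carries the flag leaves that list unchanged.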
* Facebook doesn't like our UA.
* Twitter requires the AB UA for useful archival.
* Tumblr requires a browser UA.
* Flickr blocks AB UA requests with a 503....
On all requests that match `https?://[^/]+\.reddit\.com(/|$)`, we should send a `Cookie: over18=1` header so that we always get the content instead of the age wall. Perhaps something for the new...
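The matching rule can be sketched as follows. This is an illustration only, not ArchiveBot's or wpull's actual hook API: the helper `extra_headers` is hypothetical, and only the pattern and the cookie value come from the issue text.

```python
import re

# Pattern quoted from the issue text; anchored at the start of the URL
# by using re.match below.
REDDIT_RE = re.compile(r"https?://[^/]+\.reddit\.com(/|$)")


def extra_headers(url):
    """Return extra request headers for `url` (hypothetical helper)."""
    if REDDIT_RE.match(url):
        # Bypass the age wall, as proposed in the issue.
        return {"Cookie": "over18=1"}
    return {}


for candidate in (
    "https://www.reddit.com/r/test/",
    "http://old.reddit.com",
    "https://reddit.com/r/test/",
):
    print(candidate, extra_headers(candidate))
```

As written, the pattern requires at least one subdomain label before `.reddit.com`, so a bare `https://reddit.com/...` URL does not match and would not receive the cookie.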
See https://twitter.com/atarchivebot - one example is c2.com.
Some cookies or cookie values have bad effects on the archival. For example, many classical forum software packages let the user choose between different view modes (linear, threaded, hybrid), styles, or...
While viewing currently ignored URLs using the !igon command, the ability to quickly unignore them using the right-click feature in the Dashboard should be considered.
I just realised that a feature we've been talking about for years in `#archivebot` still isn't filed here: bulk ignore handling. The issue at hand is that wpull is fairly...
On job 6g7jcc64ct3ad8izr4dz82xdl, there were some URLs containing braces (`{` and `}`). This was displayed correctly in the log window, but when copying ignore patterns from the context menu, `%7B`...
Recently, on some pipelines, aborting a job started throwing logging exceptions ending in `NameError: name 'open' is not defined`. This exception is raised in the logging's `__init__.py` inside...
A basic ignore for [ikiwiki](https://ikiwiki.info/): `/ikiwiki\.cgi\?(.*&)?do=(create|edit|revert)(&|$)` Thanks, @anarcat!
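A quick way to check what this pattern catches, assuming ignore patterns are applied as unanchored searches against the full URL (an assumption, not confirmed here). The helper `is_ignored` and the `example.org` URLs are hypothetical.

```python
import re

# Ignore pattern quoted from the issue text.
IKIWIKI_IGNORE = re.compile(r"/ikiwiki\.cgi\?(.*&)?do=(create|edit|revert)(&|$)")


def is_ignored(url):
    """Return True if `url` matches the ikiwiki ignore (hypothetical helper)."""
    return IKIWIKI_IGNORE.search(url) is not None


for candidate in (
    "https://example.org/ikiwiki.cgi?do=edit&page=foo",
    "https://example.org/ikiwiki.cgi?page=foo&do=create",
    "https://example.org/ikiwiki.cgi?do=goto&page=foo",
):
    print(candidate, is_ignored(candidate))
```

The `(.*&)?` group lets `do=` appear after other query parameters, and the trailing `(&|$)` keeps values such as `do=editor` from matching.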