JustAnotherArchivist
Sometimes, the dashboard doesn't show all log messages. This happens in particular at the end of the job. For example, it might drop some of the "Started/Finished X for item"...
When adding an ignore, existing requests for matching URLs should be aborted immediately, or more precisely, the corresponding connection should be closed. This might require support from upstream wpull. Use...
This qualifies as a bug because it seems to prevent other cookies from being set. For example, it breaks Blogger's "content warning" page; see job 7c6i9gqmp0al3kuk27vat3dyv.
This just happened on pipeline:a519b67335426a5c4296e4df9049d7d5:
```
Starting CheckIP for Item
Checking IP address.
Finished CheckIP for Item
Starting GetItemFromQueue for Item
Received item 3jaycce28r7ws0rjznlaw15a0.
Starting StartHeartbeat for Item
Finished StartHeartbeat...
```
There are a number of dependencies between the Redis keys, and something (cogs?) should run regular consistency checks. Two examples:
* All jobs are in a pending queue, in the...
The current integration test only runs a single `!ao` job and checks merely that at least one WARC exists. Some things that should be added:
- [ ] Check whether...
Several pipelines are also running a web server on the same machine without blocking the ArchiveBot wpull processes from accessing that web server via either localhost or 127.0.0.0/8. Since ignores...
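To illustrate the kind of check that would catch such requests, here is a minimal sketch in Python that decides whether a URL's host part refers to the local machine. The function name `is_loopback_host` is hypothetical and not part of ArchiveBot or wpull; the sketch only inspects the literal host string and does not resolve DNS names, so a public hostname pointing at 127.0.0.0/8 would not be caught.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Return True if `host` is 'localhost' or a loopback IP literal.

    Hypothetical helper for illustration only; it does no DNS lookups.
    """
    host = host.strip().rstrip(".").lower()
    # The name "localhost" (and subdomains of it) conventionally maps to loopback.
    if host == "localhost" or host.endswith(".localhost"):
        return True
    # IPv6 literals appear in URLs wrapped in brackets, e.g. [::1].
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    try:
        # is_loopback covers all of 127.0.0.0/8 as well as ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is some other hostname.
        return False
```

A pipeline-side guard could call this on the host of each URL before fetching, although blocking at the firewall level would also cover names that merely resolve to a loopback address.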
@hook54321a brought up on IRC that a pipeline retrieved http://www/ successfully ([Wayback Machine](https://web.archive.org/web/20180511142946/http://www/)). The reason for this is that there's a `search ovh.net` line in the pipeline's `resolv.conf`, meaning that...
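The effect of a `search` line can be sketched as follows. This is a simplified illustration in Python, not the resolver's actual algorithm: it assumes the default behaviour where a name without any dot and without a trailing dot gets each search domain appended, and it ignores the `ndots` option. The function names `search_domains` and `search_expansions` are hypothetical.

```python
from typing import List


def search_domains(resolv_conf_text: str) -> List[str]:
    """Extract the search list from the text of a resolv.conf file."""
    domains: List[str] = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "search":
            # If several search lines exist, the last one is used.
            domains = fields[1:]
    return domains


def search_expansions(host: str, domains: List[str]) -> List[str]:
    """List the names a single-label host would be expanded to.

    Simplification: names containing a dot, or ending in one, are
    treated as not subject to the search list.
    """
    if "." in host:
        return []
    return [f"{host}.{domain}" for domain in domains]
```

With the `search ovh.net` line mentioned above, the single-label host `www` expands to `www.ovh.net`:

```python
domains = search_domains("nameserver 127.0.0.1\nsearch ovh.net\n")
print(search_expansions("www", domains))
```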
We should set up automatic regression testing. Specifically, we need to add a test that upgrading the pipeline and/or backend will not break things (unless that is expected). This is...
https://techpatterns.com/forums/about304.html has a decent list of such user agents. (Thanks Rotxer)