ArchiveBot
ArchiveBot, an IRC bot for archiving websites
Both of these are 406. `/sponsor_button` only appears on some repos, such as [this one](https://github.com/devkitPro/libogc/). `/hovercards/citation/sidebar_partial?tree_name=master` appears everywhere (but seems to usually be a 204 No Content outside of ArchiveBot)....
I hope I am doing this right, as I am not good with GitHub. Assuming I have, this should add Wii U user agents to ArchiveBot for the purpose of...
The WebSocket server causes a lot of load. Before #566, it regularly pinned a core on the AT instance. That change improves the situation, but it's not a real solution,...
Recently, various websites have frequently been causing 6-hour stalls. It's unclear exactly when this started, but it became very noticeable (i.e. multiple jobs hanging every few hours) in October, I...
Currently, when a redirect happens, it looks like this in the dashboards:
```
302 OK https://bugzilla.redhat.com/attachment.cgi?id=461634
200 OK https://bugzilla.redhat.com/attachment.cgi?id=461634
```
Note that the second URL displayed there is not the...
These are often generated by the [Tribe](https://tri.be/our-work/the-events-calendar/) [events calendar](https://theeventscalendar.com/) on WordPress sites, and always redirect to a generic login page (or an information page for the outlook.live.com variant). These pages...
The ArchiveBot web server currently does not compress the ~2MB `/logs/recent` response, even when the browser sends `Accept-Encoding: gzip, deflate`. This change is intended to speed up the loading of...
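As a rough illustration of the idea (not the code from this change, and not ArchiveBot's actual server), here is a minimal Python sketch of compressing a response body only when the client's `Accept-Encoding` header permits gzip. The helper name `maybe_gzip` and the sample payload are hypothetical.

```python
import gzip


def maybe_gzip(body: bytes, accept_encoding: str) -> tuple[bytes, dict[str, str]]:
    """Hypothetical helper: return (payload, extra_headers).

    The body is gzip-compressed only if the Accept-Encoding header
    lists gzip with a non-zero quality value.
    """
    accepted = set()
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        params = params.replace(" ", "").lower()
        if not token:
            continue
        if params.startswith("q="):
            try:
                if float(params[2:]) == 0:
                    continue  # q=0 means the coding is explicitly refused
            except ValueError:
                continue  # malformed q-value: treat the coding as not accepted
        accepted.add(token)

    # Vary tells caches that the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in accepted:
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers


# Stand-in for a large, repetitive log response.
sample = b'{"url": "https://example.com/", "response_code": 200}\n' * 1000
payload, extra_headers = maybe_gzip(sample, "gzip, deflate")
```

Log data of this kind is highly repetitive, so it generally compresses well; the actual savings for `/logs/recent` would need to be measured.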
Added the above to the user-agents list. `googlecrawlers.json` contains most of Google's spiders and crawlers. `bingbot.json` and `python-requests.json` are self-explanatory (I hope). `bingbot.json` and `googlecrawlers.json` contain Chrome version numbers, which...
The site is long dead and now just redirects to Yahoo, which slows down the archiving of very old blogs, forums, etc.