internet-archiving topic
waybackpy
Wayback Machine API interface & a command-line tool
ArchiveBox
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
vandal
Navigator for Web Archive
wikipedia-mirror
🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump
good-karma-kit
😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...
electron-archivebox
Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)
internet-archiving-talk
🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.
docker-archivebox
Home of the official Docker image for ArchiveBox
homebrew-archivebox
Homebrew formula for the ArchiveBox self-hosted internet archiving solution.
readability-extractor
JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.