internet-archiving topic

waybackpy

441
Stars
33
Forks

Wayback Machine API interface & a command-line tool

ArchiveBox

20.1k
Stars
1.1k
Forks
172
Watchers

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

wikipedia-mirror

331
Stars
29
Forks

🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump

good-karma-kit

298
Stars
8
Forks

😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...

electron-archivebox

173
Stars
15
Forks

Desktop Electron app for the ArchiveBox internet archiver. (ALPHA: not ready for general use)

internet-archiving-talk

47
Stars
5
Forks

🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.

docker-archivebox

45
Stars
12
Forks

Home of the official Docker image for ArchiveBox

homebrew-archivebox

25
Stars
3
Forks

Homebrew formula for the ArchiveBox self-hosted internet archiving solution.

readability-extractor

33
Stars
13
Forks

JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.