webarchiving topic

awesome-memento

77 stars, 8 forks

A list of things related to software, literature, and other content for 🕣 Memento

waybackpy

441 stars, 33 forks

Wayback Machine API interface & a command-line tool

Squidwarc

164 stars, 26 forks

Squidwarc is a high-fidelity, user-scriptable archival crawler that uses Chrome or Chromium in headed or headless mode

awesome-web-archiving

1.8k stars, 151 forks

An Awesome List for getting started with web archiving

wget-lua

83 stars, 14 forks

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

node-warc

92 stars, 20 forks

Parse and create Web ARChive (WARC) files with Node.js

munin-indexer

25 stars, 2 forks

A social media open post web archiving tool

warcworker

53 stars, 9 forks

A dockerized, queued, high-fidelity web archiver based on Squidwarc

cc-notebooks

40 stars, 8 forks

Various Jupyter notebooks about Common Crawl data