webarchive topic

List webarchive repositories

ArchiveSpark

141 stars · 19 forks

An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.

node-warc

92 stars · 20 forks

Parse and create Web ARChive (WARC) files with Node.js

WebHackUrls

36 stars · 6 forks

A simple Python OSINT tool for URL reconnaissance using the Wayback Machine.

devilfish

21 stars · 1 fork

A utility for simultaneously creating full-page PDF snapshots and web archives of web pages in DEVONthink Pro.

gogetcrawl

132 stars · 15 forks

Extract web archive data using the Wayback Machine and Common Crawl

python-webarchive

44 stars · 4 forks

Create WebKit/Safari .webarchive files on any platform

Seeder

15 stars · 2 forks

Seeder: a Czech web archive curation tool and public site

chatnoir-resiliparse

45 stars · 8 forks

A robust web archive analytics toolkit