[ARCHIVED] Repo to coordinate archival efforts with IPFS

Based on the tests in #137, the Rabin chunker isn't providing any real deduplication benefit. It's also really slow. - [ ] Identify why the Rabin chunker is not...

difficulty:moderate
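
To make the deduplication question above concrete, here is a minimal sketch of how one might compare fixed-size chunking against content-defined chunking on two nearly identical inputs. It is an illustration under stated assumptions, not the IPFS implementation: the chunker below uses a simple rolling byte sum as a stand-in for a Rabin fingerprint, and all names (`fixed_chunks`, `cdc_chunks`, `stored_bytes`) and parameters (window, mask, size limits) are hypothetical.

```python
import hashlib
import random


def fixed_chunks(data: bytes, size: int = 256):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def cdc_chunks(data: bytes, window: int = 16, mask: int = 0xFF,
               min_size: int = 32, max_size: int = 1024):
    """Toy content-defined chunker (assumption: NOT a real Rabin fingerprint).

    A cut is made where the sum of the last `window` bytes matches `mask`,
    so boundaries follow the content rather than absolute offsets.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        length = i - start + 1
        at_boundary = length >= min_size and (rolling & mask) == mask
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def stored_bytes(chunk_lists):
    """Bytes needed to store all chunks once, deduplicated by SHA-256."""
    seen = {}
    for chunks in chunk_lists:
        for chunk in chunks:
            seen[hashlib.sha256(chunk).digest()] = len(chunk)
    return sum(seen.values())


# Two versions of the same data: the second has one byte inserted at the front.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
shifted = b"X" + original

fixed_total = stored_bytes([fixed_chunks(original), fixed_chunks(shifted)])
cdc_total = stored_bytes([cdc_chunks(original), cdc_chunks(shifted)])
print("fixed-size chunking stores:", fixed_total, "bytes")
print("content-defined chunking stores:", cdc_total, "bytes")
```

A one-byte insertion shifts every fixed-size boundary, so almost nothing is shared between the two versions, whereas content-defined boundaries realign shortly after the edit. If real archive data rarely contains such shifted duplicates, a content-defined chunker would show little benefit, which is one possible reading of the result reported in the issue.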

Institutional collaborators pin the root hash on their IPFS nodes. The nodes then replicate all of the data.

ready

Getting documents archived on IPFS is one thing, but we also need to be able to search through them. Given that these archives are eventually going to become too large...

help wanted

on ipfs: QmXJ8KkgKyjRxTrEDvmZWZMNGq1dk3t97AVhF1Xeov3kB4 on dat: 04ed0b08ff595a992a594ad1ab624072646467ec7eda2dc40e4aa512e49cb196 Using [this shell script](https://github.com/substack/peermaps/blob/e0ea8bee9278266b9095df0d49fd40585d8a0d4b/scripts/planet.sh) I've divided planet-latest.osm.pbf into 215836 `.o5m.gz` files (which [osmconvert][1] can read) and 14389 meta.json files. Each `.o5m.gz` file is less...

https://commoncrawl.org/ > We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone. I'm not sure how much data it is, but...

https://openaddresses.io/ > Address data is essential infrastructure. > > Street names, house numbers and postal codes, when combined with geographic coordinates, are the hub that connects digital to physical places....

I would like to add phrack.org to the archive. It's already pinned on some of my machines https://ipfs.io/ipfs/QmewwUQEnncAvGEnzhqRECgbQw9YcaW6oQMZVRGkPjTcLC

- http://1997.webhistory.org/home.html - `screen bash -c "wget --mirror --convert-links --no-verbose -e robots=off -U 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729)' http://1997.webhistory.org...

The DPC is crowdsourcing a list of at-risk digital content: "the DPC’s ‘Bit List’ of Digitally Endangered Species will highlight the need for action to preserve high-value digital content that is critically endangered". http://www.dpconline.org/our-work/digitally-endangered-species...

https://apod.nasa.gov/ would be great for archival, and would also help us make more jokes about the inter-planetary nature of IPFS.