Mitra Ardron
Done: `./crawl.js --level all zandvoort.newspapers.1992.zandvoorts.nieuwsblad`, but it missed the big files (>700 MB for the zip)
(Note to self - see EN/Dweb - Archive - Text)
For an example of a text item with multiple "books", try https://archive.org/details/ialerequestsummary. The books are one page each.
EDITED: Background info: Multipage books such as thetaleofpeterra14838gut or alicesadventures19033gut are reasonably small but are displaying as a slide carousel. [https://archive.org/search.php?query=mediatype:texts%20AND%20imagecount:8] shows small ones, and unitednov65unit is an example.
[ ] Figure out what switches between slide carousel and bookreader
From Jeff Kaplan: typically if an item is `mediatype=texts` and there is an abbyy and pdf file then it will result in a bookreader presentation. Loose images would not result...
See #109 for a failure case (Peter Rabbit) that should use the slide carousel
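Here is a rough sketch of Jeff's rule of thumb, to make the switch concrete. It is not how archive.org actually decides. The function name `choosePresentation`, the return labels, and the filename patterns used to spot abbyy and pdf files are all assumptions for illustration. The input is assumed to look like an item's metadata: a `mediatype` plus a `files` array of objects with a `name`.

```javascript
// Sketch only: encodes the rule of thumb "texts + abbyy + pdf => bookreader".
// The filename patterns below are assumptions, not a confirmed convention.
function hasFile(files, pattern) {
  return files.some((f) => pattern.test(f.name || ''));
}

function choosePresentation(item) {
  const files = item.files || [];
  // The rule only talks about text items; anything else is out of scope here.
  if (item.mediatype !== 'texts') return 'other';
  const hasAbbyy = hasFile(files, /_abbyy\.(gz|xml)$/i); // assumed naming
  const hasPdf = hasFile(files, /\.pdf$/i);
  // Loose images (no abbyy + pdf pair) fall through to the carousel.
  return hasAbbyy && hasPdf ? 'bookreader' : 'carousel';
}

// Hypothetical items, not real archive.org metadata:
const scanned = { mediatype: 'texts', files: [{ name: 'x_abbyy.gz' }, { name: 'x.pdf' }] };
const loose = { mediatype: 'texts', files: [{ name: 'p1.jpg' }, { name: 'p2.jpg' }] };
console.log(choosePresentation(scanned), choosePresentation(loose));
```

If the failure case in #109 does not fit this rule, that would suggest there is another signal involved beyond the file list.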
Thanks for the thinking .. I'm hoping we can use this exercise as an example of how DAT might work with a *large* and changing site like the Archive. IMHO...
@pfrazee asked "Are you certain that sharding the dataset to one-dat-per-folder, and only updating folder-dats that have peers, is not enough for you to scale?" Sure - this can be...
@martinheidegger - there are two possible approaches here - one DAT for the whole of the IA, or one DAT per item with a root-level DAT as an index, both...