fromthepage
CONTENTdm imports should handle pagination better
The following CONTENTdm IIIF collection has more than 84,000 items in it:
    {
      "@context": "http://iiif.io/api/presentation/2/context.json",
      "@id": "https://cdm15138.contentdm.oclc.org/iiif/p15138coll54/manifest.json",
      "@type": "sc:Collection",
      "label": "Tennessee Death Records",
      "first": "https://cdm15138.contentdm.oclc.org/iiif/p15138coll54/p1.json",
      "total": 84173
    }
The collection is available as a paginated IIIF manifest, but to import all of it our UI would have you select 1000 items at a time, import them, then select the next page of 1000, and repeat that 85 times!
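For reference, the 85 comes from rounding the item count up to whole pages. A quick check, assuming the page size really is 1000 as our UI shows (`pages_needed` is just an illustrative helper, not something in our codebase):

```python
def pages_needed(total_items: int, page_size: int) -> int:
    """Number of pages needed to cover total_items, rounding up."""
    # Integer ceiling division: a partially filled last page still counts.
    return (total_items + page_size - 1) // page_size


# "total" from the collection JSON above, 1000 items per page.
print(pages_needed(84173, 1000))  # 85
```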
The collection: https://cdm15138.contentdm.oclc.org/iiif/2/p15138coll54/manifest.json
Here's page two of the paginated collection items: https://cdm15138.contentdm.oclc.org/iiif/2/p15138coll54/p2.json
We need a better strategy for these very large collections. Perhaps grab the entire collection and process it in one go? (Where and when do we process the pages in the collection manifest?)
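One possible shape for "grab the entire collection", as a rough sketch rather than a proposal for the actual importer: start at the collection's `first` link and keep following each page's `next` link until there isn't one. This assumes the CONTENTdm pages follow the IIIF Presentation 2 paging properties (`first`, `next`) and list their items under `manifests` or `members`; I haven't verified the page payloads, so treat those keys as assumptions. `iter_collection_items`, `fetch_json`, and `max_pages` are made-up names for illustration.

```python
import json
from typing import Callable, Dict, Iterator
from urllib.request import urlopen


def fetch_json(url: str) -> Dict:
    """Download a URL and parse the body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_collection_items(
    collection_url: str,
    fetch: Callable[[str], Dict] = fetch_json,
    max_pages: int = 1000,
) -> Iterator[Dict]:
    """Yield every item of a paged collection, one page at a time.

    Assumes the collection exposes a "first" page URL and each page
    exposes a "next" URL (absent on the last page).
    """
    collection = fetch(collection_url)
    page_url = collection.get("first")
    seen = set()
    # Stop on a missing "next", a repeated page URL, or the safety cap.
    while page_url and page_url not in seen and len(seen) < max_pages:
        seen.add(page_url)
        page = fetch(page_url)
        # Assumption: items live under "manifests" or "members".
        for item in page.get("manifests") or page.get("members") or []:
            yield item
        page_url = page.get("next")
```

Because this is a generator, only one page is held in memory at a time, so the caller can decide whether to queue items for import as they arrive or collect them all first. Passing `fetch` in as a parameter also makes it easy to test without hitting the CONTENTdm server.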