Daniel van Strien
> Well, that would mainly be me as I was coordinator of the project where the data was produced :) I have also been working with/been in contact with ~20-30...
> Great! I don't want to overload this with things from the past, but I think this would present a great opportunity to capture and document some of the background...
Notes for discussion:

## Background
- Background documentation
- Are ALTO formats consistent across collections?

## Documentation
- What to document

## Source
- API or bulk downloads (https://pro.europeana.eu/page/iiif#download)

##...
@bmschmidt @cneud @stefan-it Just to let you know, I am currently putting some processing code together for this. I'm essentially Frankensteining the code you all shared already. I'll hopefully have...
Thanks for this, @stefan-it. I have the ALTO parsing done (adapting code from @cneud), but feel free to share yours if it's ready anyway :) For the metadata, I'm currently getting...
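For readers unfamiliar with ALTO, here is a minimal sketch of what text extraction from an ALTO file can look like. It is not the code adapted from @cneud, which is not shown in this thread. The sample document, `local_name`, and `alto_to_lines` are hypothetical names made up for illustration. It assumes only the standard ALTO layout, in which `TextLine` elements contain `String` elements carrying the word in a `CONTENT` attribute, and it ignores the namespace so that it does not depend on a particular ALTO version.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real ALTO files are much larger, and the namespace
# version (v2, v3, v4, ...) may differ between collections.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Hello"/>
            <SP/>
            <String CONTENT="world"/>
          </TextLine>
          <TextLine>
            <String CONTENT="second"/>
            <SP/>
            <String CONTENT="line"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_to_lines(xml_text):
    """Return one string per ALTO TextLine, with words joined by single spaces."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Only String children carry text; SP (space) and HYP (hyphen) are skipped.
        words = [
            child.attrib.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return lines


if __name__ == "__main__":
    for line in alto_to_lines(SAMPLE_ALTO):
        print(line)
```

Matching on the local tag name rather than a fixed namespace is a deliberate simplification here, since it is still an open question in the notes above whether ALTO formats are consistent across collections.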
> > Where `==` is mistakenly used as language identifier?! I'm shocked you don't speak `==` 😜
Example of metadata from dump: ```xml {'rdf:RDF': {'@xmlns:cc': 'http://creativecommons.org/ns#', '@xmlns:dc': 'http://purl.org/dc/elements/1.1/', '@xmlns:dcterms': 'http://purl.org/dc/terms/', '@xmlns:doap': 'http://usefulinc.com/ns/doap#', '@xmlns:edm': 'http://www.europeana.eu/schemas/edm/', '@xmlns:foaf': 'http://xmlns.com/foaf/0.1/', '@xmlns:ore': 'http://www.openarchives.org/ore/terms/', '@xmlns:owl': 'http://www.w3.org/2002/07/owl#', '@xmlns:rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#', '@xmlns:rdfs': 'http://www.w3.org/2000/01/rdf-schema#', '@xmlns:skos': 'http://www.w3.org/2004/02/skos/core#', '@xmlns:svcs':...
I am clarifying the licence for this, see https://github.com/Odeuropa/benchmarks_and_corpora/issues/3 so would hold off working on this until we've got that info back.
@giganttheo, thanks for suggesting this. I think it's a super interesting dataset, but I have a few questions about how we could access this dataset. On the [terms of use](https://www.oldbookillustrations.com/terms-of-use/)...
> Yeah, you are right. I just sent an email to the contact address from the website, to ask for a mirror or a special authorization to use a scraping...