Daniel van Strien comments

Results: 138 comments of Daniel van Strien

Add dataset: old_book_illustrations

Awesome, the only other thing I would check is that when you download the images we can get sufficient metadata for each image to verify the licence/copyright. What information is downloaded...

Add dataset: old_book_illustrations

Great, that looks good. I think if we can include the source information/URL that would be great. My own preference would also be to include as much information as possible...

Add dataset: old_book_illustrations

> Last night, I scraped the pages from the website, by following the restrictions agreed upon. This is the resulting dataset, stored on the hub: [huggingface.co/datasets/gigant/oldbookillustrations_2](https://huggingface.co/datasets/gigant/oldbookillustrations_2) > > Do you...

Add dataset: old_book_illustrations

Thanks so much for this. Having given this a bit more thought, I think it probably makes sense to try and filter out the items which may have copyright issues....

Add dataset: old_book_illustrations

> Thanks so much for this. Having given this a bit more thought, I think it probably makes sense to try and filter out the items which may have copyright...

Add dataset: WWI_documents_dataset

This sounds great, thanks for suggesting it! If you also want to work on adding this feel free to use the `#self-assign` command to assign yourself to work on this.

Add dataset: distantreader

@ericleasemorgan I thought we could use this issue to discuss the best approach for this dataset further :)

Add dataset: distantreader

I'll try and take a closer look at this again next week but some initial thoughts below: >Just to re-iterate, the next step is for me to write a little...

bnl_ground_truth_newspapers_before_1878

> Moved the dataset to the biglam organisation [biglam/bnl_ground_truth_newspapers_before_1878](https://huggingface.co/biglam/bnl_ground_truth_newspapers_before_1878) I think this got created as a model, so I've just moved it to a dataset. I think it could also...

Add dataset: images_de_la_revolution_francaise

Looks good. We could maybe also think about adding some more general guidance on working with IIIF images/manifests. I have some code for parsing manifests which I can try and...
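As a rough illustration of what manifest parsing might involve (this is not the code mentioned in the comment above), here is a minimal Python sketch that walks a IIIF Presentation API 2.x manifest and collects one image URL per canvas. It assumes the manifest is already loaded as a dict and follows the 2.x layout (`sequences` → `canvases` → `images` → `resource`). The function name `iter_canvas_images` and the `example.org` URLs are hypothetical. Presentation 3.0 manifests use a different layout (`items`) and would need separate handling.

```python
from typing import Iterator, Optional


def iter_canvas_images(manifest: dict) -> Iterator[dict]:
    """Yield label and image URLs for each image in a IIIF Presentation 2.x manifest.

    Hypothetical helper: assumes the 2.x structure
    sequences -> canvases -> images -> resource (-> service).
    """
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                service = resource.get("service", {})
                # Some manifests give the service as a list; take the first entry.
                if isinstance(service, list):
                    service = service[0] if service else {}
                service_id: Optional[str] = service.get("@id")
                full_url = None
                if service_id:
                    # IIIF Image API 2.x pattern: {id}/{region}/{size}/{rotation}/{quality}.{format}
                    full_url = f"{service_id.rstrip('/')}/full/full/0/default.jpg"
                yield {
                    "label": canvas.get("label"),
                    "resource_url": resource.get("@id"),
                    "full_image_url": full_url,
                }


# Tiny hand-written manifest (made-up URLs) to show the expected input shape.
example_manifest = {
    "@id": "https://example.org/iiif/book1/manifest",
    "sequences": [
        {
            "canvases": [
                {
                    "label": "p. 1",
                    "images": [
                        {
                            "resource": {
                                "@id": "https://example.org/iiif/img1/full/full/0/default.jpg",
                                "service": {"@id": "https://example.org/iiif/img1"},
                            }
                        }
                    ],
                }
            ]
        }
    ],
}

rows = list(iter_canvas_images(example_manifest))
```

Keeping the canvas label and the original resource URL next to each image URL means the source information travels with every downloaded image.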