
Add dataset: gutenberg_poetry_corpus

Open cakiki opened this issue 2 years ago • 1 comment

A URL for this dataset

https://github.com/aparrish/gutenberg-poetry-corpus

Dataset description

Will be useful for computational poetry

Dataset modality

Text

Dataset licence

Creative Commons Zero v1.0 Universal

Other licence

No response

How can you access this data

As a download from a repository/website

Size of dataset

<500MB

Confirm the dataset has an open licence

  • [X] To the best of my knowledge, this dataset is accessible via an open licence

Contact details for data custodian

https://github.com/aparrish

cakiki avatar Oct 18 '22 10:10 cakiki

Dataset has been added here: https://huggingface.co/datasets/biglam/gutenberg-poetry-corpus

TODOs

  • [x] complete the dataset card

  • [ ] cross-reference the Gutenberg ID column with the actual metadata (optional)
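
A rough sketch of what the optional cross-referencing step could look like, in case it helps. Everything here is an assumption: the column name `gutenberg_id`, the `line` field, the shape of the metadata lookup, and the helper name `attach_metadata` are all hypothetical, and the sample rows are placeholders rather than real corpus data. The idea is just a left join from each row's ID to a metadata record, leaving `None` where no record exists.

```python
from typing import Dict, Iterable, Iterator, Optional

# Hypothetical helper: joins each row to a metadata record keyed by Gutenberg ID.
# The column name "gutenberg_id" is an assumption, not confirmed by the dataset card.
def attach_metadata(
    rows: Iterable[Dict[str, object]],
    metadata: Dict[str, Dict[str, str]],
    id_column: str = "gutenberg_id",
) -> Iterator[Dict[str, object]]:
    for row in rows:
        # Normalise the ID to a string so integer and string IDs match the same key.
        gid = str(row[id_column])
        record: Dict[str, str] = metadata.get(gid, {})
        title: Optional[str] = record.get("title")
        author: Optional[str] = record.get("author")
        # Rows without a metadata record keep their content and get None fields.
        yield {**row, "title": title, "author": author}


# Placeholder data for illustration only (not taken from the corpus).
sample_rows = [
    {"line": "first placeholder line", "gutenberg_id": 1},
    {"line": "second placeholder line", "gutenberg_id": "2"},
]
sample_metadata = {
    "1": {"title": "Example Title", "author": "Example Author"},
}

enriched = list(attach_metadata(sample_rows, sample_metadata))
```

With the real dataset, the same function could presumably be applied per row once a trusted ID-to-metadata mapping is available; where that mapping should come from is left open here.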

cc @davanstrien

cakiki avatar Oct 18 '22 10:10 cakiki