(WIP) Add loading script for arxiv dataset

Open cccntu opened this issue 3 years ago • 0 comments

The dataset can be downloaded with:

wget https://huggingface.co/datasets/ttj/metadata_arxiv/resolve/main/v0.jsonl
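
Once downloaded, the file can be inspected before it is wired into a loading script. The following is a minimal sketch, assuming `v0.jsonl` is a standard JSON Lines file (one JSON object per line); the helper names `iter_jsonl` and `peek` are hypothetical and not part of this repository, and no assumption is made about which fields the records contain.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)


def peek(path: str, n: int = 3) -> list:
    """Return the first n records, e.g. to check which fields exist."""
    records = []
    for record in iter_jsonl(path):
        if len(records) >= n:
            break
        records.append(record)
    return records
```

Calling `peek("v0.jsonl")` after the `wget` step would return the first few records without reading the whole file into memory.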

I haven't hooked it up to input_pipeline yet, so it isn't runnable right now. I can't decide whether to refactor with_metadata.py or just copy and modify it, as I did in this PR. What do you think? @timoschick @SaulLu

cccntu, Aug 05 '21 13:08