long-summarization

Non-tokenized and cased dataset

Open zide05 opened this issue 4 years ago • 1 comment

Where can I get the non-tokenized, cased arXiv and PubMed datasets?

zide05 May 16 '20 10:05

The datasets include paper IDs, so if you'd like the raw data, it should be easy to fetch it from the PubMed Open Access subset and arXiv using those IDs. If you end up doing that, a PR adding a link to the collected raw data would be very welcome.

armancohan Jun 27 '20 17:06
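
For anyone following the suggestion above, here is a minimal sketch of the first step: collecting the paper IDs and turning them into candidate download URLs. It rests on several assumptions that are not confirmed in this thread: that the dataset files are JSON Lines with an `article_id` field, that PubMed IDs start with `PMC`, and that the two URL patterns below are the right entry points. The helper names and the sample IDs are made up for illustration. No network requests are made here.

```python
import json
from typing import Iterable, Iterator

# Assumed URL patterns; verify against current arXiv and PMC docs before bulk use.
ARXIV_EPRINT = "https://arxiv.org/e-print/{id}"
PMC_ARTICLE = "https://www.ncbi.nlm.nih.gov/pmc/articles/{id}/"


def iter_article_ids(lines: Iterable[str], id_field: str = "article_id") -> Iterator[str]:
    """Yield the paper ID from each non-empty JSON line (field name is an assumption)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)[id_field]


def source_url(article_id: str) -> str:
    """Map an ID to a candidate URL; IDs starting with 'PMC' are treated as PubMed Central."""
    if article_id.upper().startswith("PMC"):
        return PMC_ARTICLE.format(id=article_id)
    return ARXIV_EPRINT.format(id=article_id)


# Placeholder records standing in for lines of a dataset file (IDs are invented).
sample = [
    '{"article_id": "1234.5678", "article_text": ["..."]}',
    '{"article_id": "PMC1234567", "article_text": ["..."]}',
]
urls = [source_url(article_id) for article_id in iter_article_ids(sample)]
```

In practice `iter_article_ids` could be given an open file handle instead of the `sample` list. Before downloading at scale, check each service's bulk-access terms.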