
Where do I get the document subset of CORD-19 used for COVID-QA?

Open: jdpsen opened this issue 2 years ago • 1 comment

The paper states: "We selected 147 scientific articles mostly related to COVID-19 from the CORD-19". How can I get this subset of documents so that I can build an index?

jdpsen, Sep 23 '22 12:09

You can reconstruct the documents from the QA dataset itself. The QA dataset is available here: https://github.com/deepset-ai/COVID-QA/blob/master/data/question-answering/200423_covidQA.json

The document texts are stored in the fields called "context" in this JSON.
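As a rough sketch of that conversion: the snippet below walks the parsed JSON and collects every string stored under a "context" key, dropping duplicates. It deliberately does not assume any particular nesting, only the "context" field name mentioned above. The function names `collect_contexts` and `load_documents` are made up for illustration, and the file is assumed to have been downloaded locally.

```python
import json


def collect_contexts(node):
    """Return the unique strings stored under any "context" key, in first-seen order."""
    found = {}  # a dict keeps insertion order and drops duplicates

    def walk(value):
        if isinstance(value, dict):
            for key, child in value.items():
                if key == "context" and isinstance(child, str):
                    found.setdefault(child, None)
                else:
                    walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)
        # Scalars outside a "context" key are ignored.

    walk(node)
    return list(found)


def load_documents(path):
    """Read a local copy of the QA JSON and return its document texts."""
    with open(path, encoding="utf-8") as f:
        return collect_contexts(json.load(f))
```

Calling `load_documents("200423_covidQA.json")` on a local copy should then give one string per distinct document text, which can be passed to whatever indexing tool you use.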

What do you want to create the index for? Are you using Haystack to build a searchable index?

Timoeller, Sep 23 '22 13:09