COVID-QA
Where do I get the document subset of CORD-19 used for COVID-QA?
The paper mentions "We selected 147 scientific articles mostly related to COVID-19 from the CORD-19". How can I get this subset of documents to create an index?
You can recover the documents from the QA dataset itself. The QA dataset is here: https://github.com/deepset-ai/COVID-QA/blob/master/data/question-answering/200423_covidQA.json
The document texts are stored in the "context" fields of this JSON.
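To make the extraction concrete, here is a minimal sketch that pulls the unique "context" texts out of the file. It assumes the JSON follows a SQuAD-style layout (a top-level "data" list whose entries hold "paragraphs", each with a "context" string); only the "context" field is confirmed above, so check the nesting against the actual file. The function names `extract_contexts` and `load_contexts` are made up for this example.

```python
from __future__ import annotations

import json
from pathlib import Path


def extract_contexts(squad_data: dict) -> list[str]:
    """Collect unique "context" strings, keeping first-seen order.

    Assumes a SQuAD-style layout: data -> paragraphs -> context.
    """
    seen = set()
    contexts = []
    for article in squad_data.get("data", []):
        for paragraph in article.get("paragraphs", []):
            context = paragraph.get("context")
            # Skip missing/empty contexts and texts already collected.
            if context and context not in seen:
                seen.add(context)
                contexts.append(context)
    return contexts


def load_contexts(path: str) -> list[str]:
    """Read a locally downloaded copy of the dataset and extract its contexts."""
    with Path(path).open(encoding="utf-8") as f:
        return extract_contexts(json.load(f))
```

After downloading the raw JSON file, something like `docs = load_contexts("200423_covidQA.json")` should give one string per document. You can compare `len(docs)` with the 147 articles mentioned in the paper to sanity-check the layout assumption, then write the texts to whatever index you want to build.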
What do you want to create an index for? Are you using Haystack to build a searchable index?