covidex
Evaluate highlighter on SQuAD and BioASQ
The highlighting service is very similar to a Q&A system: given a question and a document, the model outputs a sentence span that might contain the answer.
Hence, we should evaluate different highlighter models (BioBERT, T5, sciT5) on Q&A datasets such as SQuAD and BioASQ.
Since the highlighter is unsupervised, we can't expect exact match to be high. We should probably use a metric that measures whether the answer span is contained in a highlighted sentence.
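A rough sketch of what such a containment metric could look like, just to make the idea concrete. The function names, the input format (a list of highlighted sentences plus a list of gold answers per question), and the SQuAD-like normalization are all assumptions for illustration, not something that exists in the repo yet.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def answer_in_highlights(highlighted_sentences, gold_answers) -> bool:
    """Return True if any gold answer appears, on token boundaries,
    inside any highlighted sentence (hypothetical helper)."""
    # Pad with spaces so "cat" does not match inside "category".
    padded = [f" {normalize(s)} " for s in highlighted_sentences]
    for answer in gold_answers:
        norm_answer = normalize(answer)
        if not norm_answer:
            continue  # skip answers that are empty after normalization
        if any(f" {norm_answer} " in sentence for sentence in padded):
            return True
    return False


def containment_rate(examples) -> float:
    """Fraction of examples whose highlights contain a gold answer.

    `examples` is an iterable of (highlighted_sentences, gold_answers)
    pairs; this input format is an assumption for this sketch.
    """
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(answer_in_highlights(h, g) for h, g in examples)
    return hits / len(examples)
```

Each model (BioBERT, T5, sciT5) would produce the highlighted sentences for a question/document pair, and `containment_rate` would then be computed per model and per dataset. The matching here is purely lexical, so paraphrased answers would count as misses.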