Support for answering multi-span questions
Currently, Haystack supports only single-span QA. It would be great to add support for multi-span QA. Here is some recent research in this field:
http://arxiv.org/abs/1909.13375
They seem to use multiple prediction heads, which we have already implemented in AdaptiveModel. However, they use an additional feedforward network to determine which prediction head should be used to generate the answer.
@lalitpagaria have you analyzed this a bit more? What would need to be done here? @julian-risch can you take a look at this? Maybe it's easier than I think...
Interesting idea. For Table QA, we actually needed multi-span QA too, so @bogdankostic has done some work there.
Section 3 in the paper reminds me of Named Entity Recognition because it is also a BIO Sequence Tagging task. We could make use of our old NER implementation in FARM: https://github.com/deepset-ai/FARM/blob/master/examples/ner.py We didn't migrate it into Haystack because we did not see the need so far. Unfortunately, the implementation of multi-span QA would not only be about the new model but also about evaluation and metric calculation. Further, we would need a dataset and some examples so that users can easily understand how it works.
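To make the BIO sequence tagging view concrete, here is a minimal sketch of how per-token tags could be decoded into several answer spans. It is not FARM or Haystack code: the function name `decode_bio_spans` and the plain `B`/`I`/`O` tag set are assumptions chosen for illustration, and the tags are hard-coded where a model's predictions would normally go.

```python
from typing import List, Tuple


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Turn BIO tags into (start, end_exclusive, text) answer spans.

    Hypothetical helper for illustration only.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    boundaries: List[Tuple[int, int]] = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        if tag == "B":
            # A new span begins; close the previous one first.
            if start is not None:
                boundaries.append((start, i))
            start = i
        elif tag == "I":
            # Lenient choice: a stray "I" without a preceding "B" opens a span.
            if start is None:
                start = i
        else:
            # "O" (or any unknown tag) closes an open span.
            if start is not None:
                boundaries.append((start, i))
                start = None
    if start is not None:
        boundaries.append((start, len(tags)))

    return [(s, e, " ".join(tokens[s:e])) for s, e in boundaries]


tokens = "Ada Lovelace and Charles Babbage worked together".split()
tags = ["B", "I", "O", "B", "I", "O", "O"]
spans = decode_bio_spans(tokens, tags)
```

With this toy input, `spans` contains two entries, one for "Ada Lovelace" and one for "Charles Babbage", which is exactly the kind of output a single-span reader cannot express.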
An additional prediction head as mentioned in Section 4 should not be too complicated, I think.
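As a rough sketch of what choosing between prediction heads could look like, the snippet below picks the head with the highest probability and delegates to it. Everything here is an assumption for illustration: the head names, the hard-coded logits (which in a real model would come from the additional feedforward network), and the toy head functions. It does not reflect the paper's exact architecture or any existing AdaptiveModel API.

```python
import math
from typing import Callable, Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_head(head_names: List[str], head_logits: List[float]) -> Tuple[str, float]:
    """Return the name and probability of the most likely head."""
    probs = softmax(head_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return head_names[best], probs[best]


def answer_with_selected_head(
    head_names: List[str],
    head_logits: List[float],
    heads: Dict[str, Callable[[], List[str]]],
) -> Tuple[str, List[str]]:
    """Run only the head chosen by the (hypothetical) head-type classifier."""
    name, _ = select_head(head_names, head_logits)
    return name, heads[name]()


# Toy stand-ins for real prediction heads (assumed names).
heads = {
    "single_span": lambda: ["blue"],
    "multi_span": lambda: ["blue", "white", "red"],
}
chosen, answers = answer_with_selected_head(
    ["single_span", "multi_span"], [0.2, 1.7], heads
)
```

In this example the second logit is larger, so the multi-span head is selected and its three spans are returned.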
Are there any specific use cases of multi-span QA that you have in mind already @lalitpagaria ?
@julian-risch @tstadel I don't have a specific use case. I thought it could be useful to the community, hence I created a placeholder ticket. So it's not a priority. 🙂 Design-wise, I think the reader also needs changes, as it currently returns only a single span.
Agreed, it could definitely be useful and would be a nice-to-have feature. Let's put it in our backlog for now.
Hi @lalitpagaria, how did you annotate your training/test set with multiple spans for a single question? I tried Haystack but couldn't do it. My domain requires multi-span annotation for answers, which I intend to merge for the final output answer.
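One possible way to represent such annotations, sketched here under assumptions, is to keep several character-level answers per question (SQuAD-style `text` and `answer_start` fields) and convert them into token-level BIO tags for training. The helper names and the whitespace tokenization are hypothetical simplifications; this is not a Haystack annotation format or API.

```python
import re
from typing import Any, Dict, List, Tuple


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Whitespace tokenization that keeps character offsets (a simplification)."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def spans_to_bio(context: str, answers: List[Dict[str, Any]]) -> Tuple[List[str], List[str]]:
    """Map several character-level answer spans onto token-level BIO tags."""
    tokens = tokenize_with_offsets(context)
    tags = ["O"] * len(tokens)
    for answer in answers:
        a_start = answer["answer_start"]
        a_end = a_start + len(answer["text"])
        first = True
        for i, (_, t_start, t_end) in enumerate(tokens):
            # A token belongs to the answer if the two character ranges overlap.
            if t_start < a_end and t_end > a_start:
                tags[i] = "B" if first else "I"
                first = False
    return [tok for tok, _, _ in tokens], tags


context = "Marie Curie won the Nobel Prize in Physics and in Chemistry."
answers = [
    {"text": "Physics", "answer_start": context.index("Physics")},
    {"text": "Chemistry", "answer_start": context.index("Chemistry")},
]
tokens, tags = spans_to_bio(context, answers)
```

Note that with whitespace tokenization the last token is "Chemistry." including the period; a real tokenizer would split punctuation, so the offsets logic would need to follow whatever tokenizer the model uses.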
@Nakkhatra I am not the author of the paper. I just shared it here for reference, in case the community would like to contribute.
I need this feature too. This seems to be an unsolved problem. Here they use token classification: https://discuss.huggingface.co/t/how-to-do-multi-span-question-answering/16291 The answers need to be easy to tokenize for it to work.
I'm still on the hunt for a better solution
Hi there, I think I'm also in need of this feature. In our case, multi-span answers are quite common, similar to the table QA scenario, so it would be great if this could be included asap :)
Brs
Any updates on this ticket? :)
Just adding that I too have a use case for this feature. I often have questions which are answered in part in multiple places. It would be great to be able to extract each of the components of a complete answer (like extractive summarization, but focused on summarizing only those parts which are relevant to the question).