
Support for answering multi-span questions

Open lalitpagaria opened this issue 3 years ago • 7 comments

Currently, Haystack supports only single-span QA. It would be great to add support for multi-span QA. Here is some recent research in this field:

http://arxiv.org/abs/1909.13375

lalitpagaria avatar Dec 15 '21 14:12 lalitpagaria

They seem to use multiple prediction heads, which we have already implemented in AdaptiveModel. However, they use an additional feedforward network to determine which prediction head should be used to generate the answer.

@lalitpagaria have you analyzed this a bit more? What would need to be done here? @julian-risch can you take a look at this? Maybe it's easier than I think...

tstadel avatar Dec 15 '21 15:12 tstadel

Interesting idea. For Table QA, we actually needed multi-span QA too, so @bogdankostic has done some work there.

Section 3 in the paper reminds me of Named Entity Recognition because it is also a BIO Sequence Tagging task. We could make use of our old NER implementation in FARM: https://github.com/deepset-ai/FARM/blob/master/examples/ner.py We didn't migrate it into Haystack because we did not see the need so far. Unfortunately, the implementation of multi-span QA would not only be about the new model but also about evaluation and metric calculation. Further, we would need a dataset and some examples so that users can easily understand how it works.
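To make the BIO tagging idea concrete, here is a minimal sketch of how per-token B/I/O tags could be turned into multiple answer spans. It is not taken from the paper, FARM, or Haystack; the function name `decode_bio_spans` and the example sentence are made up for illustration, and tokens are assumed to be whitespace-separated words.

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every B/I run in `tags` into one answer string (hypothetical helper)."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; close the previous one if it is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the span that is currently open.
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close any open span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = "The flag is red , white and blue".split()
tags = ["O", "O", "O", "B", "O", "B", "O", "B"]
answers = decode_bio_spans(tokens, tags)  # one string per predicted span
```

A real model would emit these tags per subword token, so the decoding step would also need to map subwords back to character offsets in the document.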

An additional prediction head as mentioned in Section 4 should not be too complicated, I think.

Are there any specific use cases of multi-span QA that you have in mind already @lalitpagaria ?

julian-risch avatar Dec 16 '21 09:12 julian-risch

@julian-risch @tstadel I don't have a specific use case. I thought it could be useful to the community, hence I created a placeholder ticket. So it is not a priority. 🙂 Design-wise, I think the reader also needs changes, as it currently provides only a single span.

lalitpagaria avatar Dec 16 '21 09:12 lalitpagaria

Agreed, this could definitely be useful and would be a nice-to-have feature. Let's put it in our backlog for now.

julian-risch avatar Dec 16 '21 10:12 julian-risch

Hi @lalitpagaria, how did you annotate your training/test set with multiple spans for a single question? I tried Haystack but couldn't do it. My domain requires multi-span annotation for answers, which I intend to merge for the final output answer.

Nakkhatra avatar Jan 06 '22 13:01 Nakkhatra

@Nakkhatra I am not the author of the paper. I just shared it here for reference, in case the community would like to contribute.

lalitpagaria avatar Jan 11 '22 14:01 lalitpagaria

I need this feature too. This seems to be an unsolved problem. Here they use token classification: https://discuss.huggingface.co/t/how-to-do-multi-span-question-answering/16291 The answers need to be easy to tokenize for it to work.
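For anyone trying the token classification route, a rough sketch of the data preparation step might look like the following. It is only an illustration under simple assumptions: tokens are split on whitespace, answers are given as character offsets, and `spans_to_bio` is a made-up helper, not part of Haystack or the linked thread.

```python
import re
from typing import List, Optional, Tuple


def spans_to_bio(context: str, answer_spans: List[Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Tag each whitespace-delimited token with B, I, or O (hypothetical helper).

    `answer_spans` holds (start, end) character offsets, end exclusive.
    """
    tagged: List[Tuple[str, str]] = []
    open_span: Optional[Tuple[int, int]] = None  # span the previous token belonged to
    for match in re.finditer(r"\S+", context):
        tok_start, tok_end = match.span()
        # Look for an answer span that overlaps this token.
        hit = next(
            ((s, e) for s, e in answer_spans if tok_start < e and tok_end > s),
            None,
        )
        if hit is None:
            tag = "O"
        elif hit == open_span:
            tag = "I"  # same span as the previous token
        else:
            tag = "B"  # first token of a span
        open_span = hit
        tagged.append((match.group(), tag))
    return tagged


context = "Install numpy and pandas before running."
answer_texts = ["numpy", "pandas"]
spans = [(context.index(a), context.index(a) + len(a)) for a in answer_texts]
labels = spans_to_bio(context, spans)
```

Note that a token which only partly overlaps an answer is tagged as a whole, so answers whose boundaries do not line up with token boundaries come out slightly wider than annotated.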

I'm still on the hunt for a better solution

leonardlin avatar Sep 07 '22 20:09 leonardlin

Hi there, I think I'm also in need of this feature. In our case, multi-span answers are quite common, similar to the table QA scenario, so it would be great if we could include this asap :)

Brs

stefanqxb avatar Nov 27 '22 07:11 stefanqxb

Any updates on this ticket? :)

stefanqxb avatar Nov 29 '22 06:11 stefanqxb

Just adding that I too have a use case for this feature. I often have questions which are answered in part in multiple places. It would be great to be able to extract each of the components of a complete answer (like extractive summarization, but focused on summarizing only those parts which are relevant to the question).

bbernicker avatar Jan 20 '23 20:01 bbernicker