XLNet support for Reader
Hi, currently the reader uses a pretrained BERT PyTorch model. I am wondering whether you plan to add support for a pretrained XLNet model once it is released for PyTorch?
Indeed, having an XLNet reader would be a nice addition to cdQA given its results on QA tasks.
The @huggingface community is currently implementing XLNet in their Transformers repository. See this PR: https://github.com/huggingface/pytorch-pretrained-BERT/pull/711
We are waiting for the official code release to start reverse-engineering the script in order to see if it would be easy to add it to cdQA.
We might need some help, though, so feel free to comment or open a PR if you have ideas!
related: https://github.com/renatoviolin/xlnet
@huggingface just released their new update, so we started to explore XLNet for cdQA (PR #205)
The implementation of XLNetForQuestionAnswering is quite different from BertForQuestionAnswering, and the official HF version does not output the logits for now (cf. https://github.com/huggingface/pytorch-transformers/issues/838).
XLNetForQuestionAnswering uses beam search to find the most probable span, while BertForQuestionAnswering maximises start_score and end_score independently.
Due to this limitation, the XLNet (and XLM) support for the Reader will take a bit more time than expected.
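For anyone following along, here is a toy sketch of the two span-selection strategies mentioned above. It is only an illustration with made-up scores and does not call either model: the helper names (`independent_span`, `beam_span`), the `top_k` parameter, and the idea of passing end scores as a function of the chosen start are assumptions for the example, not the Hugging Face API.

```python
# Toy illustration only: hypothetical helpers and invented scores, no model involved.

def independent_span(start_scores, end_scores):
    """BERT-style selection: pick the best start and the best end separately."""
    start = max(range(len(start_scores)), key=start_scores.__getitem__)
    end = max(range(len(end_scores)), key=end_scores.__getitem__)
    return start, end


def beam_span(start_scores, end_scores_given_start, top_k=2):
    """Beam-style selection: keep the top_k starts, score the ends for each
    kept start, and return the (start, end) pair with the highest total."""
    n = len(start_scores)
    top_starts = sorted(range(n), key=start_scores.__getitem__, reverse=True)[:top_k]
    best = None
    for s in top_starts:
        end_scores = end_scores_given_start(s)  # assumed to depend on the start
        for e in range(s, n):  # only consider spans where end >= start
            total = start_scores[s] + end_scores[e]
            if best is None or total > best[0]:
                best = (total, s, e)
    return best[1], best[2]


# Invented scores for a 4-token passage.
start_scores = [0.1, 0.2, 2.0, 1.8]
end_scores = [0.0, 1.5, 0.3, 1.0]
conditional_ends = {2: [0.0, 0.0, 0.2, 0.5], 3: [0.0, 0.0, 0.0, 1.5]}

print(independent_span(start_scores, end_scores))       # (2, 1): end precedes start
print(beam_span(start_scores, conditional_ends.get))    # (3, 3)
```

With these toy numbers, the independent choice yields an end index before the start index, which then needs extra post-processing, whereas the joint search scores whole spans and returns a valid one. This is part of why the BERT-style prediction code in the reader cannot simply be reused.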