neuralqa
Revise Scoring and Answer Span Selection Method
- [ ] Currently, the score is the sum of the start and end token probabilities. This might not be optimal.
- [ ] Currently, there is no text token preprocessing (e.g. stripping spaces, removing `\n`, etc.), which might introduce unexpected behaviour.
- [ ] Currently, the answer span is selected using the highest-probability start and end tokens. There is an opportunity to do better.
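
As a starting point for discussion, here is a minimal sketch of one possible alternative, not the current neuralqa implementation. It assumes the model output has already been turned into per-token start and end probabilities. The names `clean_text`, `best_span`, and `max_answer_len` are hypothetical, and scoring a span by the product of its start and end probabilities is just one option to compare against the current sum.

```python
import re
from typing import Sequence, Tuple


def clean_text(text: str) -> str:
    """Collapse runs of whitespace (spaces, tabs, newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def best_span(
    start_probs: Sequence[float],
    end_probs: Sequence[float],
    max_answer_len: int = 15,
) -> Tuple[int, int, float]:
    """Return (start, end, score) for the best valid span.

    A span is valid when start <= end and it covers at most
    max_answer_len tokens. The score is the product of the start and
    end probabilities (an assumption made for this sketch).
    Returns (0, 0, -1.0) if the inputs are empty.
    """
    best = (0, 0, -1.0)
    for s, p_start in enumerate(start_probs):
        # Only consider end positions at or after the start, within the length cap.
        last = min(s + max_answer_len, len(end_probs))
        for e in range(s, last):
            score = p_start * end_probs[e]
            if score > best[2]:
                best = (s, e, score)
    return best


# Taking each argmax independently would pick start=1 and end=0 here,
# which is not a valid span. The joint search avoids that.
start_probs = [0.1, 0.7, 0.2]
end_probs = [0.6, 0.1, 0.3]
start, end, score = best_span(start_probs, end_probs)
print(start, end)

print(clean_text("  New\nYork \t City "))
```

Searching over start/end pairs jointly guarantees that the end never precedes the start and that the answer length stays bounded, which picking the two maxima independently does not.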
Resources
- https://github.com/huggingface/transformers/blob/master/src/transformers/pipelines.py#L1395