grobid-quantities
Switch from paragraph-based to sentence-based processing
This is a follow-up to #115.
To change from paragraph-based to sentence-based processing, we would need the following:
- implement the segmentation in the quantityparser, to reduce the impact of reconstructing composed entities (e.g. intervals, see #87) over large spans of text
- implement the segmentation in the PDF processing, including collecting the references and using them to improve the sentence segmentation output
- convert the training data to sentence-based. This should improve the results with DL models, as the sequences will be shorter and more likely to fall within the maximum length limits
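
To make the last two points more concrete, here is a minimal Python sketch of the idea, not the project's implementation. It assumes annotations are stored as character offsets over the paragraph text, and it uses a naive regex splitter as a stand-in for whatever sentence segmenter ends up being used. The names `split_sentences` and `to_sentence_examples` are hypothetical. Reference and entity spans are treated as protected regions, so a boundary candidate such as the period in "et al." is ignored when it falls inside one.

```python
import re
from typing import List, Sequence, Tuple

Span = Tuple[int, int]          # (start, end) character offsets, end exclusive
Entity = Tuple[int, int, str]   # (start, end, label), hypothetical format

# Naive boundary candidate: sentence-ending punctuation followed by whitespace.
# This is an assumption for illustration, not the real segmenter.
_BOUNDARY = re.compile(r"[.!?]\s+")


def split_sentences(text: str, protected: Sequence[Span] = ()) -> List[Span]:
    """Return sentence spans, never cutting inside a protected span."""
    sentences: List[Span] = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        cut = match.start() + 1  # offset just after the punctuation mark
        # Skip candidates that would split a reference or an entity.
        if any(s < cut < e for s, e in protected):
            continue
        sentences.append((start, cut))
        start = match.end()
    if start < len(text):
        sentences.append((start, len(text)))  # trailing sentence
    return sentences


def to_sentence_examples(
    text: str,
    entities: Sequence[Entity],
    references: Sequence[Span] = (),
) -> List[Tuple[str, List[Entity]]]:
    """Convert one paragraph-level example into sentence-level examples.

    Entity offsets are re-based so they are relative to their sentence.
    """
    protected = list(references) + [(s, e) for s, e, _ in entities]
    examples = []
    for s_start, s_end in split_sentences(text, protected):
        local = [
            (s - s_start, e - s_start, label)
            for s, e, label in entities
            if s >= s_start and e <= s_end
        ]
        examples.append((text[s_start:s_end], local))
    return examples
```

Because entity spans are protected as well, an annotation can never straddle two sentences in the output, which keeps the converted training data consistent with the original offsets.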