grobid-quantities

Switch from paragraph based to sentence based

Open lfoppiano opened this issue 2 years ago • 0 comments

This is a follow-up to #115.

To change from paragraph-based to sentence-based processing, we would need the following:

  • implement the segmentation in the quantityparser, to reduce the impact of reconstructing composed entities (e.g. intervals, see #87) on large spans of text
  • implement the segmentation in the PDF processing, including collecting the references and using them to improve the sentence segmentation output
  • convert the training data to sentence-based. This will improve the results with deep learning (DL) models, as the sequences will be shorter and more likely to fit within the maximum length limits
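
To make the first two points more concrete, here is a minimal sketch in Python (not the project's actual Java code). It assumes a naive punctuation-based splitter and treats reference markers as "protected" spans that must not be cut. The names `split_sentences`, `parse_by_sentence` and the toy `parse` callable are hypothetical and only illustrate the idea of parsing each sentence separately and mapping offsets back to the paragraph.

```python
import re
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive

# Naive boundary (an assumption): sentence-ending punctuation followed by whitespace.
_BOUNDARY = re.compile(r"[.!?]\s+")


def split_sentences(text: str, protected: Sequence[Span] = ()) -> List[Span]:
    """Return sentence offsets, never cutting inside a protected span
    (e.g. a reference marker such as "(Smith et al. 2020)")."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        cut = match.start() + 1  # keep the punctuation inside the sentence
        if any(s < cut < e for s, e in protected):
            continue  # boundary falls inside a reference: ignore it
        sentences.append((start, cut))
        start = match.end()
    if start < len(text):
        sentences.append((start, len(text)))  # trailing sentence
    return sentences


def parse_by_sentence(
    text: str,
    parse: Callable[[str], List[Span]],  # hypothetical per-sentence parser
    protected: Sequence[Span] = (),
) -> List[Span]:
    """Run `parse` on each sentence and shift its offsets back to `text`."""
    results = []
    for start, end in split_sentences(text, protected):
        for s, e in parse(text[start:end]):
            results.append((start + s, start + e))
    return results
```

With this shape, any step that combines entities (such as building intervals) only ever sees one sentence at a time, and the reference spans collected during PDF processing can be passed in as `protected` to avoid false sentence breaks such as the one after "et al.".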

lfoppiano, May 24 '22 23:05