Fix heading annotation in fulltext evaluation and add header levels

Open Schroedi opened this issue 10 months ago • 3 comments

<head>Methods<lb/> Preparation<lb/></head> are actually two headings in the original paper.
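To illustrate the kind of annotation error described above, here is a minimal sketch (not part of GROBID) that flags `<head>` elements containing more than one `<lb/>` as candidates for merged headings, and shows what splitting one would look like. The function names `find_multiline_heads` and `split_head` are hypothetical, and the regex-based parsing is an assumption made for brevity. A flagged element may also be a single heading that legitimately wraps over two lines, so each hit still needs a manual check against the PDF.

```python
import re

# Matches a <head> element (with optional attributes) and captures its content.
HEAD_RE = re.compile(r"<head\b[^>]*>(.*?)</head>", re.DOTALL)
# Matches a line-break marker such as <lb/> or <lb />.
LB_RE = re.compile(r"<lb\s*/>")


def find_multiline_heads(tei_text):
    """Return the inner content of every <head> holding more than one <lb/>."""
    return [
        m.group(1)
        for m in HEAD_RE.finditer(tei_text)
        if len(LB_RE.findall(m.group(1))) > 1
    ]


def split_head(inner):
    """Rewrite the content of one <head> as one <head> per line."""
    parts = [p.strip() for p in LB_RE.split(inner) if p.strip()]
    return " ".join(f"<head>{p}<lb/></head>" for p in parts)


sample = "<head>Methods<lb/> Preparation<lb/></head>"
candidates = find_multiline_heads(sample)
fixed = [split_head(c) for c in candidates]
```

Applying `split_head` blindly would be wrong for wrapped headings, so the sketch is best used only to list candidates for review.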

Schroedi, Apr 25 '24 19:04

@Schroedi Great.

Did you also check the segmentation model training output files? If they were perfectly fine, we should not include them, but if they had to be corrected, we should add them as well.

See comment here: https://github.com/kermitt2/grobid/issues/1067#issuecomment-1888503015

lfoppiano, Apr 26 '24 00:04

The segmentation looked fine to me.

Could you share a folder with the PDFs that the (fulltext) training data come from? I am currently fetching them one by one.

Schroedi, Apr 29 '24 14:04

Hi @Schroedi, please send me an email (https://grobid.readthedocs.io/en/latest/Introduction/#credits) and I will reply with the information for accessing the PDF repository used for the training data.

kermitt2, Apr 30 '24 18:04