grobid
Add training data for one special error case
This PR provides training data to improve recognition for the error case in #632. I've also added a couple of sentences to the guidelines about author contributions, as discussed.
Where I was not sure, I commented directly in the XML files.
Hello Luca! Would it be possible to send me the PDF for these training cases?
This is just one document. The PDF was linked in the issue: https://www.nature.com/articles/s41598-020-58065-9.pdf
The raw header file needs to be regenerated after retraining the segmentation model, because some content has been added to the segmentation file.
I think it would be better to roll back the changes to the guidelines here, because they will conflict with more recent changes made elsewhere, and they are redundant.
It might be appropriate to merge this PR into the branch feature/data-availability-statement first, and not into master, because the training-related changes spread across different branches make things difficult to follow and to update consistently (the raw files and the model in particular).
Thanks for the review! I'll close this and add these files (without the guidelines changes) directly to feature/data-availability-statement.
Closing because this has been moved into feature/data-availability-statement.