grobid
Possibility of only annotating needed parts
Hi, I'm wondering if there's a way to change GROBID to only give labels to the parts we specify (i.e. abstract, title) and change the ground truth of the training data to only include those specified labels? Do you think this is doable and will that negatively affect the training result?
Hello @m485liuw !
My experience so far is that it would negatively impact the accuracy of the remaining labels:
https://github.com/kermitt2/grobid/issues/777#issuecomment-870170270
I actually introduced extra labels in Grobid solely to improve the core ones, and did the same in other sequence labelling projects.
What is your motivation for doing this?
Hi, thanks for the reply. The motivation is that we only care about improving some of the labels, so we don't want to waste time annotating the others. But yes, as you said, you introduced the other labels just to improve the core ones, so I guess this is the best way.
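For anyone landing here later, this is a rough sketch of what "only keeping the labels we specify" would mean for sequence-labelling training data. It is not GROBID code and does not use GROBID's training format: the label names, the `OTHER` catch-all, and the `restrict_labels` helper are all hypothetical, chosen only to illustrate the idea. Per the reply above, collapsing labels like this is expected to hurt accuracy on the labels that remain.

```python
from typing import List, Set, Tuple

# Hypothetical catch-all label; this is an assumption, not a GROBID label name.
OTHER = "other"

def restrict_labels(
    sequence: List[Tuple[str, str]], keep: Set[str]
) -> List[Tuple[str, str]]:
    """Collapse every label not in `keep` into the OTHER label."""
    return [(token, label if label in keep else OTHER) for token, label in sequence]

# Toy (token, label) sequence; the labels are illustrative only.
example = [
    ("Deep", "title"),
    ("Parsing", "title"),
    ("Jane", "author"),
    ("Doe", "author"),
    ("We", "abstract"),
    ("propose", "abstract"),
]

restricted = restrict_labels(example, keep={"title", "abstract"})
for token, label in restricted:
    print(f"{token}\t{label}")
```

After this step the model no longer sees the `author` boundary between the title and the abstract, which may be one reason the extra labels help the core ones.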