Transformers-Tutorials
How to filter only relevant labels in LayoutLMv2
Hi, first of all, thanks a lot for your work. I followed your tutorial on data prep for CORD and fine-tuned on my custom dataset, which is pretty similar. I only had annotations for the required labels and none for the "Other" tokens. Now, when I run true inference with my trained model, every piece of text is assigned one label or another, when I should only get predictions for my 5 labels. Is there any way to filter predictions by probability or something similar? Or do I need to annotate the non-relevant tokens as "other" and retrain? If so, can I somehow automate the generation of these "other" annotations?
Hi. As far as I can see, there is no "Other" label in the CORD dataset.
"other" only appears in the iob_to_label function:
def iob_to_label(label):
    label = label[2:]
    if not label:
        return 'other'
    return label
Update: "other" is the label given to tokens for which the model did not predict any entity. You can get rid of them by modifying the logic in the function above (or in the code that calls it), or you can train your model with more samples or more detailed labeling. Either way, some "other" (or unpredicted) tokens will remain in the output for as long as there is a chance of a failed prediction.
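For what it's worth, here is a rough sketch of both ideas from this thread: dropping tokens that map to "other" and dropping low-confidence predictions. It is plain Python on toy data, not code from the tutorial. `filter_predictions`, the `min_confidence` value of 0.9, and the toy labels and logits are all made up for illustration; in practice the logits would come from the model output. Note that a model trained without any "O" examples can still be very confident on irrelevant tokens, so the threshold is only a heuristic.

```python
import math

def iob_to_label(label):
    """Strip the IOB prefix ('B-' / 'I-'); a bare 'O' becomes 'other'."""
    label = label[2:]
    return label if label else "other"

def softmax(logits):
    """Turn raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def filter_predictions(words, logits_per_word, id2label, min_confidence=0.9):
    """Keep (word, label, confidence) only for confident, non-'other' predictions.

    Hypothetical helper: min_confidence is an assumed threshold that
    would need tuning on a validation set.
    """
    kept = []
    for word, logits in zip(words, logits_per_word):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        label = iob_to_label(id2label[best])
        if label != "other" and probs[best] >= min_confidence:
            kept.append((word, label, probs[best]))
    return kept

# Toy data (assumed label set and logits, for illustration only)
id2label = {0: "O", 1: "B-TOTAL", 2: "B-DATE"}
words = ["Total", "12.50", "thanks"]
logits = [[4.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.5, 0.6, 0.4]]

result = filter_predictions(words, logits, id2label)
for word, label, confidence in result:
    print(word, label, round(confidence, 3))
```

Here "Total" is dropped because it maps to "other", and "thanks" is dropped because its best probability is well below the threshold, so only "12.50" survives. If the model has no "O" class at all, only the confidence check applies.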
Hi @Abhishekvats1997, did you solve the issue?