unilm
In XFUND dataset, why "B-QUESTION", "B-ANSWER", "B-HEADER", "I-ANSWER", "I-QUESTION", "I-HEADER"?
Describe Model I am using (UniLM, MiniLM, LayoutLM ...):
In the XFUND dataset, there are only 4 classes (QUESTION, ANSWER, HEADER, OTHER), but in
https://github.com/microsoft/unilm/blob/42100e11bdd3ac8e9ca2e9b506af8c9231a0c6d6/layoutlmft/layoutlmft/data/datasets/xfun.py#L48
there are 7 classes.
I don't understand why there are 7 classes instead of 4. Kindly help.
@Dod-o do you have any answer to the above question? Kindly let me know.
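@ChidanandKumarVimaan, this is the 'BIO' tagging scheme (used for token classification or NER tasks). Each of the three entity classes (QUESTION, ANSWER, HEADER) gets a "Begin" (B-) tag and an "Inside" (I-) tag, and OTHER maps to the single "O" (Outside) tag. So 3 × 2 + 1 = 7 classes in total.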
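To make the counting concrete, here is a minimal sketch of how a BIO label set can be derived from the 4 dataset classes. The helper names `build_bio_labels` and `tag_entity` are hypothetical and written only for illustration; they are not taken from `xfun.py`, and the label order in the actual file may differ.

```python
# Entity classes that get B-/I- tags; OTHER is represented by the single "O" tag.
ENTITY_CLASSES = ["QUESTION", "ANSWER", "HEADER"]


def build_bio_labels(entity_classes):
    """Return the BIO label set: one "O" plus a B- and an I- tag per class."""
    labels = ["O"]
    labels += [f"B-{name}" for name in entity_classes]
    labels += [f"I-{name}" for name in entity_classes]
    return labels


def tag_entity(tokens, entity_class):
    """Assign BIO tags to the tokens of one entity (hypothetical helper)."""
    if not tokens:
        return []
    if entity_class == "OTHER":
        # Tokens outside any entity of interest all share the "O" tag.
        return ["O"] * len(tokens)
    # First token begins the entity, the remaining tokens are inside it.
    return [f"B-{entity_class}"] + [f"I-{entity_class}"] * (len(tokens) - 1)


labels = build_bio_labels(ENTITY_CLASSES)
print(len(labels))  # 7
print(tag_entity(["Date", "of", "birth"], "QUESTION"))
# ['B-QUESTION', 'I-QUESTION', 'I-QUESTION']
```

The B-/I- split lets a token classifier tell where one entity ends and an adjacent entity of the same class begins, which a plain 4-class labelling cannot express.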
@abhibisht89 Thanks, got it.