FUDGE
Linking for distant words
Hey @herobd, I was trying to extract the relations for keywords using the pretrained weights and your code.
But for distant boxes it doesn't seem to identify the linking. Is there any way to solve this?
It could either be an artifact of the pretraining data (not having long relationships) or the Swin model having windowed attention. Have you tried fine-tuning on your data?
I couldn't figure out what the structure of my dataset should look like. Can you please help me with that? @herobd
Sorry, ignore my prior response about the Swin model attention (I thought this was an issue on Dessurt). The graph should have links that far across, but it's failing to merge the two parts of the key together (e.g. "9 Add lines..." and "9"). Fine-tuning is probably still a good thing to try.
You have a few choices with the data:
- Make it look like the NAF or FUNSD data and use one of those dataset loaders
- Write your own child class of datasets/graph_pair.py. This is mostly writing the parseAnn function.
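For the first option, a minimal sketch of turning your own annotations into FUNSD-style JSON might look like the following. The output layout (a top-level "form" list whose entries carry "id", "text", "box", "label", "words" and "linking") follows the public FUNSD annotation files. The function name to_funsd_form and the input layout (entities plus index pairs) are hypothetical, and falling back to one word per entity when no word boxes exist is an assumption that may not suit every loader.

```python
import json


def to_funsd_form(entities, links):
    """Build a FUNSD-style annotation dict (hypothetical helper, not part of FUDGE).

    entities: list of dicts with 'text', 'box' ([x1, y1, x2, y2]), 'label'
              ('question', 'answer', 'header' or 'other') and optional 'words'.
    links:    list of (key_index, value_index) pairs into entities.
    """
    form = []
    for idx, ent in enumerate(entities):
        # Assumption: with no word-level boxes, treat the whole entity as one word.
        words = ent.get("words") or [{"text": ent["text"], "box": list(ent["box"])}]
        form.append({
            "id": idx,
            "text": ent["text"],
            "box": list(ent["box"]),
            "label": ent["label"],
            "words": words,
            "linking": [],
        })
    # FUNSD records each link on both of the entities it connects.
    for a, b in links:
        form[a]["linking"].append([a, b])
        form[b]["linking"].append([a, b])
    return {"form": form}


example = to_funsd_form(
    [
        {"text": "Name:", "box": [10, 10, 60, 25], "label": "question"},
        {"text": "Jane Doe", "box": [400, 10, 480, 25], "label": "answer"},
    ],
    [(0, 1)],
)
print(json.dumps(example, indent=2))
```

Each document would get one such JSON file next to its image, mirroring how the FUNSD archive pairs annotations with images.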
Do you have annotations for your data?
Yes, I have the annotations for my dataset, but I'm unable to fine-tune the model on it.
I even tried to fine-tune it on the FUNSD dataset by extracting the FUNSD.zip provided via the README link and placing it inside the “data” folder.
But it throws an error saying that the value of num_classes should be greater than 0 but is currently 0.
Does the dataset structure need to be modified before training, or can we use FUNSD directly with train.py?
What config are you using? And what is the exact error? (line number)
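One thing that may be worth ruling out in the meantime (a guess, not something confirmed in this thread) is that no annotation files are being found under the data folder at all. The sketch below is a standalone sanity check and does not use any FUDGE code. It walks a directory, counts the JSON files that look like FUNSD annotations (a top-level "form" list), and tallies the entity labels. The function name summarize_annotations is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_annotations(root):
    """Count FUNSD-style JSON files and entity labels found under root."""
    n_files = 0
    labels = Counter()
    for path in sorted(Path(root).rglob("*.json")):
        try:
            with open(path, encoding="utf-8") as f:
                data = json.load(f)
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed files
        # Only count files that have the FUNSD top-level "form" list.
        if not isinstance(data, dict) or not isinstance(data.get("form"), list):
            continue
        n_files += 1
        for ent in data["form"]:
            if isinstance(ent, dict):
                labels[ent.get("label", "<missing>")] += 1
    return n_files, labels
```

Calling summarize_annotations("data") from the repository root and printing the result shows whether any annotation files and labels are visible under that folder. A file count of zero would point at the extraction path rather than the training code.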