Representation-Learning-for-Information-Extraction
PyTorch implementation of the Google Research paper "Representation Learning for Information Extraction from Form-like Documents".
The linear projection after the self-attention: `bs = self_attention.size(0)` `self_attention = self_attention.view(bs, -1)` `linear_proj = F.relu(self.linear_projection(self_attention))` The paper says: "We project the self-attended neighbor encodings to a...
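For readers following the issue above, here is a minimal sketch of what the quoted three lines do to tensor shapes. The module name `NeighborProjection` and the sizes (`num_neighbors=10`, `embed_dim=16`, `out_dim=32`) are hypothetical and chosen only for illustration; they are not taken from the repository or the paper. The point is that `view(bs, -1)` flattens all neighbor encodings into one vector per example before the linear layer is applied.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NeighborProjection(nn.Module):
    """Hypothetical wrapper around the flatten + project + ReLU step."""

    def __init__(self, num_neighbors, embed_dim, out_dim):
        super().__init__()
        # The linear layer sees every neighbor encoding concatenated together.
        self.linear_projection = nn.Linear(num_neighbors * embed_dim, out_dim)

    def forward(self, self_attention):
        # self_attention: (batch, num_neighbors, embed_dim)
        bs = self_attention.size(0)
        # reshape behaves like view here but also accepts non-contiguous input
        flat = self_attention.reshape(bs, -1)
        return F.relu(self.linear_projection(flat))


torch.manual_seed(0)
layer = NeighborProjection(num_neighbors=10, embed_dim=16, out_dim=32)
x = torch.randn(4, 10, 16)  # stand-in for the self-attended neighbor encodings
out = layer(x)
print(x.shape, "->", out.shape)
```

Because the flatten happens first, the projection is a single layer over the whole neighborhood rather than one applied to each neighbor separately, which appears to be what the issue is asking about.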
Have you experimented with altering the candidate selection process? I am interested in what occurs when the candidate selection process is simplified or removed entirely so that every possible candidate...
Saving candidates (fix #26). Please review @CS-savvy
Please review for relevance
Why are no files found in the candidates directory when training starts?
get_tesseract_results needs a path
The whole system can be dockerized for an easier setup procedure for training or inference.
[image] My candidate recall is already as high as 0.95, but as you can see during training, the recall on the validation set is very low.
@Praneet9 The NER used to extract address candidates has accuracy issues and has difficulty separating multiple addresses. Do you know any way to train a model, for example BERT...
I have noticed that in train.py and eval.py you import a FocalLoss class, which is then used as the criterion for both training and evaluation. But I couldn't find relevant...
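As background for the question above, the standard binary focal loss (Lin et al., 2017) is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The sketch below is a plain-Python illustration of that formula for a single prediction. It is an assumption that the repository's `FocalLoss` class follows this form; the function name `binary_focal_loss` and the default values of `gamma` and `alpha` are illustrative, not taken from the repository.

```python
import math


def binary_focal_loss(p, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one predicted probability p of the positive class.

    target is 1 for a positive example and 0 for a negative one.
    """
    # Clamp to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigned to the true class.
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct
hard = binary_focal_loss(0.10, 1)  # confident and wrong
print(f"easy example loss: {easy:.6f}")
print(f"hard example loss: {hard:.6f}")
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross-entropy on a positive example, which is a quick way to sanity-check an implementation.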