Representation-Learning-for-Information-Extraction

PyTorch implementation of the Google Research paper "Representation Learning for Information Extraction from Form-like Documents".

Results 12 Representation-Learning-for-Information-Extraction issues

The linear projection after the self attention: `bs = self_attention.size(0)` `self_attention = self_attention.view(bs, -1)` `linear_proj = F.relu(self.linear_projection(self_attention))` From the paper, they said "We project the self-attended neighbor encodings to a...

Have you experimented with altering the candidate selection process? I am interested in what occurs when the candidate selection process is simplified or removed entirely so that every possible candidate...

Saving candidates (fix #26). Please review, @CS-savvy.

Please review for relevance

Why are files not found in the candidates directory when training starts?

get_tesseract_results needs a path

The whole system can be dockerized to simplify setup for training or inference.

deployment

![image](https://github.com/Praneet9/Representation-Learning-for-Information-Extraction/assets/39401819/9172276a-7a5b-4edf-9470-fbc7a5d0c3e3) My candidate recall is already as high as 0.95, but as you can see during training, the recall on the validation set is very low.
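The gap described in this issue is easier to reason about when the two metrics are computed side by side. The following is a minimal sketch, not the repository's evaluation code: the function names `candidate_recall` and `prediction_recall`, the per-document definitions, and the sample data are all assumptions made for illustration. It shows that a high candidate recall only bounds the final recall from above; the scoring model can still pick the wrong candidate.

```python
def candidate_recall(gold_values, candidates_per_doc):
    """Fraction of documents whose gold value appears among the generated candidates.

    Hypothetical definition for illustration; the repository may compute it differently.
    """
    hits = sum(1 for gold, cands in zip(gold_values, candidates_per_doc) if gold in cands)
    return hits / len(gold_values) if gold_values else 0.0


def prediction_recall(gold_values, predictions):
    """Fraction of documents where the top-scored candidate equals the gold value."""
    hits = sum(1 for gold, pred in zip(gold_values, predictions) if gold == pred)
    return hits / len(gold_values) if gold_values else 0.0


# Made-up data: four documents with one gold date each.
gold = ["2020-01-05", "2021-03-14", "2019-11-30", "2022-07-01"]
candidates = [
    ["2020-01-05", "2020-02-01"],
    ["2021-03-14"],
    ["2019-11-30", "2019-12-15"],
    ["2022-06-01"],  # gold value missed by candidate generation
]
# Made-up model output: the candidate the scorer ranked first in each document.
predictions = ["2020-02-01", "2021-03-14", "2019-12-15", "2022-06-01"]

print(candidate_recall(gold, candidates))    # 0.75
print(prediction_recall(gold, predictions))  # 0.25
```

In this toy case the candidate generator finds the gold value in three of four documents, but the scorer selects it only once, so the low number comes from the scoring step rather than from candidate generation.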

@Praneet9 The NER used to extract address candidates has accuracy issues and struggles to separate multiple addresses. Do you know any way to train a model, for example BERT...

I have noticed that train.py and eval.py import a FocalLoss class, which is then used as the criterion for both training and evaluation. But I couldn't find relevant...
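For readers unfamiliar with the loss this issue refers to, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), written in plain Python. It is not the repository's `FocalLoss` class, whose definition the issue says could not be located; the function names, the defaults `alpha=0.25` and `gamma=2.0`, and the clamping constant `eps` are assumptions chosen for illustration.

```python
import math


def binary_focal_loss(prob, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for a single binary prediction (illustrative sketch).

    prob:   predicted probability of the positive class, in [0, 1]
    target: ground-truth label, 0 or 1
    """
    # Clamp to avoid log(0).
    prob = min(max(prob, eps), 1.0 - eps)
    # p_t is the probability the model assigned to the true class.
    p_t = prob if target == 1 else 1.0 - prob
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Average focal loss over a batch given as plain Python sequences."""
    losses = [binary_focal_loss(p, t, alpha, gamma) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


# A confident correct prediction contributes far less than a confident wrong one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy < hard)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted binary cross-entropy, which is a quick sanity check for any implementation.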