Representation-Learning-for-Information-Extraction issues

About the dimension projection

1

The linear projection after the self-attention: `bs = self_attention.size(0)` `self_attention = self_attention.view(bs, -1)` `linear_proj = F.relu(self.linear_projection(self_attention))` The paper says: "We project the self-attended neighbor encodings to a...

shaonanqinghuaizongshishi

Candidate Selection

6

Have you experimented with altering the candidate selection process? I am interested in what occurs when the candidate selection process is simplified or removed entirely so that every possible candidate...

bradfox2

Update generate_tesseract_results.py

Saving candidates (fix #26). Please review @CS-savvy

darsh169

Create candidates_to_json.py

1

Please review for relevance

darsh169

Candidates directory

5

Why are no files found in the candidates directory when training starts?

darsh169

Update inference.py

get_tesseract_results needs a path

darsh169

Create docker file for easier setup/deployment

2

The whole system can be dockerized to simplify setup for training or inference.

Praneet9

deployment

Why is the recall very low but the precision high during training?

![image](https://github.com/Praneet9/Representation-Learning-for-Information-Extraction/assets/39401819/9172276a-7a5b-4edf-9470-fbc7a5d0c3e3) My candidate recall is already as high as 0.95, but as you can see during training, the recall on the validation set is very low.

reBiocoder

how to work on address candidates?

2

@Praneet9 The NER used to extract address candidates has accuracy issues and has difficulty separating multiple addresses. Do you know any way to train a model, for example bert...

panwar2001

Provide loss function (FocalLoss)

1

I noticed that train.py and eval.py import a FocalLoss class, which is then used as the criterion for both training and evaluation. But I couldn't find relevant...

hienpham15
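
For readers unfamiliar with the loss named in this issue, here is a minimal sketch of the standard binary focal loss formulation, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the repository's FocalLoss class, which is not shown in this listing. The function names `binary_focal_loss` and `mean_focal_loss` are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are the commonly cited ones, assumed here rather than taken from train.py or eval.py.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one example (hypothetical helper, not the repo's class).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 0 or 1
    """
    # Clamp to avoid log(0) for saturated predictions.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch given as plain Python sequences."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    total = sum(
        binary_focal_loss(p, y, gamma=gamma, alpha=alpha)
        for p, y in zip(probs, labels)
    )
    return total / len(probs)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the usual binary cross-entropy, which is a convenient sanity check. A tensor-based version for use as a training criterion would apply the same formula element-wise.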
