Representation-Learning-for-Information-Extraction
Candidate Selection
Have you experimented with altering the candidate selection process?
I am interested in what happens when the candidate selection process is simplified or removed entirely, so that every possible text span is evaluated as a candidate.
I didn't try that, as removing it would produce a lot of meaningless negative candidates. For example, for a date field, a floating-point number or random text doesn't make sense as a candidate.
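For anyone curious, here is a minimal sketch of what type-restricted candidate generation looks like in spirit. The regex patterns, names, and tokens below are illustrative assumptions, not the repo's actual generators; the point is that each field type only admits spans that could plausibly be a value of that type, which is why dropping this step floods training with meaningless negatives.

```python
import re

# Hypothetical, type-restricted candidate generators (not the repo's code).
CANDIDATE_PATTERNS = {
    "date": re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}"),
    "amount": re.compile(r"\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
}

def generate_candidates(tokens, field_type):
    """Return only the tokens that fully match the pattern for `field_type`."""
    pattern = CANDIDATE_PATTERNS[field_type]
    return [t for t in tokens if pattern.fullmatch(t)]

tokens = ["Invoice", "12/05/2021", "Total", "1,299.00", "Notes"]
print(generate_candidates(tokens, "date"))    # ['12/05/2021']
print(generate_candidates(tokens, "amount"))  # ['1,299.00']
```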
@Praneet9 I have trained this model on a structured-documents dataset. During training, the validation loss and accuracy were 0.0006 and 0.98421 respectively, but when I test it on new documents the results are very poor. The trained model is not able to predict keys for which there are multiple candidates.
I am attaching a snapshot of the documents' amounts information, in which many keys share the same candidate, so the model is not able to distinguish the keys.
We have to extract all the key-value pairs present in the snapshot.
Just to confirm one thing: is there any constraint that a single candidate text can't be part of multiple keys?
Could you please suggest how we can solve this issue?
@Neelesh1121 Can you please explain what you mean by
We have to extract all the key-value pairs present in the snapshot.
There's no rule like that. In the above case, what are you trying to extract for the amount key?
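For context, the scorer in this approach treats each (field, candidate) pair independently, so nothing stops the same text span from being the top-scoring candidate for several fields at once. A toy sketch of that selection logic follows; the scores, field names, and values are made up purely for illustration:

```python
# Illustrative stand-in for the model's per-(field, candidate) scores.
# Each field picks its own argmax, so one candidate can win multiple fields.
scores = {
    "total_amount": {"1,299.00": 0.91, "129.00": 0.40},
    "amount_due":   {"1,299.00": 0.88, "129.00": 0.35},
}

extracted = {field: max(cands, key=cands.get) for field, cands in scores.items()}
print(extracted)  # {'total_amount': '1,299.00', 'amount_due': '1,299.00'}
```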
@Praneet9 Hi, I am really confused about the positional embeddings. Do we have to collect the relative positions of neighbours for every invoice in the training set and then train the model to generate an embedding for each 2-D coordinate, e.g. embedding([3, 4]) = [2.34, 3.43, 2.34, ...]? Or is it done by training one invoice at a time, in which case the embedding generated for a particular relative position would be different each time we train on a new invoice?
@panwar2001 Can you elaborate with an example of what issue you are facing?
@panwar2001 I think you are confusing that projection with the neighbour encoding. The neighbour encodings are projected to 4 * 2d and then max-pooled, as you can see here. These tensors are then concatenated with the candidate embeddings, which are then projected back down. Hope that clears up the confusion.
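To make the positional-embedding point concrete: the relative-position projection is a single learned layer whose weights are shared across all invoices, so a given relative position always maps to the same embedding once training is done; nothing is re-learned per invoice. Below is a minimal sketch of the path described above, under assumed dimensions (layer sizes, names, and shapes are illustrative, not copied from the repo):

```python
import tensorflow as tf

d = 64               # embedding size (assumption for illustration)
num_neighbours = 10  # neighbours per candidate (assumption)

# One shared dense layer maps a relative (x, y) offset to a positional
# embedding. Shared weights mean the same relative position always yields
# the same embedding, regardless of which invoice it came from.
pos_embedder = tf.keras.layers.Dense(d, activation="relu")

rel_positions = tf.random.uniform((1, num_neighbours, 2))  # (batch, N, 2)
neigh_words = tf.random.uniform((1, num_neighbours, d))    # word embeddings

# Neighbour encoding = word embedding concatenated with positional embedding.
neigh_enc = tf.concat([neigh_words, pos_embedder(rel_positions)], axis=-1)  # (1, N, 2d)

# Project to 4 * 2d, then max-pool over the neighbour axis.
proj = tf.keras.layers.Dense(4 * 2 * d)(neigh_enc)  # (1, N, 8d)
pooled = tf.reduce_max(proj, axis=1)                # (1, 8d)

# Concatenate with the candidate's own embedding and project back down.
cand_emb = tf.random.uniform((1, d))
cand_enc = tf.keras.layers.Dense(d)(tf.concat([pooled, cand_emb], axis=-1))
print(cand_enc.shape)  # (1, 64)
```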