PICK-pytorch

The model may overfit the text sorting method, resulting in ineffective use of geometric and morphological information

Open phybrain opened this issue 4 years ago • 3 comments

phybrain, Oct 12 '20 09:10

Hi @phybrain, could you please give us more details about your thinking?

tengerye, Oct 12 '20 10:10

Hi, thank you for your insight and for raising this issue.

We ran experiments on unsorted text, and the performance did not drop.

The main purpose of the text sorting method is to prevent the truncation operation (MAX_BOXES_NUM) from discarding useful information in cases where the entities we are interested in lie in the top-left area of the document.

wenwenyu, Oct 23 '20 04:10
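
For illustration, the interaction between sorting and truncation described in the reply above can be sketched as follows. This is a minimal sketch and not the PICK-pytorch implementation: the `Box` layout, the `sort_boxes` and `truncate` helpers, and the value given to `MAX_BOXES_NUM` here are all assumptions made for the example. It only shows why ordering boxes from the top-left before cutting the list keeps top-left content.

```python
from typing import List, Tuple

# Hypothetical simplified box: (x_min, y_min, text). Not the project's data format.
Box = Tuple[float, float, str]

# Illustrative value only; the real limit is configured in the project.
MAX_BOXES_NUM = 3


def sort_boxes(boxes: List[Box]) -> List[Box]:
    """Order boxes top-to-bottom, then left-to-right (assumed reading order)."""
    return sorted(boxes, key=lambda b: (b[1], b[0]))


def truncate(boxes: List[Box], limit: int = MAX_BOXES_NUM) -> List[Box]:
    """Keep only the first `limit` boxes, dropping the rest."""
    return boxes[:limit]


# Boxes in arbitrary (e.g. OCR output) order.
boxes = [
    (300.0, 500.0, "footer note"),
    (10.0, 10.0, "Invoice No. 123"),
    (250.0, 480.0, "page 1"),
    (10.0, 40.0, "Date: 2020-10-12"),
    (200.0, 10.0, "ACME Corp"),
]

kept_unsorted = truncate(boxes)
kept_sorted = truncate(sort_boxes(boxes))

print([b[2] for b in kept_unsorted])
print([b[2] for b in kept_sorted])
```

Without sorting, truncation keeps whichever boxes happen to come first and can drop the top-left entries; with sorting, the top-left entries are the ones that survive.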

Thank you for your reply. The score dropped a little when I trained on unsorted text. However, the actual results are not satisfying and are worse than with sorted text, especially for the digit types. This may be because many of my entity types are digit types, and the digit types are uniformly distributed.

phybrain, Oct 23 '20 08:10