PICK-pytorch sroie results

Hello,

I trained your model on SROIE. During training I got the following:

+---------+----------+----------+----------+----------+
| name    |      mEP |      mER |      mEF |      mEA |
+=========+==========+==========+==========+==========+
| company | 0.887363 | 0.904762 | 0.895978 | 0.904762 |
+---------+----------+----------+----------+----------+
| address | 0.947084 | 0.950163 | 0.948621 | 0.950163 |
+---------+----------+----------+----------+----------+
| total   | 0.804009 | 0.897266 | 0.848081 | 0.897266 |
+---------+----------+----------+----------+----------+
| date    | 0.981878 | 0.996656 | 0.989212 | 0.996656 |
+---------+----------+----------+----------+----------+
| overall | 0.900719 | 0.937126 | 0.918562 | 0.937126 |
+---------+----------+----------+----------+----------+

But when I run it on the test set I get pretty bad results. For example, total is missing a lot. The output looks like this:

company KAISON FURNISHING SDN BHD,company
address L4-17 (B), LEVEL 4,address
address UP2-01, MELAWATI MALL,address
address 355, JALAN BANDAR MELAWATI,address
address PUSAT BANDAR MELAWATI,address
address 53100 KUALA LUMPUR.,address
date 29-01-18
address 2,305.80 SR,other
address 3
total ,33
address 6.00 SR,othe
address 2,197.00 SR,other
address 7,838.80,other
address -7,840.00,other
address 7,395.09,other
address 7,838.80,other

This one is even an example from the training set.

Jan 19 '21 17:01 juvebogdan

There are two problems here. First, you must remove the categories (last column) from the tsv input or PICK will get confused. Second, the SROIE dataset has many transcript errors, and the training and prediction end up very messed up because of them.

Jan 19 '21 17:01 compadrejavo

I had the same problem of not removing the last column. I was getting desperate until I finally realized it. I'm glad I wasn't the only one ... ahahahha

Jan 19 '21 17:01 jorgerodriguezsj

Oh. Thank you. I tried removing it. But if I remove it just from the tsv files, then I get some errors right at the start of training. Do I remove this in the actual tsv files during preprocessing or somewhere else?

Jan 19 '21 21:01 juvebogdan

@juvebogdan Be careful because you have to remove them only from those you use for inference. That is, only those that you pass to the test.py file.
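For the inference files, a minimal sketch of stripping the tag column could look like the following. It assumes each tsv line ends with `,<tag>` and that the tags are the four SROIE entities plus `other`; the `KNOWN_TAGS` set and both function names are made up for illustration and are not part of PICK. Because transcripts can contain commas themselves, it only drops the last field when that field is a known tag, so running it twice should not eat part of a transcript.

```python
from pathlib import Path

# Assumed tag vocabulary (SROIE entities plus "other"); adjust to your data.
KNOWN_TAGS = {"company", "address", "date", "total", "other"}


def strip_tag_column(line: str) -> str:
    """Drop a trailing ',<tag>' field if the last field is a known tag."""
    body = line.rstrip("\r\n")
    head, sep, last = body.rpartition(",")
    if sep and last.strip() in KNOWN_TAGS:
        return head
    return body  # no tag column found, leave the line untouched


def strip_tags_in_dir(src_dir: str, dst_dir: str) -> int:
    """Write tag-free copies of every *.tsv in src_dir into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for tsv in sorted(src.glob("*.tsv")):
        lines = tsv.read_text(encoding="utf-8").splitlines()
        cleaned = [strip_tag_column(line) for line in lines]
        (dst / tsv.name).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
        count += 1
    return count
```

Writing the cleaned copies to a separate folder keeps the tagged originals intact for training. Note that the tag match is case-sensitive, so a transcript whose last comma-separated piece is literally `total` in lowercase would be cut by mistake.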

Jan 19 '21 22:01 jorgerodriguezsj

I understand. Thank you very much

Jan 19 '21 22:01 juvebogdan

I think I need to change the keys.txt file as well. Is this required?

Jan 20 '21 09:01 juvebogdan

No, it is not necessary. Take a look at the arguments that test.py needs:

Checkpoint
Boxes and transcripts (without the tag column) of the images from which you want to get the information
Path of the folder containing the images from which you want to get the information
Path of the folder where you want to save the output results of each image
GPU id to use
Batch size

Therefore you only need the images and the boxes and transcripts (without the tag column).

Jan 20 '21 10:01 jorgerodriguezsj

@juvebogdan May I ask how you got such high numbers? After 100 epochs, I only got these:

+---------+----------+----------+----------+----------+
| name    |      mEP |      mER |      mEF |      mEA |
+=========+==========+==========+==========+==========+
| total   | 0.504762 | 0.550173 | 0.52649  | 0.550173 |
+---------+----------+----------+----------+----------+
| address | 0.60628  | 0.394035 | 0.47764  | 0.394035 |
+---------+----------+----------+----------+----------+
| company | 0.564706 | 0.571429 | 0.568047 | 0.571429 |
+---------+----------+----------+----------+----------+
| date    | 0.877551 | 0.914894 | 0.895833 | 0.914894 |
+---------+----------+----------+----------+----------+
| overall | 0.610822 | 0.509991 | 0.555871 | 0.509991 |
+---------+----------+----------+----------+----------+

Apr 03 '21 03:04 minhhoangbui

I suppose you should try early stopping.
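A generic sketch of the idea, not PICK's own trainer code: keep track of the best validation metric (for example overall mEF) and stop once it has not improved for a number of epochs. The class name `EarlyStopper` and the parameters `patience` and `min_delta` are hypothetical names chosen for this example.

```python
class EarlyStopper:
    """Stop training when a monitored metric stops improving."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum gain that counts as improvement
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric: float) -> bool:
        """Record one epoch's validation metric; return True to stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop you would call `stopper.step(val_mef)` once per epoch after validation and break out when it returns True, keeping the checkpoint from the best epoch rather than the last one.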

May 06 '22 14:05 HoKinChung

I think you ended up with an overfitting problem. How many images did you use for train/test data?

May 06 '22 14:05 ziodos
