PICK-pytorch
Have you tried the FUNSD dataset?
Hi, thanks for the great work and for sharing the code. I'm wondering whether you've tried PICK on the FUNSD dataset. FUNSD is our usual baseline for benchmarking, and another reason I didn't use SROIE is that LayoutLMv2 showed a big jump in performance on FUNSD but not on SROIE. Right now, with all the default settings, I only get 0.60 F1 on FUNSD, so I suspect something is wrong. If you happen to have tested on FUNSD, could you share the F1 so that I know what to aim for?
These are my results on the FUNSD dataset using PICK with the default settings, except that MAX_BOXES_NUM was increased to 220:
+----------+----------+----------+----------+----------+
| name | mEP | mER | mEF | mEA |
+==========+==========+==========+==========+==========+
| answer | 0.68379 | 0.704373 | 0.693929 | 0.704373 |
+----------+----------+----------+----------+----------+
| header | 0.467577 | 0.373297 | 0.415152 | 0.373297 |
+----------+----------+----------+----------+----------+
| question | 0.601961 | 0.646316 | 0.62335 | 0.646316 |
+----------+----------+----------+----------+----------+
| overall | 0.639153 | 0.660393 | 0.6496 | 0.660393 |
+----------+----------+----------+----------+----------+
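As a quick sanity check on the table above, the mEF column appears to be the harmonic mean (F1) of the mEP and mER columns. The sketch below recomputes it from the reported precision and recall; the helper name `f1_score` is just for illustration and is not taken from the PICK code base.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (mEP, mER, mEF) rows copied from the table above.
reported = {
    "answer": (0.68379, 0.704373, 0.693929),
    "header": (0.467577, 0.373297, 0.415152),
    "question": (0.601961, 0.646316, 0.62335),
    "overall": (0.639153, 0.660393, 0.6496),
}

for name, (p, r, f) in reported.items():
    print(f"{name:9s} recomputed F1 = {f1_score(p, r):.6f}, reported mEF = {f}")
```

The recomputed values agree with the reported mEF to within rounding, so the 0.65 overall figure is directly comparable to the 0.60 F1 mentioned in the original post.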
Yes, it seems to me that PICK can perform well on small, fairly image-rich documents but not on text-rich documents. (In reply to Florian Bussmann's results above, posted Sep 8, 2021.)
In my opinion, PICK is not suitable for the FUNSD dataset. PICK is designed to extract key-value pairs in a one-to-one correspondence.
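One way to read the comment above: in a SROIE-style task each key (company, date, address, total) has a single value per document, whereas a FUNSD form contains many entities sharing the same label. The sketch below illustrates that difference. The sample values and the helper `max_values_per_label` are hypothetical and are not taken from either dataset or from PICK.

```python
from collections import Counter

# Hypothetical SROIE-style target: each key occurs once per receipt.
sroie_style = [
    ("company", "ACME STORE"),
    ("date", "2019-01-01"),
    ("address", "1 MAIN ST"),
    ("total", "9.90"),
]

# Hypothetical FUNSD-style target: labels repeat within one form.
funsd_style = [
    ("header", "ORDER FORM"),
    ("question", "Name:"),
    ("answer", "J. Doe"),
    ("question", "Date:"),
    ("answer", "1/1/99"),
]


def max_values_per_label(entities):
    """Largest number of entities that share a single label."""
    return max(Counter(label for label, _ in entities).values())


print(max_values_per_label(sroie_style))  # one value per key
print(max_values_per_label(funsd_style))  # several entities per label
```

Running a count like this over real annotation files would show how far a dataset departs from the one-to-one setting.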