AdvancedLiterateMachinery

Geolayoutlm, effect of image

Open HoomanKhosravi opened this issue 1 year ago • 3 comments

Hi, thank you for your great work.

I was wondering whether you have tested the effect of the image on the accuracy of the SER and RE results (compared to text and layout). When I overwrite the image with a matrix of zeros, the accuracy of the model seems almost unaffected. It seems like the image makes no contribution to the final result. I'm looking forward to hearing your thoughts on this.

Best, Hooman

HoomanKhosravi, Mar 19 '24 23:03

Hi @HoomanKhosravi, do you have any updates on this? I have similar suspicions but haven't had the chance to replicate your experiment yet.

MayStepanyan, Aug 29 '24 07:08

@MayStepanyan when running the provided eval, just set the image array to a matrix of zeros. You'll see an insignificant change in the outcome.
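A minimal sketch of that ablation, assuming each evaluation batch is a dict of arrays and that the image lives under a key named `image`. Both the helper `blank_image` and the key name are hypothetical and not taken from the repository, so adjust them to whatever the eval dataloader actually yields:

```python
import numpy as np


def blank_image(batch, image_key="image"):
    """Return a shallow copy of `batch` whose image entry is all zeros.

    `image_key` is a hypothetical field name; use the key under which
    the evaluation dataloader actually stores the image.
    """
    ablated = dict(batch)  # shallow copy, the original batch is left untouched
    ablated[image_key] = np.zeros_like(batch[image_key])  # same shape and dtype
    return ablated


# Toy batch standing in for one evaluation sample (shapes are illustrative).
batch = {
    "image": np.random.rand(3, 8, 8).astype(np.float32),
    "input_ids": np.array([101, 2023, 102]),
}
ablated = blank_image(batch)
```

If the batch holds PyTorch tensors rather than NumPy arrays, `torch.zeros_like` plays the same role. Everything except the image is passed through unchanged, so any difference in the metrics comes from the visual input alone.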

HoomanKhosravi, Aug 29 '24 20:08

I replicated this for my task: replacing the images with an array of zeros didn't affect the accuracy of the model. I also tried removing every module from GeoLayoutLMVIE except for .text_encoder, and this didn't affect the accuracy either, which suggests the task might be easy enough to solve with the text encoder alone.

It's important to note that I am solving a labeling-only task, without entity linking.
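
One way to check that "didn't affect the accuracy" also means the predictions themselves barely moved is to compare the two runs label by label. This is only a sketch: `compare_runs` is a hypothetical helper, and the labels below are toy values, not results from any experiment:

```python
def compare_runs(labels, preds_full, preds_ablated):
    """Return (accuracy_full, accuracy_ablated, fraction_changed).

    `fraction_changed` is the share of samples whose prediction differs
    between the full model and the ablated one, regardless of correctness.
    """
    n = len(labels)
    if n == 0 or not (n == len(preds_full) == len(preds_ablated)):
        raise ValueError("inputs must be non-empty and of equal length")
    acc_full = sum(p == y for p, y in zip(preds_full, labels)) / n
    acc_ablated = sum(p == y for p, y in zip(preds_ablated, labels)) / n
    changed = sum(a != b for a, b in zip(preds_full, preds_ablated)) / n
    return acc_full, acc_ablated, changed


# Toy data for illustration only.
labels = ["a", "b", "c", "d"]
preds_full = ["a", "b", "c", "c"]
preds_ablated = ["a", "b", "d", "c"]
acc_full, acc_ablated, changed = compare_runs(labels, preds_full, preds_ablated)
```

Two runs can reach the same accuracy while disagreeing on individual samples, so the fraction of changed predictions is a stricter signal than the accuracy difference alone.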

@Wangsherpa @yashsandansing any clues?

MayStepanyan, Sep 09 '24 07:09