transformers [WIP] Add UDOP models

trafficstars

#20650

Jan 22 '23 05:01 raghavanone

The documentation is not available anymore as the PR was closed or merged.

Jan 22 '23 06:01 HuggingFaceDocBuilderDev

@sgugger @NielsRogge The model weights are here https://huggingface.co/ZinengTang/Udop/tree/main , But how to get the config for these models ?

Jan 26 '23 05:01 raghavanone

@raghavanone For reference, someone asked the same question on the UDOP repo: https://github.com/microsoft/i-Code/issues/17

Jan 26 '23 15:01 logan-markewich

Note: Cannot proceed further without microsoft releasing the entire weights. Currently vision decoder weights have not been released.

Jan 27 '23 13:01 raghavanone

If I'm not mistaken, vision decoder weights should not be needed when using the text layout decoder part, only. vision_encoder weights are part of the shared model weights.

Jan 27 '23 15:01 maxjeblick

@raghavanone is there anything else blocking? It sounds like we can proceed with the given weights, assuming that we notify users that the vision decoder is not trained.

Feb 03 '23 15:02 logan-markewich

@logan-markewich Yes, I will work on closing this within couple of days .

Feb 04 '23 07:02 raghavanone

@sgugger Need some pointers on How should this model be tested ? Can I follow the tests used for T5 model and replicate similar tests ?

Feb 04 '23 15:02 raghavanone

@NielsRogge Any pointer here ?

Feb 07 '23 03:02 raghavanone

I hope it gets merged soon @raghavanone . Nice work :)

Feb 09 '23 12:02 WaterKnight1998

Forgive my naiveté, why do all the tests call from_pretrained() on some variation of t5? The UDOP model checkpoints are here. Could these be used?

Feb 22 '23 19:02 plamb-viso

Ah, I see that the test script they provide also uses T5-large, I expected it to use one of those checkpoints

Feb 22 '23 19:02 plamb-viso

@raghavanone how are things going with this so far? I'm very interested in using this model as soon as it gets integrated - if you need a hand with anything let me know! And thanks for bringing it into the library 😄

Mar 04 '23 16:03 thefirebanks

@raghavanone how are things going with this so far? I'm very interested in using this model as soon as it gets integrated - if you need a hand with anything let me know! And thanks for bringing it into the library 😄

@thefirebanks I am working on fixing last few tests. Hoping to close this PR very soon. Sorry for the delay.

Mar 06 '23 08:03 raghavanone

@raghavanone I am currently trying to finetune UdopUniModelForConditionalGeneration using this PR. I ran into the following exception while training:

 File "/opt/conda/lib/python3.8/site-packages/transformers/models/udop/modeling_udop.py", line 2422, in forward
 encoder_outputs = self.encoder(
 TypeError: forward() got an unexpected keyword argument 'ids_keep'`

I explained what appears to be happening in this comment.

It looks like the ids_keep parameter was removed from UdopUniStack but not removed from the call to it in UdopUniModelForConditionalGeneration

EDIT Looks like output_attentions, also needs to be removed And in the self.decoder() call, cross_attn_head_mask, output_attentions

Happy to make the changes myself with repo permissions

Mar 06 '23 22:03 plamb-viso

@raghavanone I am currently trying to finetune UdopUniModelForConditionalGeneration using this PR. I ran into the following exception while training:
 File "/opt/conda/lib/python3.8/site-packages/transformers/models/udop/modeling_udop.py", line 2422, in forward
 encoder_outputs = self.encoder(
 TypeError: forward() got an unexpected keyword argument 'ids_keep'`
I explained what appears to be happening in this comment.

It looks like the ids_keep parameter was removed from UdopUniStack but not removed from the call to it in UdopUniModelForConditionalGeneration

EDIT Looks like output_attentions, also needs to be removed And in the self.decoder() call, cross_attn_head_mask, output_attentions

Happy to make the changes myself with repo permissions

@plamb-viso Yes, removing those parameters were not done in all places, I have fixed it locally. I am working on fixing failing tests. This the last step pending for merging. Fixing these tests are taking more time than expected.

Mar 07 '23 07:03 raghavanone

@raghavanone I saw you closed this PR. Skimming over your work, the PR seemed to be in a rather good state. Where there any blockers you encountered? IMO, it would be nice to add UDOP models in Hugginface at some point.

Mar 13 '23 15:03 maxjeblick

@maxjeblick @NielsRogge feels that the code original repo is bit hacky, he is working a separate PR to UDOP in better implementation, so closed this in consultation with him. He should open a PR soon .

@NielsRogge please do add more details for the benefit of folks following this PR

Mar 13 '23 15:03 raghavanone

Thanks a lot for the fast reply!

Mar 13 '23 15:03 maxjeblick

@NielsRogge @raghavanone please link the new PR when its available for people subscribed to this one

Mar 13 '23 16:03 plamb-viso

Hi yes I'll open a PR soon! Thanks a lot for your work already @raghavanone, will ping you on the PR

Mar 13 '23 16:03 NielsRogge

Hi @NielsRogge I saw the large amount of commits on your new UDOP branch, curious if you have any idea on when you think a PR might be ready

Mar 24 '23 16:03 plamb-viso

Sorry to keep hammering on this, but again have noticed a flurry of activity on that branch then almost 2 weeks off. Curious what the plan is for it @NielsRogge

Apr 19 '23 14:04 plamb-viso

Hi @plamb-viso sorry for the late reply, the model is working, only have limited time to work on it. I'll open a PR this weekend/Monday.

For now you can already use the model if you're curious, check this code example regarding usage. Model is already on the hub here.

Apr 21 '23 20:04 NielsRogge

Out of curiosity @NielsRogge : did you ever use your implementation to fine tune it on a task like CORD?

Apr 22 '23 12:04 dtiarks

I've fine-tuned the model on a toy dataset of RVL-CDIP, works well but the model is pretty heavy, got OOM on Google Colab even with batch size = 1 so had to use a bigger GPU. The author only released large variants.

Apr 22 '23 15:04 NielsRogge

In my original work on @raghavanone 's version of the model, I also had to use a batch size of 1 to get it to not OOM on 40gb GPUs

Apr 22 '23 18:04 plamb-viso

transformers transformers copied to clipboard

[WIP] Add UDOP models

transformers
transformers copied to clipboard