
About itm loss

Open qibao77 opened this issue 1 year ago • 5 comments

Thank you for your code! When I reproduce the stage-1 training, I find that the ITM loss does not converge. Is this normal, or is there a trick? (Note: I replaced BERT with an XLM-R model.)

qibao77 avatar Apr 24 '23 08:04 qibao77

No trick here. You may try a lower learning rate, or use cleaner datasets.

dxli94 avatar Apr 24 '23 09:04 dxli94
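For context on what the loss in question measures: the ITM objective is a binary classification over matched versus mismatched image-text pairs, computed from the fused multimodal representation. The snippet below is a minimal sketch of such a head, not the exact LAVIS code; the `fused` embeddings and the toy labels are placeholders standing in for the model's real outputs and hard-negative mining.

```python
# Minimal sketch of an image-text matching (ITM) head, not the exact LAVIS code.
# Assumes `fused` is the multimodal encoder's [CLS]-like output for each
# image-text pair and `labels` marks matched (1) vs mismatched (0) pairs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ITMHead(nn.Module):
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # Binary classifier over the fused multimodal representation.
        self.classifier = nn.Linear(hidden_size, 2)

    def forward(self, fused: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        logits = self.classifier(fused)         # (batch, 2)
        return F.cross_entropy(logits, labels)  # ITM loss

# Toy usage: 4 fused pair embeddings, first two matched, last two mismatched.
head = ITMHead(hidden_size=768)
fused = torch.randn(4, 768)
labels = torch.tensor([1, 1, 0, 0])
print(head(fused, labels).item())
```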

When I pretrain BLIP on a Chinese dataset, I also run into this issue: the ITM and LM losses do not converge. Have you solved this problem? @qibao77

chenyzh28 avatar May 09 '23 07:05 chenyzh28

@qibao77 We use a customised implementation for the mixture of encoder-decoder (med.py) model. It has a different architecture from BERT, even though it is initialised from BERT weights. If XLM-R is used, a customised implementation is needed as well.

LiJunnan1992 avatar May 09 '23 11:05 LiJunnan1992
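To illustrate the point above: the key architectural difference is that each text transformer layer in the multimodal encoder gains a cross-attention block attending to the image encoder's outputs. A BERT or XLM-R checkpoint only provides weights for the self-attention and feed-forward parts, so the cross-attention is newly initialised. The sketch below is a hedged illustration under those assumptions, not the med.py implementation itself.

```python
# Hedged sketch (not med.py itself): a text layer augmented with cross-attention
# to image patch embeddings. Self-attention/FFN weights could be loaded from a
# BERT/XLM-R checkpoint; the cross-attention block has no pretrained counterpart.
import torch
import torch.nn as nn

class FusionLayer(nn.Module):
    def __init__(self, hidden: int = 768, heads: int = 12):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Not present in BERT/XLM-R; must be added and newly initialised.
        self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(hidden, 4 * hidden), nn.GELU(), nn.Linear(4 * hidden, hidden)
        )
        self.n1, self.n2, self.n3 = nn.LayerNorm(hidden), nn.LayerNorm(hidden), nn.LayerNorm(hidden)

    def forward(self, text: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # Text tokens attend to each other (reusable pretrained weights)...
        text = self.n1(text + self.self_attn(text, text, text)[0])
        # ...then attend to image patch embeddings (new weights).
        text = self.n2(text + self.cross_attn(text, image, image)[0])
        return self.n3(text + self.ffn(text))

# Toy usage: 8 text tokens fusing with 16 image patch embeddings.
layer = FusionLayer()
out = layer(torch.randn(2, 8, 768), torch.randn(2, 16, 768))
print(out.shape)  # torch.Size([2, 8, 768])
```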

@chenyzh28 My ITM loss has converged, but it doesn't seem to work, and I'm still working on it.

qibao77 avatar May 09 '23 15:05 qibao77

@qibao77 It seems the difference between Chinese and English BERT was indeed the cause of my problem. I lowered the learning rate, and ITM and LM are now converging normally. I tested the model with the ITM score and it outperforms ITC significantly. I hope this helps you.

chenyzh28 avatar May 10 '23 07:05 chenyzh28
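As an aside on using the ITM score at test time: a common pattern (and likely what the comparison above amounts to) is to rank candidates cheaply with ITC similarity and then re-score only the top-k with the slower ITM head. The sketch below assumes this two-stage setup; `itm_score` is a hypothetical stand-in for the model's own pairwise scorer, not a LAVIS API call.

```python
# Hedged sketch of ITC-then-ITM scoring for retrieval, under assumed interfaces.
import torch
import torch.nn.functional as F

def rerank_with_itm(image_feat, text_feats, itm_score, k: int = 16):
    # Stage 1: coarse ranking by ITC cosine similarity (cheap, dual-encoder).
    sims = F.cosine_similarity(image_feat[None, :], text_feats, dim=-1)
    topk = sims.topk(min(k, text_feats.size(0))).indices
    # Stage 2: re-score only the top-k candidates with the ITM head.
    itm_scores = torch.stack([itm_score(image_feat, text_feats[i]) for i in topk])
    return topk[itm_scores.argsort(descending=True)]

# Toy usage with a dummy ITM scorer.
img = torch.randn(256)
txts = torch.randn(100, 256)
order = rerank_with_itm(img, txts, itm_score=lambda i, t: (i * t).sum())
print(order[:5])
```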

Can you please share more details here? @chenyzh28

By Chinese BERT, do you mean that you changed the vocab or something else? What new learning rate did you use to make it converge?

ldfandian avatar Jul 18 '23 03:07 ldfandian