LAVIS
About ITM loss
Thank you for your code! When I reproduce the stage 1 training, I find that the ITM loss does not converge. Is this normal, or is there a trick to it? (Note: I replaced BERT with an XLM-R model.)