unilm

Unable to reproduce DiT pre-training

Open senthil-r-10 opened this issue 1 year ago • 2 comments

I am trying to reproduce the DiT pre-training described in the paper. I am using the DALL-E encoder as the image tokenizer, without fine-tuning it on the IIT-CDIP dataset. I took 1M documents for training, but the model is not converging: the loss stagnates at 4.19807. Has anyone tried to reproduce the model, and did you change any of the settings mentioned in the paper?

senthil-r-10 Aug 02 '23 12:08

I have followed the steps in this notebook: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/BEiT/Understanding_BeitForMaskedImageModeling.ipynb
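One way to read a plateau like 4.19807 is to compare it against two reference values: the cross-entropy of a uniform guess over the visual-token vocabulary, and the entropy of the token frequencies in the targets. The sketch below is only a diagnostic under stated assumptions: it assumes the 8192-entry DALL-E codebook and a natural-log cross-entropy averaged over masked patches. The function names are hypothetical and do not come from the unilm code or the notebook.

```python
import math
from collections import Counter


def uniform_baseline(vocab_size):
    """Cross-entropy (in nats) of predicting every visual token with equal probability."""
    return math.log(vocab_size)


def unigram_entropy(token_ids):
    """Entropy (in nats) of the empirical token distribution.

    This is the loss of a model that ignores the image and always
    predicts the overall token frequencies.
    """
    counts = Counter(token_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


if __name__ == "__main__":
    # Assumption: 8192 visual tokens, as in the DALL-E tokenizer.
    print(f"uniform baseline: {uniform_baseline(8192):.4f}")

    # Placeholder targets; replace with the tokenizer's ids for the masked patches.
    sample_targets = [3, 3, 3, 17, 17, 42]
    print(f"unigram entropy:  {unigram_entropy(sample_targets):.4f}")
```

If the plateau is well below the uniform baseline but close to the unigram entropy of the real targets, the model may be predicting token frequencies without using the visible patches. That would point toward checking the masking, the learning rate schedule, or the target alignment, but it does not by itself confirm a cause.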

senthil-r-10 Aug 10 '23 08:08

Hi @senthil-r-10, were you able to reproduce the MIM task?

arundprabhu Jan 31 '24 06:01