Lorenzo Baraldi

Results: 4 issues by Lorenzo Baraldi

**Is your feature request related to a problem? Please describe.** There is no problem or bug. **Describe the solution you'd like** I would like an implementation of the BEIT pre-training pipeline...

enhancement

Added parameter 'iters_to_accumulate' to perform [gradient accumulation](https://pytorch.org/docs/stable/notes/amp_examples.html#working-with-scaled-gradients) during training.
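As a rough illustration of what an `iters_to_accumulate` parameter controls, here is a minimal, framework-free sketch of gradient accumulation. It is not the code from this pull request: the toy linear model and the names `grad_mse`, `train`, and `micro_batch_size` are assumptions made up for the example. In PyTorch the same pattern corresponds to calling `backward()` on `loss / iters_to_accumulate` for every micro-batch and calling `optimizer.step()` only once every `iters_to_accumulate` iterations.

```python
def grad_mse(w, b, batch):
    """Gradient of the mean squared error of y_hat = w * x + b over one batch."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return gw, gb


def train(data, micro_batch_size, iters_to_accumulate, lr=0.01, epochs=1):
    """Plain SGD that applies an update only every `iters_to_accumulate` micro-batches."""
    w, b = 0.0, 0.0
    acc_w, acc_b = 0.0, 0.0  # accumulated (already averaged) gradients
    micro_batches = [
        data[i:i + micro_batch_size]
        for i in range(0, len(data), micro_batch_size)
    ]
    for _ in range(epochs):
        for i, mb in enumerate(micro_batches):
            gw, gb = grad_mse(w, b, mb)
            # Divide by iters_to_accumulate so the sum is an average
            # over the effective batch, not a larger gradient.
            acc_w += gw / iters_to_accumulate
            acc_b += gb / iters_to_accumulate
            if (i + 1) % iters_to_accumulate == 0:
                w -= lr * acc_w
                b -= lr * acc_b
                acc_w, acc_b = 0.0, 0.0  # equivalent of optimizer.zero_grad()
        # Note: if the number of micro-batches is not a multiple of
        # iters_to_accumulate, leftover gradients carry into the next epoch.
    return w, b


if __name__ == "__main__":
    data = [(float(x), 2.0 * x + 1.0) for x in range(8)]
    print(train(data, micro_batch_size=2, iters_to_accumulate=4))
    print(train(data, micro_batch_size=8, iters_to_accumulate=1))
```

With equally sized micro-batches, four accumulated micro-batches of 2 samples give the same update as one batch of 8 samples (up to floating-point rounding), which is the point of the technique: a larger effective batch size without the memory cost of a larger batch.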

**Describe** Hi, I would like to know whether layer scale is used (at 0.1) when fine-tuning BeiTv2 on the classification task. From the code, it seems that...

Hi, after reading your paper and studying the code, I don't understand why VisionTransformerForMaskedImageModeling has two implementations of the encoder (the encoder and the teacher model, respectively). Why is it not possible...