adamae
[CVPR'23] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
Hi @wgcban, thank you for your paper and code for AdaMAE. In this [line](https://github.com/wgcban/adamae/blob/main/models/pretrain/vit.py#L102), a multinomial distribution is used to sample the indices of the visible tokens given the probability...
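For readers unfamiliar with the sampling step the question refers to, here is a minimal sketch of drawing visible-token indices from per-token probabilities with `torch.multinomial`. It is an illustration only, not the repository's code: the helper name `sample_visible_indices`, the tensor shapes, and the use of a softmax over random logits are all assumptions made for the example.

```python
import torch


def sample_visible_indices(probs: torch.Tensor, num_visible: int) -> torch.Tensor:
    """Hypothetical helper: sample `num_visible` distinct token indices per row.

    probs: (batch, num_tokens) tensor of non-negative sampling weights.
    Returns a (batch, num_visible) tensor of int64 indices.
    """
    # replacement=False guarantees that no token index is drawn twice in a row.
    return torch.multinomial(probs, num_samples=num_visible, replacement=False)


torch.manual_seed(0)
batch, num_tokens, num_visible = 2, 16, 4

# Stand-in for the output of a sampling network (assumption for this sketch).
logits = torch.randn(batch, num_tokens)
probs = torch.softmax(logits, dim=-1)

vis_idx = sample_visible_indices(probs, num_visible)

# Build a boolean mask where True marks a masked (non-visible) token.
mask = torch.ones(batch, num_tokens, dtype=torch.bool)
rows = torch.arange(batch).unsqueeze(1)  # shape (batch, 1), broadcasts over columns
mask[rows, vis_idx] = False
```

Tokens with higher probability are more likely to be kept visible, but every token with non-zero probability can still be drawn, which is what distinguishes this from a deterministic top-k selection.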
I'm using the fine-tuning model in my code and load the weights in torch 2.6 with the following approach, in `__init__` of the [ViT class](https://github.com/wgcban/adamae/blob/main/finetune_class.py):
```py
[...]
self.head.weight.data.mul_(init_scale)  # type: ignore
self.head.bias.data.mul_(init_scale)  # type:...
```