MFM
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Introduction
This repository contains the official PyTorch implementation of the following paper:
Masked Frequency Modeling for Self-Supervised Visual Pre-Training,
Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy
In: International Conference on Learning Representations (ICLR), 2023
[arXiv][Project Page][Bibtex]
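To give a feel for the core idea, below is a minimal, illustrative sketch of frequency-domain masking (not the official implementation): an image is transformed with a 2D FFT, a low- or high-frequency band is zeroed out, and the corrupted image is transformed back, after which a model would be trained to recover the missing frequencies. The function name `mask_frequencies` and the circular low-pass mask with a `radius` cutoff are our own simplifications for illustration.

```python
import torch

def mask_frequencies(images, radius=16, keep_low=True):
    """Illustrative frequency masking (a sketch, not the official MFM code).

    images: (B, C, H, W) float tensor.
    Zeros out either the high- or the low-frequency band of each image's
    2D spectrum, then maps the result back to the image domain.
    """
    B, C, H, W = images.shape
    # 2D FFT, with the zero-frequency component shifted to the spectrum center
    freq = torch.fft.fftshift(torch.fft.fft2(images), dim=(-2, -1))
    # Distance of every frequency bin from the spectrum center
    yy = torch.arange(H).view(-1, 1) - H // 2
    xx = torch.arange(W).view(1, -1) - W // 2
    dist = torch.sqrt((yy ** 2 + xx ** 2).float())
    low_pass = dist <= radius
    mask = low_pass if keep_low else ~low_pass
    freq = freq * mask  # zero out the masked frequency band
    # Inverse FFT; the real part is the corrupted image the model must restore
    corrupted = torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1))).real
    return corrupted
```

In pre-training, the reconstruction loss is computed between the model's prediction and the original image (or its masked spectrum), so the network must hallucinate the removed frequency content.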
Updates
- [04/2023] Code and models of SR, Deblur, Denoise and MFM are released.
Models
ViT
ImageNet-1K Pre-trained and Fine-tuned Models
Method | Backbone | Pre-train epochs | Fine-tune epochs | Top-1 acc (%) | Pre-trained model | Fine-tuned model
---|---|---|---|---|---|---
SR | ViT-B/16 | 300 | 100 | 82.4 | config / model | config / model
Deblur | ViT-B/16 | 300 | 100 | 81.7 | config / model | config / model
Denoise | ViT-B/16 | 300 | 100 | 82.7 | config / model | config / model
MFM | ViT-B/16 | 300 | 100 | 83.1 | config / model | config / model
CNN
ImageNet-1K Pre-trained and Fine-tuned Models
Method | Backbone | Pre-train epochs | Fine-tune epochs | Top-1 acc (%) | Pre-trained model | Fine-tuned model
---|---|---|---|---|---|---
SR | ResNet-50 | 300 | 100 | 77.9 | config / model | config / model
Deblur | ResNet-50 | 300 | 100 | 78.0 | config / model | config / model
Denoise | ResNet-50 | 300 | 100 | 77.5 | config / model | config / model
MFM | ResNet-50 | 300 | 100 | 78.5 | config / model | config / model
MFM | ResNet-50 | 300 | 300 | 80.1 | config / model | config / model
Installation
Please refer to INSTALL.md for installation and dataset preparation.
Pre-training
Please refer to PRETRAIN.md for the pre-training instructions.
Fine-tuning
Please refer to FINETUNE.md for the fine-tuning instructions.
Citation
If you find our work useful for your research, please consider giving this repository a star :star: and citing our paper :beer::
@inproceedings{xie2023masked,
title={Masked Frequency Modeling for Self-Supervised Visual Pre-Training},
author={Xie, Jiahao and Li, Wei and Zhan, Xiaohang and Liu, Ziwei and Ong, Yew Soon and Loy, Chen Change},
booktitle={ICLR},
year={2023}
}
Acknowledgement
This code is built on the timm library, the BEiT repository, and the SimMIM repository.