evit
Some questions
In the paper, EViT with an oracle reaches higher accuracy when trained for more epochs. The DeiT paper reports a similar gain from longer training, so I don't think the comparison is entirely fair: the 600-epoch setting effectively amounts to 900 epochs of training.
Also, does EViT still work at small FLOPs budgets? For example, what about 1/4 of DeiT's FLOPs (4.6G / 4 ≈ 1.2G)?
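To make the arithmetic behind both points explicit, here is a minimal sketch. It assumes the extra 300 epochs come from training the oracle model before the 600-epoch run starts; that is my reading of the setup, inferred from 900 - 600, and not something taken from the EViT code. The function and constant names are hypothetical.

```python
# Assumption: the oracle is a model that was itself trained for 300 epochs
# before the 600-epoch EViT run begins (inferred from 900 - 600).
ORACLE_PRETRAIN_EPOCHS = 300
EVIT_TRAIN_EPOCHS = 600

# DeiT FLOPs figure quoted in the question above, in GFLOPs.
DEIT_GFLOPS = 4.6


def effective_epochs(train_epochs: int, oracle_epochs: int = ORACLE_PRETRAIN_EPOCHS) -> int:
    """Total training cost in epochs when the oracle's own training is counted."""
    return train_epochs + oracle_epochs


def flops_budget(base_gflops: float, fraction: float) -> float:
    """Target GFLOPs for a model restricted to a fraction of the baseline."""
    return base_gflops * fraction


if __name__ == "__main__":
    print("effective epochs:", effective_epochs(EVIT_TRAIN_EPOCHS))
    print("1/4 DeiT budget (GFLOPs):", flops_budget(DEIT_GFLOPS, 0.25))
```

Under this accounting, a fair baseline would be DeiT trained for the same effective number of epochs, and the low-budget question is whether EViT holds up at roughly a quarter of 4.6 GFLOPs.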
![image](https://user-images.githubusercontent.com/24236723/156888363-68b1c205-ebf2-41cb-9c7c-c713c515b975.png)
![image](https://user-images.githubusercontent.com/24236723/156887759-cac773a1-02e4-4bf5-a846-304facd34f11.png)
![image](https://user-images.githubusercontent.com/24236723/156887796-44c9da28-5cf7-4020-9bbc-b6b7910db340.png)