Li Xin

Results 3 issues of Li Xin

On ImageNet, I only get a top-5 accuracy of 83.86%. Any idea what's wrong?

Have you compared the inference speed of T2T-ViT and ResNet? At the same accuracy, which family of models achieves higher FPS?

The "field_h" and "filed_w" of "label_1_5x5" are 60x60. What does "field" mean? It's not the anchor size in your Table 1 (which is 40x40). It's not the receptive field...