moonlitt
Hi, there are many reasons: 1. Our model is pre-trained on weak semantic correlation data crawled from the web, whereas ViLT is pre-trained on strong semantic correlation data. Flickr30K is...