ImageNet classification with ru-dalle?

Open prnvjb opened this issue 2 years ago • 2 comments

Hi team, thanks for the excellent contribution to open source. I've been trying to adapt your code. I'm mostly focused on getting image embeddings from a given image and training a classifier on top of them. I guess the dalle code is built on text and image embeddings. Could you give any direction on generating an image feature vector, and on which part of the code I should modify?

Any help would be greatly appreciated.

Thanks.

prnvjb, Nov 22 '21 20:11

You can try using the VQGAN image encoder with an MLP head for classification, but it would be better to use ViT, RN50, or a similar model.
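For anyone reading later, here is a rough sketch of the "frozen encoder plus MLP head" idea, not ru-dalle's actual API. `encode_image` is a hypothetical placeholder standing in for whichever pretrained encoder you pick (VQGAN, ViT, RN50), and `make_toy_image`, `MLPHead`, and the synthetic two-class data are made up for illustration. It is written in plain Python only to stay dependency-free; in practice you would more likely build the head in a deep learning framework.

```python
import math
import random


def encode_image(image):
    """Hypothetical stand-in for a frozen image encoder.

    A real setup would call the pretrained encoder here; this placeholder
    just returns the mean intensity of each pixel row as the feature vector.
    """
    return [sum(row) / len(row) for row in image]


def make_toy_image(label, rng, size=4):
    """Synthetic image: class 0 is bright on top, class 1 bright at the bottom."""
    image = []
    for r in range(size):
        bright = (r < size // 2) == (label == 0)
        base = 1.0 if bright else 0.0
        image.append([base + rng.uniform(-0.1, 0.1) for _ in range(size)])
    return image


class MLPHead:
    """One hidden ReLU layer plus softmax output, trained with plain SGD."""

    def __init__(self, in_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
                   for _ in range(num_classes)]
        self.b2 = [0.0] * num_classes

    def forward(self, x):
        # Hidden activations, then a numerically stable softmax over logits.
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        logits = [sum(w * hj for w, hj in zip(row, h)) + b
                  for row, b in zip(self.w2, self.b2)]
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return h, [e / total for e in exps]

    def train_step(self, x, label, lr=0.1):
        h, probs = self.forward(x)
        # Cross-entropy gradient w.r.t. logits is (probs - one_hot).
        d_logits = [p - (1.0 if k == label else 0.0)
                    for k, p in enumerate(probs)]
        # Backpropagate through the ReLU before w2 is updated.
        d_h = [sum(d_logits[k] * self.w2[k][j] for k in range(len(self.w2)))
               if h[j] > 0 else 0.0
               for j in range(len(h))]
        for k, dk in enumerate(d_logits):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_h):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return -math.log(probs[label])

    def predict(self, x):
        _, probs = self.forward(x)
        return max(range(len(probs)), key=probs.__getitem__)


rng = random.Random(0)
labels = [i % 2 for i in range(40)]
# Features are computed once; the encoder itself is never updated.
features = [encode_image(make_toy_image(y, rng)) for y in labels]

head = MLPHead(in_dim=4, hidden_dim=8, num_classes=2)
for epoch in range(20):
    for x, y in zip(features, labels):
        head.train_step(x, y)

predictions = [head.predict(x) for x in features]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

The point of the split is that only the small head is trained, so swapping the placeholder for a stronger encoder should not require changing the classifier code, only `in_dim`.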

AlexWortega, Nov 25 '21 15:11

Thanks for the response. I just want to check the performance of ru-dalle's image encoder on zero-shot image classification.

prnvjb, Nov 25 '21 16:11