feed_forward_vqgan_clip
Finetuning CLIP to improve domain-specific performance
It is quite easy to finetune one of the OpenAI CLIP checkpoints with this codebase:
https://github.com/Zasder3/train-CLIP-FT
It uses pytorch-lightning and may be worth pursuing.
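Whatever framework is used, fine-tuning CLIP comes down to optimizing its symmetric contrastive (InfoNCE) objective over matched image/text embedding pairs. Below is a minimal numpy sketch of that loss, independent of the linked repo (the function name and toy data are illustrative, not from train-CLIP-FT):

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # L2-normalize so dot products are cosine similarities
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    # Pairwise similarity logits, scaled by the temperature
    logits = image_emb @ text_emb.T / temperature
    n = logits.shape[0]

    def xent(l):
        # Cross-entropy where the matching pair sits on the diagonal
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # Average of image->text and text->image directions, as in CLIP
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
txt = img + 0.01 * rng.normal(size=(4, 8))   # nearly matching pairs
print(clip_contrastive_loss(img, txt))        # low loss for aligned pairs
print(clip_contrastive_loss(img, np.roll(txt, 1, axis=0)))  # higher when misaligned
```

Fine-tuning simply continues minimizing this loss on domain-specific image/caption pairs, usually with a small learning rate to avoid destroying the pretrained representation.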
I would also be curious to see whether training or fine-tuning CLIP at a higher resolution (e.g. 512x512 instead of 224x224) would also lead to better image quality at higher output resolutions (>= 512).
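One practical wrinkle with fine-tuning a ViT-based CLIP at 512x512: the patch grid grows (for ViT-B/32, 7x7 at 224x224 becomes 16x16 at 512x512), so the learned positional embeddings must be interpolated to the new grid before training can continue. A sketch of that resize step in numpy, assuming the common CLIP-ViT layout where index 0 is the class token (the function name is illustrative):

```python
import numpy as np

def resize_pos_embed(pos_embed, new_side):
    """Bilinearly interpolate a (1 + g*g, d) positional-embedding table
    to (1 + new_side*new_side, d). Index 0 is the class token."""
    cls_tok, grid = pos_embed[:1], pos_embed[1:]
    g = int(round(len(grid) ** 0.5))
    d = grid.shape[1]
    grid = grid.reshape(g, g, d)
    # Coordinates in the old grid to sample for each new position
    ys = np.linspace(0, g - 1, new_side)
    xs = np.linspace(0, g - 1, new_side)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, g - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, g - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    # Bilinear blend of the four surrounding old-grid embeddings
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x1] * wx
    bot = grid[y1][:, x0] * (1 - wx) + grid[y1][:, x1] * wx
    new_grid = top * (1 - wy) + bot * wy
    return np.concatenate([cls_tok, new_grid.reshape(-1, d)], axis=0)

# ViT-B/32 at 224x224 has a 7x7 patch grid; 512x512 gives 16x16
old = np.random.default_rng(0).normal(size=(1 + 7 * 7, 768))
new = resize_pos_embed(old, 16)
print(new.shape)  # (257, 768)
```

Libraries such as timm do this kind of interpolation (typically bicubic) when loading a checkpoint at a different resolution; after the resize, fine-tuning proceeds as usual at the larger input size.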