zer0int
Here's an initial implementation of Inf-CLIP with GmP, if you're interested: [github.com/zer0int/Inf-CLIP](https://github.com/zer0int/Inf-CLIP)
Hi @whats2000, thank you for your interest - I'm happy to hear you find the code useful! Your proposed citation looks good to me; feel free to use it as-is. However,...
I'm curious, what are your results beyond diffusion tasks / using the Text Encoder? I'd be keen to hear more about that, if you're able to share! I am aware...
In that case, you might be interested in also experimenting with my most recent CLIP model, which has MLP gates in the Vision Transformer. The zero-shot (ImageNet/ObjectNet) accuracy is slightly...
I was able to reproduce this even for 1024x1024 images by shuffling components (MLP, Attn) not only of Flux.1-Dev, but also of the Text Encoders. Especially only shuffling T5 MLP over...