Christian Schlarmann
How about adding a "help" button to the sidebar that links to a help page containing the FAQ and footer?
I'm working on it.
Hi, thanks for asking. We demonstrate zero-shot _classification_ only for the CLIP models on their own and consider LLaVA and OpenFlamingo for captioning/VQA tasks.
You're right, it should definitely be possible to run with larger batch sizes. It's just hardcoded to batch_size 1 in a few places since we couldn't fit much more on...
No problem :) We basically stick to how the models are evaluated in their respective papers, so greedy decoding without beam search for LLaVA, and beam search with 3 beams for...
Hi, thanks for sharing. The generated images look very nice! We have also looked a bit into the interpretability of adversarial perturbations for robust CLIP as part of another project,...
Hi, we train the ViT-L/14 on 4x NVIDIA A100 40GB with a total batch size of 128.
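For anyone trying to reproduce this on different hardware, here is a rough sketch of the arithmetic. It assumes the total batch is split evenly across the GPUs with no gradient accumulation, which the comment above does not state; `per_device_batch_size` is a hypothetical helper, not something from the repository.

```python
def per_device_batch_size(total_batch_size: int, num_devices: int) -> int:
    """Hypothetical helper: per-GPU batch size for an even data-parallel split.

    Assumes no gradient accumulation; that is an assumption, not something
    stated in the original comment.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    if total_batch_size % num_devices != 0:
        raise ValueError("total batch size must divide evenly across devices")
    return total_batch_size // num_devices


# Figures from the comment: total batch size 128 on 4 GPUs.
per_gpu = per_device_batch_size(total_batch_size=128, num_devices=4)
print(per_gpu)  # 32
```

Under that assumption, each A100 would process 32 samples per step. With fewer or smaller GPUs, the same total could be kept by accumulating gradients over several steps.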