zero-shot-classification topic
CVPR2024_MAVL
Multi-Aspect Vision Language Pretraining - CVPR2024
MSQNet
Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]
MixCon3D
[CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"
text-to-image-eval
Evaluate custom and HuggingFace text-to-image/zero-shot-image-classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. Metrics include Zero-shot accuracy, Linear Probe, Image retrieval, and KNN...
vision-at-a-clip
Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts
simple-clip
A minimal, but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch