AImageLab
mvad-names-dataset
M-VAD Names Dataset. Multimedia Tools and Applications (2019)
perceive-transform-and-act
PyTorch code for the paper: "Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation"
speaksee
PyTorch library for Visual-Semantic tasks
STAGE_action_detection
Code for the STAGE module for video action detection
multimodal-garment-designer
This is the official repository for the paper "Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing". ICCV 2023
open-fashion-clip
This is the official repository for the paper "OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data". ICIAP 2023
pacscore
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023
mil4wsi
DAS-MIL: Distilling Across Scales for MIL Classification of Histological WSIs
LiDER
Official implementation of "On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning"