TinyCLIP integration for ActionCLIP

Open FransHk opened this issue 1 year ago • 0 comments

This PR integrates two TinyCLIP ViT models into the existing model framework with minimal changes. This is possible because TinyCLIP, like CLIP, provides a pure ViT-based model. TinyCLIP is a distillation of CLIP that runs significantly faster than the original model while retaining, and in some cases improving, its zero-shot IN1K accuracy. A small state_dict conversion helper method and an optional flag to ignore the sha256 check are added to accommodate this integration.
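As a rough illustration of the two additions mentioned above, the sketch below shows what a state_dict key-conversion helper and an optional sha256 bypass could look like. It is not the code from this PR: the names `convert_state_dict`, `verify_sha256`, `prefix_map`, and `ignore_sha256`, as well as the `_image_encoder.` prefix used in the usage example, are hypothetical and chosen only for illustration. The helper works on any mapping of parameter names to values, so it needs nothing beyond the standard library.

```python
import hashlib
from collections import OrderedDict


def convert_state_dict(state_dict, prefix_map):
    """Return a copy of state_dict with key prefixes renamed.

    prefix_map maps an old key prefix to its replacement. The prefixes
    are assumptions for illustration, not TinyCLIP's actual key layout.
    Keys that match no prefix are copied unchanged.
    """
    converted = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break  # apply at most one rename per key
        converted[new_key] = value
    return converted


def verify_sha256(data, expected_sha256, ignore_sha256=False):
    """Check downloaded bytes against an expected sha256 hex digest.

    With ignore_sha256=True (hypothetical flag name) the check is
    skipped, which is useful when no reference digest is available.
    """
    if ignore_sha256:
        return True
    return hashlib.sha256(data).hexdigest() == expected_sha256
```

A call such as `convert_state_dict(checkpoint, {"_image_encoder.": ""})` would strip the assumed prefix from every matching key and leave the other keys untouched, after which the result can be passed to the framework's usual loading routine.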

The TinyCLIP paper

The TinyCLIP models (Git)

The graphs below give a rough indication of ActionCLIP's behavior during training on HMDB51 (no pre-training). Train step indicates the number of batches processed per minute of wall-clock time. The TinyCLIP-based ActionCLIP model trains much faster, while its performance is close to that of vanilla CLIP.

[Figure: wandb_comparison]

FransHk, Dec 06 '23 10:12