CLIP
Are linear probes for ResNet50 also taken before the projection?
If I understand correctly, when performing linear probing you take the representations before the linear projection head. For ViT, this is easy to do in the code thanks to this line: https://github.com/openai/CLIP/blob/3b473b0e682c091a9e53623eebc1ca1657385717/clip/model.py#L233
It is less clean for the ResNet architecture, which makes me wonder whether you do the same for ResNets.
Here's the code I'm currently using, assuming that I have to remove the projection head for both ViT and ResNet:
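import torch
from torch import nn
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "RN50"  # or a ViT variant such as "ViT-B/32"
model, preprocess = clip.load(model_name, device, jit=False)
encoder = model.visual  # only keep the image model
if hasattr(encoder, "proj"):
    # ViT: the forward pass skips the projection when proj is None
    encoder.proj = None
else:
    # ResNet: manually set the projection head to identity while keeping it a linear layer
    old_proj = encoder.attnpool.c_proj
    N = old_proj.in_features
    identity = nn.Linear(N, N)
    nn.init.zeros_(identity.bias)
    identity.weight.data.copy_(torch.eye(N))
    # match the device and dtype of the layer being replaced (weights may be fp16 on GPU)
    encoder.attnpool.c_proj = identity.to(device=old_proj.weight.device, dtype=old_proj.weight.dtype)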
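A quick way to sanity-check the identity trick above, sketched on a toy layer rather than on CLIP itself. The helper name `make_identity_linear` and the layer sizes are made up for illustration. Since a loaded model's weights may be in half precision on GPU, the sketch also moves the new layer to the device and dtype of the layer it replaces:

```python
import torch
from torch import nn


def make_identity_linear(n: int, like: nn.Linear) -> nn.Linear:
    """Hypothetical helper: an n-by-n linear layer computing y = x,
    placed on the same device and dtype as `like`."""
    identity = nn.Linear(n, n)
    with torch.no_grad():
        identity.weight.copy_(torch.eye(n))  # W = I
        identity.bias.zero_()                # b = 0
    return identity.to(device=like.weight.device, dtype=like.weight.dtype)


# Toy stand-in for a projection head (sizes are arbitrary, not CLIP's)
proj = nn.Linear(8, 4)
identity = make_identity_linear(proj.in_features, like=proj)

x = torch.randn(3, 8)
with torch.no_grad():
    out = identity(x)  # should reproduce x, keeping the pre-projection width
```

If `out` matches `x`, swapping the layer in leaves the pre-projection features untouched while the module is still an `nn.Linear`, so code that reads its `weight` and `bias` attributes keeps working.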
Thanks for the easily usable code + pretrained weights ♡
I'm working on linear probe fine-tuning too, but I can't see where encoder.proj is being turned off in the official linear probe code. Could you point me to it, please?
I have the same question, after reading the CLIP paper. What exactly does "linear probing on ResNet" mean?