Replicating ImageNet linear probe numbers
I was trying to replicate the ResNet50 (MocoV2, MocoV2-R2V2, and Vince) linear probe numbers on ImageNet. While the paper states that the linear probe was a single linear layer on top of the pretrained frozen representation (I downloaded the pretrained weights shared in the repo), the code for linear probing trains two decoders simultaneously: a single linear layer, and a two-layer decoder with dropout and a non-linearity.
Also, it is unclear to me which representation the linear probing is done on top of: the 2048-dim output of resnet50(-2), or one of the embedding layers that follow it. Could you please clarify?
The multi-layer decoder with dropout is for an experiment that didn't end up going into the paper. All the linear probes were on the "feature_extractor" of the model: https://github.com/danielgordon10/vince/blob/master/solvers/end_task_base_solver.py#L202

As you can see here: https://github.com/danielgordon10/vince/blob/master/models/vince_model.py#L26 that would be the output of the average-pool layer of the ResNet, i.e. the 2048-dim features. The embedding space is only used for the training task.