
Don't download and load ImageNet weights when using models for inference

pjbull opened this issue · 3 comments

We currently use `load_from_checkpoint` in our `ModelManager` to initialize models when doing inference. This can cause the models to download the pretrained ImageNet weights from the internet even though we don't need them. To address this, we need a parameter that we pass in to the `__init__` of the model to indicate we are doing inference/loading from a checkpoint, and then we need to pass this parameter to `load_from_checkpoint` in the `ModelManager` (a sketch of this follows the list below).

We should check all of our models for this behavior, but here is how it works for the `time_distributed` model:

  • Here is where we need to pass a parameter indicating that we're doing inference: https://github.com/drivendataorg/zamba/blob/7fb2a0f9599bf55bf2f538d6a0a736963cb9d9eb/zamba/models/model_manager.py#L98-L101

  • Here is where we need to use that parameter to skip initializing from timm: https://github.com/drivendataorg/zamba/blob/7fb2a0f9599bf55bf2f538d6a0a736963cb9d9eb/zamba/models/efficientnet_models.py#L23-L27
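A minimal sketch of what this could look like, assuming a PyTorch Lightning model backed by timm. The `from_checkpoint` parameter name, the `efficientnet_b0` backbone, and the class below are illustrative stand-ins, not zamba's actual API:

```python
import pytorch_lightning as pl
import timm
from torch import nn


class TimeDistributedEfficientNet(pl.LightningModule):
    def __init__(self, num_classes: int = 32, from_checkpoint: bool = False):
        super().__init__()
        self.save_hyperparameters()
        # When restoring from a checkpoint, the ImageNet weights would be
        # overwritten immediately, so skip the pretrained download entirely.
        self.backbone = timm.create_model(
            "efficientnet_b0", pretrained=not from_checkpoint
        )
        self.head = nn.Linear(self.backbone.num_features, num_classes)


# In the ModelManager, extra kwargs to load_from_checkpoint are forwarded
# to __init__, so the flag can be passed through at inference time:
# model = TimeDistributedEfficientNet.load_from_checkpoint(
#     checkpoint_path, from_checkpoint=True
# )
```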

— pjbull, Oct 21, 2021

This also arises if we are training a model whose labels are a subset of the zamba labels, which means we "resume training" instead of replacing the head. This stems from the fact that `finetune_from` is still `None` in this case; we should instead do `model_class(finetune_from={official_ckpt})` rather than loading from the checkpoint (see the sketch after the link below):

https://github.com/drivendataorg/zamba/blob/7986c417f33839c0a8d14ac66201472acbfb393a/zamba/models/model_manager.py#L139-L147
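A hedged sketch of that branching in the `ModelManager`; `user_labels`, `official_labels`, and `official_ckpt` are stand-ins for zamba's actual variables:

```python
def instantiate_model(model_class, user_labels, official_labels, official_ckpt):
    """Illustrative only: choose the construction path by label overlap."""
    if set(user_labels) <= set(official_labels):
        # Labels are a subset of the official ones, so we "resume training"
        # with the same head: build the model via finetune_from rather than
        # load_from_checkpoint, which re-runs __init__ and can trigger the
        # pretrained weight download.
        return model_class(finetune_from=official_ckpt)
    # Otherwise the head is replaced, so load the checkpoint directly.
    return model_class.load_from_checkpoint(checkpoint_path=official_ckpt)
```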

In addition, we may want `super().load_from_checkpoint` here instead, to avoid re-passing through the `__init__` with the timm weight download: https://github.com/drivendataorg/zamba/blob/7fb2a0f9599bf55bf2f538d6a0a736963cb9d9eb/zamba/models/efficientnet_models.py#L27
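An alternative sketch of the same idea: read the Lightning checkpoint's state dict directly so `__init__` (and the timm download) isn't triggered a second time. The helper name is hypothetical:

```python
import torch


def load_finetune_weights(model, finetune_from):
    # Lightning checkpoints keep the model weights under "state_dict".
    checkpoint = torch.load(finetune_from, map_location="cpu")
    # strict=False tolerates a replaced classification head.
    model.load_state_dict(checkpoint["state_dict"], strict=False)
```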

— ejm714, Jun 7, 2022

The code has changed a lot since then. Has this bug been resolved? I'd like to work on it if not.

— papapizzachess, Nov 23, 2023

@papapizzachess Yes, this bug still exists, and the code sections linked in the issue description are still correct.

— ejm714, Nov 27, 2023