Loading model weights partially
Hello, I really appreciate this amazing work. The problem I face is that the models are a bit too large to run on limited hardware. Is it possible to load the model partially? For example, I only need the embeddings extracted from the sequence and coordinates tracks for a downstream task. The other tracks are never used, so they are loaded onto the GPU for no reason. Is it possible to load weights only for specified tracks? I believe that would significantly reduce unnecessary VRAM usage. Thank you so much.
I believe the encoders and decoders won't be loaded if you never call the encode and decode functions.
https://github.com/evolutionaryscale/esm/blob/30eb9070f442abd33e35ceebe1fe1fb91e1ba5aa/esm/models/esm3.py#L222-L224
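To illustrate what "not loaded unless called" can look like in general, here is a minimal sketch of a lazy-loading pattern in plain Python. It is not the ESM3 code: `FakeEncoder`, `LazyModel`, and `encode_structure` are hypothetical names made up for this example, and the linked lines remain the place to check what the library actually does.

```python
from functools import cached_property


class FakeEncoder:
    """Hypothetical stand-in for a large submodule (not part of ESM)."""

    instances = 0  # counts how many encoders have been built

    def __init__(self):
        FakeEncoder.instances += 1

    def encode(self, coords):
        return [len(coords)]


class LazyModel:
    """Hypothetical model that builds its encoder only on first use."""

    @cached_property
    def structure_encoder(self):
        # Runs once, on first access; the result is cached on the instance.
        return FakeEncoder()

    def encode_structure(self, coords):
        return self.structure_encoder.encode(coords)


model = LazyModel()
print(FakeEncoder.instances)  # 0: nothing built yet

model.encode_structure([1.0, 2.0, 3.0])
print(FakeEncoder.instances)  # 1: built on first call
```

With a pattern like this, a submodule that is never used never occupies memory, which is the behavior described above.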
Otherwise, try wrapping the model's forward passes in torch.no_grad(), so that activations are not kept around for backpropagation.
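A minimal sketch of the torch.no_grad suggestion follows. The small `nn.Sequential` model is only a stand-in chosen for this example, an assumption rather than the ESM3 model; the same `with` block should work around any PyTorch module's forward call.

```python
import torch
import torch.nn as nn

# Stand-in for the real model (hypothetical; any nn.Module behaves the same way).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
model.eval()  # switch off training-only behavior such as dropout

inputs = torch.randn(4, 16)

# Inside no_grad, autograd does not record operations, so intermediate
# activations are not retained for a backward pass.
with torch.no_grad():
    embeddings = model(inputs)

print(embeddings.requires_grad)  # False
print(tuple(embeddings.shape))   # (4, 8)
```

Note that this reduces the memory used by activations during inference. It does not change how much memory the weights themselves take up.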