CALM-pytorch

Possible to load huggingface's pretrained models in anchor_llm & augment_llm?

Open · prashantkodali opened this issue 1 year ago · 5 comments

In the code snippet below, is it possible to initialize the Decoder/Encoder with pre-trained models from the Hugging Face Hub?

from x_transformers import TransformerWrapper, Decoder

augment_llm = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Decoder(
        dim = 512,
        depth = 12,
        heads = 8
    )
)

anchor_llm = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Decoder(
        dim = 512,
        depth = 2,
        heads = 8
    )
)
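To make the intent concrete, on the Hugging Face side the pretrained weights would be loaded with something like the following (model names and variable names here are just examples):

from transformers import AutoModelForCausalLM

# pretrained decoder-only checkpoints one might want to plug in
anchor_hf = AutoModelForCausalLM.from_pretrained("gpt2")
augment_hf = AutoModelForCausalLM.from_pretrained("distilgpt2")

But TransformerWrapper builds its own attn_layers, so it isn't obvious how these checkpoints could be plugged in.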

prashantkodali · Mar 25 '24

Hi, did you solve this problem?

Mangoho · May 19 '24

@lucidrains, any solution for this issue?

OmarMohammed88 · Jun 04 '24

@prashantkodali did you find any solutions?

LitterBrother-Xiao · Oct 07 '24

Hello @LitterBrother-Xiao, I implemented this a while back, specific to Encoder-based models. I used PyTorch's forward hooks to implement the idea.

The approach didn't work for me, so I didn't clean up and upload the code, but I can share it if it helps you.
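Roughly, the idea looked like the sketch below: register a forward hook on each layer of a pretrained encoder to capture its hidden states, which CALM-style cross-attention blocks would then consume. This is a minimal sketch assuming a Hugging Face BERT-style encoder; the model name and the captured-state wiring are illustrative, not the CALM-pytorch API.

import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# hidden states captured from each encoder layer
captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        # each BERT layer returns a tuple; output[0] is the hidden states
        captured[name] = output[0].detach()
    return hook

# register a forward hook on every transformer layer so the intermediate
# hidden states are exposed for CALM-style cross-attention
for i, layer in enumerate(model.encoder.layer):
    layer.register_forward_hook(make_hook(f"layer_{i}"))

inputs = tokenizer("hello world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

print({name: h.shape for name, h in captured.items()})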

Also, the authors of the paper released their codebase a couple of months back: https://github.com/google-deepmind/calm. Hope this helps.

prashantkodali · Oct 09 '24

@prashantkodali thanks so much!

LitterBrother-Xiao · Oct 10 '24