Mayank Mishra
I think your environment is configured with CUDA 11.1, while torch was compiled with CUDA 10.2. Can you install a torch build that matches your CUDA version?
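A quick way to confirm the mismatch (a minimal sketch; the exact versions printed will depend on your install):

```python
import torch

# The CUDA version torch was built against must match the system toolkit
# (e.g. 11.1 here) for extensions and custom kernels to compile and load.
print("torch built with CUDA:", torch.version.cuda)  # e.g. "10.2"
print("CUDA runtime available:", torch.cuda.is_available())
```

If the first line disagrees with what `nvcc --version` reports, reinstalling torch from the matching CUDA wheel index usually fixes it.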
The dockerfile works out of the box. Can you give it a shot?
Not sure why 176b is not working. I will try to look into it :)
Huh? Int4? I will definitely test this branch and let you know. Thanks a lot for this :)
hey this is awesome
@sroecker are you tying the word embeddings? Unlike llama, the input word embeddings and output projection matrix are tied for granite models.
the lab version is a different model not to be confused with this one
Hmm, a quick question: are we tying the word embeddings and output logits matrix? llama doesn't do that and granite has tied embeddings. Maybe that's the issue? I don't think...
Hmm, ok, so there are these differences between llama and granite:
1. attention has bias (llama doesn't)
2. mlp has bias (llama doesn't)
3. tied word embeddings (llama doesn't)
4. ...
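The differences above can be sketched in PyTorch (hypothetical dimensions, not the real granite config; just to show where the deltas from llama sit):

```python
import torch.nn as nn

vocab, hidden = 1000, 64  # toy sizes for illustration

embed = nn.Embedding(vocab, hidden)

# granite: linear layers in attention and MLP carry a bias term,
# whereas llama constructs them with bias=False.
attn_qkv = nn.Linear(hidden, 3 * hidden, bias=True)
mlp_up = nn.Linear(hidden, 4 * hidden, bias=True)

# granite: output projection shares the input embedding weight;
# llama keeps lm_head as a separate, untied matrix.
lm_head = nn.Linear(hidden, vocab, bias=False)
lm_head.weight = embed.weight

# Tied means the two modules point at the same tensor storage.
assert lm_head.weight.data_ptr() == embed.weight.data_ptr()
```

A converter written against llama will silently drop the biases and untie the head unless it checks for these, which is the kind of thing that produces garbage outputs rather than a hard error.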
yeah, all of them use the starcoder tokenizer.