Uzair rehman

5 comments by Uzair rehman

Hey Dylan, I have a question; any assistance would be highly appreciated. I want to convert DeepSeek-R1-Llama-8B into .bin format. Can I use the same export.py for this?

I already tried this and it works: you can make a .bin file from DeepSeek-Distill-Llama-8B, but the provided tokenizer.bin file is not compatible. I guess I need to figure...

Thank you @Dylan-Harden3, I really appreciate it. I will look into it.

@Dylan-Harden3, can you help me with this problem: size mismatch for model.layers.31.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([4096, 4096]). size...
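For readers hitting the same error, the two shapes in the message are what one would expect if the checkpoint uses grouped-query attention (fewer key/value heads than query heads) while the model being constructed assumes one key/value head per query head. The sketch below only does the arithmetic. The helper names are hypothetical, and the figures of 32 attention heads and 8 key/value heads are assumptions based on typical Llama-style 8B configs, not values stated in the comment; check the checkpoint's own config.json.

```python
# Minimal sketch: how the k_proj weight shape follows from the attention config.
# ASSUMPTION: hidden_size=4096, 32 attention heads, 8 key/value heads
# (typical for Llama-style 8B models; not stated in the original comment).

def kv_proj_shape(hidden_size: int, num_attention_heads: int,
                  num_key_value_heads: int) -> tuple:
    """Return the (out_features, in_features) shape of k_proj / v_proj."""
    head_dim = hidden_size // num_attention_heads
    return (num_key_value_heads * head_dim, hidden_size)


def infer_num_kv_heads(k_proj_rows: int, hidden_size: int,
                       num_attention_heads: int) -> int:
    """Recover the number of key/value heads from a checkpoint's k_proj rows."""
    head_dim = hidden_size // num_attention_heads
    return k_proj_rows // head_dim


# Shape stored in the checkpoint (grouped-query attention, 8 KV heads).
checkpoint_shape = kv_proj_shape(4096, 32, 8)

# Shape built by a model that assumes KV heads == attention heads.
model_shape = kv_proj_shape(4096, 32, 32)

print(checkpoint_shape, model_shape)
print(infer_num_kv_heads(1024, 4096, 32))
```

If this is the cause, the usual remedy is to make the model's key/value head count match the checkpoint's config before loading the weights, rather than reshaping the weights themselves.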

Hi @MikeOpenHWGroup, I have closed issue #2630 since the PR was merged. Please go ahead with the changes; you can assign the review to me.