Uzair rehman
Hey Dylan, I have a question, and any assistance would be highly appreciated. I want to convert DeepSeek-R1-Llama-8B into .bin format. Can I use the same export.py for this?
I already tried this and it works: you can make a .bin file from DeepSeek-Distill-Llama-8B, but the provided tokenizer.bin file is not compatible. I guess I need to figure...
Thank you @Dylan-Harden3, I really appreciate it. I will look into it.
@Dylan-Harden3, can you help me with this problem? size mismatch for model.layers.31.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 4096]) from checkpoint, the shape in current model is torch.Size([4096, 4096]). size...
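A note on the shapes in that error: a [1024, 4096] k_proj weight is what you would expect if the checkpoint uses grouped-query attention (fewer key/value heads than query heads), while [4096, 4096] is what a model built with as many key/value heads as query heads would allocate. The sketch below only does the arithmetic. The helper name `kv_proj_shape` is hypothetical, and the head counts (32 query heads, 8 key/value heads) are assumptions that reproduce the two shapes from the message, not values read from the actual config.

```python
def kv_proj_shape(hidden_size: int, num_attention_heads: int, num_key_value_heads: int):
    """Return the (out_features, in_features) shape of a k_proj/v_proj weight.

    Hypothetical helper for illustration; the parameter names mirror the
    usual Llama-style config fields.
    """
    # Each attention head works on a slice of the hidden dimension.
    head_dim = hidden_size // num_attention_heads
    # k_proj maps hidden_size -> num_key_value_heads * head_dim.
    return (num_key_value_heads * head_dim, hidden_size)


# Assumed values: hidden size 4096, 32 query heads, 8 key/value heads.
checkpoint_shape = kv_proj_shape(4096, 32, 8)
# Same model, but built as if every query head had its own key/value head.
model_shape = kv_proj_shape(4096, 32, 32)

print(checkpoint_shape)  # (1024, 4096)
print(model_shape)       # (4096, 4096)
```

If this reading is right, the model being instantiated is probably not picking up the checkpoint's key/value head count, so comparing that field in the model's config against the value the loading code uses would be a reasonable first check.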
Hi @MikeOpenHWGroup, I have closed issue #2630 as the PR was merged. Please go ahead with the changes, and you can assign the review to me.