Prince Canuma comments

Results: 151 comments of Prince Canuma

PaliGemma 4bit Quantization broken and Inference issues.

@awni any thoughts?

PaliGemma 4bit Quantization broken and Inference issues.

Thanks! Looking forward to it :)

PaliGemma 4bit Quantization broken and Inference issues.

Thanks @JosefAlbers! Found the bug and fixed it :)

PaliGemma 4bit Quantization broken and Inference issues.

@awni @lucasb-eyer it's fixed ✅ After my changes, I hadn't applied the Gemma embedding scaling to all inputs (text and multimodal); it was only scaling the text embeddings. That's why when...
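For readers following along, here is a minimal sketch of the kind of bug described above. It is not the project's actual code: the function names and toy values are hypothetical, and it assumes the scale factor is sqrt(hidden_size), as in Gemma's embedding normalization, and that image embeddings come before text embeddings in the merged sequence.

```python
import math


def scale(embeddings, hidden_size):
    """Multiply every embedding vector by sqrt(hidden_size) (assumed factor)."""
    factor = math.sqrt(hidden_size)
    return [[x * factor for x in vec] for vec in embeddings]


def merge_then_scale(image_embeds, text_embeds, hidden_size):
    # Intended behavior: scale the merged sequence, so image and text
    # positions are treated the same way.
    return scale(image_embeds + text_embeds, hidden_size)


def scale_text_only(image_embeds, text_embeds, hidden_size):
    # The kind of bug described: only the text positions are scaled,
    # so image positions end up on a different magnitude.
    return image_embeds + scale(text_embeds, hidden_size)


# Toy values, chosen only to make the difference visible.
hidden_size = 4
image_embeds = [[1.0, 1.0, 1.0, 1.0]]
text_embeds = [[1.0, 1.0, 1.0, 1.0]]

fixed = merge_then_scale(image_embeds, text_embeds, hidden_size)
buggy = scale_text_only(image_embeds, text_embeds, hidden_size)
```

In the text-only variant the image row keeps its original magnitude while the text row is scaled, which is the mismatch the comment refers to.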

PaliGemma 4bit Quantization broken and Inference issues.

@JosefAlbers could you share your X handle? I want to tag you on the release :)

Add support for ibm granite

The KVcache change broke my original implementation, but I found a workaround for it :) Ready for review @awni ✅

Add support for ibm granite

Thank you very much, @awni! I addressed all comments. Let me know if there is anything else.

Command-R-Plus, Context Window Limitations

Hey guys @awni, @fblissjr and @jeanromainroy, the Cohere team limited the context to 8k for all Command-R variants on purpose. If you check the config file for both r-v01 and...

Command-R-Plus, Context Window Limitations

> Copying the original cohere tokenizer.json (https://huggingface.co/CohereForAI/c4ai-command-r-plus/blob/main/tokenizer.json) fixes this issue completely from my testing (output generation is slow, but so far so good!) > > My guess is something is...