Prince Canuma
@awni any thoughts?
Thanks! Looking forward to it :)
Thanks @JosefAlbers! Found the bug and fixed it :)
@awni @lucasb-eyer it's fixed ✅ After my changes, I hadn't applied the Gemma embedding scaling to all inputs (text and multimodal). It was only scaling text embeddings. That's why when...
@JosefAlbers could you share your X handle? I want to tag you on the release :)
The KVcache change broke my original implementation but I found a workaround :) Ready for review @awni ✅
Hey @awni, it's done ✅
Thank you very much, @awni! I addressed all comments. Let me know if there is anything else.
Hey guys @awni, @fblissjr and @jeanromainroy, the Cohere team limited the context to 8k for all Command-R variants on purpose. If you check the config file for both r-v01 and...
> Copying the original cohere tokenizer.json (https://huggingface.co/CohereForAI/c4ai-command-r-plus/blob/main/tokenizer.json) fixes this issue completely from my testing (output generation is slow, but so far so good!) > > My guess is something is...