GB

Results 43 comments of GB

> It will, but as mentioned earlier performance will be really poor.

Ah okay, I am cloning them now instead for the moment. At least it sounds like it will...

> Great that it ends up working well. Not sure what you mean by "avoid having the kv cache fill up too", the cloning strategy should work for most use...

Hmmm, it actually still does this with either clearing or cloning :/ It seems like I am doing it right, so I'm not sure what could still be causing it....

Also note that I am now seeing the Stable Diffusion output become all black over time. Does it also need some kind of KV refresh? I didn't have...

Ah, this runs my 192 GB M2 Ultra out of RAM :/ It seems crazy, loading 100+ until it breaks for me. How do we use the quantized version exactly?...

It seems to have this issue on Metal? With Metal failing

```
cargo run --example quantized --release --features metal -- --prompt 'how are you?' --which mixtral-instruct --gqa 8 -n 300...
```

Curious if there's an answer to this? Audio seems harder to send, I can't find good examples working in the Rust bindings :/ Thanks!

> @groovybits I added the missing method to this library; the PR is still unmerged. But I have this working in one of my projects decoding from FFMPEG and playing across...

> We're certainly lacking a good TTS example at the moment, as pointed out in #1428 (we already cover speech to text with whisper, and both image to text and...

Exciting, amazing progress! I seem to get a failure when using either CPU or Metal. With CPU you can see here it outputs information but doesn't use GPU/CPU and sits there...