compilade
I've fixed the pooled embeddings problem with Mamba by making it only process a single sequence per `ubatch`. When the sequences are short, this is slightly slower than processing...
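A minimal sketch of the single-sequence-per-`ubatch` idea (hypothetical names, not the actual llama.cpp structures): split a mixed batch by sequence id so a recurrent model only ever tracks one state at a time, which makes mean-pooled embeddings straightforward.

```python
# Hypothetical sketch: split a mixed batch into per-sequence micro-batches
# so a recurrent model (e.g. Mamba) only tracks one state per ubatch.

def split_into_ubatches(tokens, seq_ids):
    """Group tokens by sequence id, preserving order within each sequence."""
    ubatches = {}
    for tok, sid in zip(tokens, seq_ids):
        ubatches.setdefault(sid, []).append(tok)
    return [ubatches[sid] for sid in sorted(ubatches)]

def mean_pool(embeddings):
    """Mean-pool a list of per-token embedding vectors into one vector."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]
```

Processing each resulting ubatch separately trades some throughput on short sequences (as noted above) for correctness of the pooled state.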
> Not sure if I'm understanding the comment correctly @jukofyork, but the logic I'm using to identify the most influential tensors/layers is to simply average the importance scores (IS) for...
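The averaging logic described above can be sketched like this (hypothetical tensor names and data layout; the real scores would come from whatever importance-measuring pass produced them):

```python
# Hypothetical sketch: average the per-sample importance scores (IS)
# recorded for each tensor, then sort tensors by that mean to find
# the most influential ones.

def rank_tensors_by_importance(scores):
    """scores: dict mapping tensor name -> list of importance scores."""
    means = {name: sum(vals) / len(vals) for name, vals in scores.items()}
    return sorted(means, key=means.get, reverse=True)
```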
I'd like it very much if they released a smaller version of their model. I don't have enough RAM to run Mixtral (only have 8GB), and Jamba seems to be...
> Any update on Jamba support? I've worked on refactoring the KV cache over the past few weeks to allow managing both recurrent states and Attention's KV cache at once. (See...
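A toy sketch of what "managing both at once" means for a hybrid model like Jamba (hypothetical classes, not llama.cpp's actual KV-cache code): Attention layers append `(K, V)` pairs per token, while recurrent layers overwrite a single fixed-size state in place.

```python
# Hypothetical sketch of a cache holding Attention KV entries and
# recurrent states side by side, as a hybrid Attention/Mamba model needs.

class HybridCache:
    def __init__(self, n_layers, recurrent_layers):
        self.recurrent_layers = set(recurrent_layers)
        # Attention layers grow a list of (k, v) pairs per token.
        self.kv = {i: [] for i in range(n_layers)
                   if i not in self.recurrent_layers}
        # Recurrent layers keep one state, overwritten each step.
        self.state = {i: None for i in self.recurrent_layers}

    def update(self, layer, k=None, v=None, new_state=None):
        if layer in self.recurrent_layers:
            self.state[layer] = new_state   # constant memory per layer
        else:
            self.kv[layer].append((k, v))   # memory grows with context
```

The key design point is that the two kinds of storage have different lifetimes and growth patterns, which is why unifying them in one cache abstraction takes real refactoring work.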
> For your endeavors, could I 'Buy You a Coffee' to help support? @severian42 I appreciate the offer (it means a lot!), but I can't accept for now. Receiving international...
Okay, turns out I only had to put like, 2 to 3 more days of work on this and BAM **it works**. As of today, in [branch `refactor-kv-cache`](), using the...
There is still more work I need to put into this. I've got inference working, but things that are not yet done are:

- state saving and reloading to and...
> how can they work if the issue is not complete? @ELigoP Well, technically the layout of the GGUF files doesn't really need to be changed further for Jamba support,...
> They adopt a channel-wise scaling factor compared to the tensor-level ones. Maybe a separate kernel can be built to apply scales outside of the matmul kernels? Hmm,...
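The idea being discussed can be sketched in pure Python (hypothetical helper names, not a real kernel): with a per-channel (column-wise) scale on the weights, the scale factors out of the matmul, so an unscaled matmul kernel can run first and a separate scaling kernel can apply the per-channel factors to the output afterwards.

```python
# Hypothetical sketch: per-output-channel scales commute with the matmul,
# so matmul(x, W * s) == scale_columns(matmul(x, W), s).

def matmul(x, w):
    """x: m x k, w: k x n, plain nested-list matrix multiply."""
    return [[sum(x[i][t] * w[t][j] for t in range(len(w)))
             for j in range(len(w[0]))] for i in range(len(x))]

def scale_columns(y, scales):
    """Apply one scale factor per output channel (column)."""
    return [[v * scales[j] for j, v in enumerate(row)] for row in y]
```

This is why a standalone scaling pass is plausible: it keeps the matmul kernels oblivious to the quantization scheme's scale granularity.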