Kerfuffle comments

Results: 159 comments of Kerfuffle

Support for RWKV

Oh, nice. I look forward to ripping off their ide... I mean collaborating in the spirit of open source. *** This is probably poor etiquette but there's no ability to...

Support for RWKV

@saharNooby > Looks like our work is closely related by extending ggml, but diverges at actual implementation of the model -- you do it in Rust, I do it in...

Interesting. Is there a reason to implement those elementwise operations all separately instead of adding a generic elementwise map operation? The matrix multiplications matter so much with this, it's crazy....
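To illustrate the question above, here is a minimal sketch of a generic elementwise map in plain Rust over `f32` slices. It does not call ggml at all; `map_unary`, `sigmoid`, and `relu_squared` are hypothetical names chosen for this example, and the two activation functions are just stand-ins for the kind of per-element operations that would otherwise each need their own dedicated op.

```rust
/// Apply `f` to every element of `src`, writing the results into `dst`.
/// Hypothetical helper for illustration; not part of ggml.
fn map_unary(dst: &mut [f32], src: &[f32], f: impl Fn(f32) -> f32) {
    assert_eq!(dst.len(), src.len(), "source and destination must match in length");
    for (d, s) in dst.iter_mut().zip(src) {
        *d = f(*s);
    }
}

/// Logistic sigmoid, used here only as an example elementwise function.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Squared ReLU, another example elementwise function.
fn relu_squared(x: f32) -> f32 {
    let r = x.max(0.0);
    r * r
}
```

With one map operation, adding a new elementwise function means writing a small Rust function such as `sigmoid` and passing it in, e.g. `map_unary(&mut dst, &src, sigmoid)`, rather than adding a new operation to the library.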

Support for RWKV

I can't really help you with the C++ part. Come over to the Rust side! In seriousness though, you may well end up doing less work overall if you take...

Support for RWKV

It might be getting annoying, me writing so many comments here, but: I've been working on my Rust RWKV implementation and got 8-bit quantization working. I also managed to split it...

Support for RWKV

@saharNooby Uhhh... I basically cargo culted it from the official version so I don't know that I can give you a good answer here. See: 1. https://github.com/BlinkDL/ChatRWKV/blob/0d0abf181356c6f27501274cad18bdf28c83a45b/rwkv_pip_package/src/rwkv/model.py#L237 2. https://github.com/BlinkDL/ChatRWKV/blob/0d0abf181356c6f27501274cad18bdf28c83a45b/rwkv_pip_package/src/rwkv/model.py#L335 The...

Support for RWKV

I've been messing around trying to allow GGML to map arbitrary operations: https://github.com/KerfuffleV2/llama-rs/blob/5fd882035e95501d4127e30c84a838afbffcc95e/ggml/src/lib.rs#L207 This is what it looks like in use: https://github.com/KerfuffleV2/llama-rs/blob/5fd882035e95501d4127e30c84a838afbffcc95e/llama-rs/src/lib.rs#L1310 The first one is just replacing the `ggml_add` operation,...
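The idea of swapping an add operation for a mapped function can be sketched roughly as follows. This is plain Rust over slices and makes no claim about how the linked code or ggml actually does it; `BinaryKernel`, `map_binary`, and `add_kernel` are hypothetical names invented for this example.

```rust
/// A user-supplied kernel that fills `dst` from two equally sized inputs.
/// Hypothetical type for illustration only.
type BinaryKernel = fn(&mut [f32], &[f32], &[f32]);

/// Run `kernel` over `a` and `b` and return the freshly allocated result.
fn map_binary(a: &[f32], b: &[f32], kernel: BinaryKernel) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");
    let mut dst = vec![0.0; a.len()];
    kernel(&mut dst, a, b);
    dst
}

/// Elementwise addition written as a kernel, standing in for a built-in add.
fn add_kernel(dst: &mut [f32], a: &[f32], b: &[f32]) {
    for ((d, x), y) in dst.iter_mut().zip(a).zip(b) {
        *d = x + y;
    }
}
```

Calling `map_binary(&a, &b, add_kernel)` then gives the same values as a dedicated add, and a different kernel (elementwise max, for instance) can be dropped in without touching `map_binary`.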

Build and execute our own computation graph

Parallel loading of the model tensors