Kerfuffle

159 comments by Kerfuffle

> To clarify, I used the word "prompt" to mean a message that I send to the model (e.g. in your example, "Do the thing." is what I would call...

> Thanks for the clarification, I forgot about the fact that I tried `--interactive-first` and was able to do one prompt before the program started printing out non-stop. Like I...

Not sure if it helps, but I have a GGML-based Rust implementation here: https://github.com/KerfuffleV2/smolrsrwkv/blob/main/smolrwkv/src/ggml/graph.rs (that's just v4 inference). This is actually the reason I made my first contribution to the...

Is there a way to make RWKV's state stuff fit in with the current concept of sequences and KV cache manipulation? _Can_ you do parallel generation with multiple independent sequences?

If it's helpful, I asked some questions in the RWKV Discord:

***

**[2:06 AM] Kerfuffle:** This might be a pretty dumb question, but just thinking about how RWKV could fit...

> you can save RWKV state per n tokens. and you can save them to ram / hd.

I'm looking at it from the perspective of how it can be...

The fact that it is taking weeks to locate an Intel GPU at Intel is really not inspiring much confidence and AMD is coming out with a cheap 16GB VRAM...

> You can decide whatever you want to do but there are paths forward. This is only an issue with using the latest code and ComfyUI. Using the previous version...

If you got it from a source (the one that starts with an H), the first thing to do would be to check whether the SHA256 matches the file you downloaded. It's...
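A hash check like that can be sketched in a few lines of Python. This is only an illustration, not tied to any particular tool: the function names `sha256_of_file` and `matches_published_hash` are made up here, and the expected value is assumed to be the hex digest published alongside the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a file, reading it in chunks
    so that large model files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's digest with a published hex digest
    (hypothetical helper; ignores case and surrounding whitespace)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

If `matches_published_hash` returns `False`, the download is likely corrupted or is a different file from the one the hash was published for, so it is worth re-downloading before looking for other causes.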

This isn't currently expected to work with cuBLAS, correct? When attempting to run compiled with cuBLAS I get:

```plaintext
invalid backend: 33
GGML_ASSERT: ggml-tune.c:691: false
```

Please let me know...