Martin Evans
Thanks for confirming that. I'll do some more digging into this to see if I can turn up anything more.
I tried running the BoolQ dataset again, but this time asking each question in N parallel sequences. As far as I can tell this always produces the same answer across...
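For reference, the shape of that test was roughly this (a self-contained sketch with hypothetical helper names, not the actual harness):

```csharp
using System;
using System.Linq;

class ParallelAnswerCheck
{
    // Hypothetical stand-in for the real prompt + greedy-decode code.
    static string AskInFreshSequence(string question) => /* prompt + greedy decode */ "yes";

    static void Main()
    {
        const int N = 8;
        string question = "does an apple fall when dropped?";

        // Ask the same question in N independent sequences.
        string[] answers = Enumerable.Range(0, N)
                                     .Select(_ => AskInFreshSequence(question))
                                     .ToArray();

        // With greedy sampling all N answers should be identical.
        Console.WriteLine(answers.All(a => a == answers[0])
            ? "all sequences agree"
            : "sequences diverged");
    }
}
```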
As far as I know this issue is still relevant!
> The general rule is that all contributions need working tests, to ensure some level of quality for the code.

We discussed this in Discord and concluded there's not a...
I don't understand the purpose of this change. The idea of `Conversation` is to be a wrapper around the concept of a sequence and all the things you can do with...
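To illustrate what I mean by that, here's a simplified sketch of the idea (hypothetical field and method names, not the real class):

```csharp
using System.Collections.Generic;

// Simplified sketch (not the real class): a Conversation owns a single
// sequence id and exposes the things you can do to that sequence, so
// sequence/KV-cache bookkeeping never leaks out to the caller.
class Conversation
{
    private readonly int _sequenceId; // the wrapped llama.cpp sequence
    private int _endPos;              // current end position in the sequence

    public Conversation(int sequenceId) => _sequenceId = sequenceId;

    // Append prompt tokens to this sequence.
    public void Prompt(IReadOnlyList<int> tokens)
    {
        // ...queue (token, _sequenceId, position) triples into the batch...
        _endPos += tokens.Count;
    }
}
```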
> You could certainly do everything in Conversation

I'm not suggesting adding anything extra to `Conversation` - what I meant by that question is what can be built _with_ (i.e....
Before answering your questions, here's the way I'm thinking about things:

### Low Level

Just the functions and data structures that llama.cpp offers. Sometimes with a small layer of safety,...
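As a concrete example of that "small layer of safety" (a sketch; `llama_free` is a real llama.cpp function, but the wrapper type and library name here are illustrative):

```csharp
using System;
using System.Runtime.InteropServices;

// Illustration of the "small layer of safety" idea: the native
// llama.cpp context pointer is wrapped in a SafeHandle so it is
// freed exactly once, even if the caller forgets to dispose it.
internal sealed class NativeContextHandle : SafeHandle
{
    [DllImport("llama")]
    private static extern void llama_free(IntPtr ctx);

    public NativeContextHandle() : base(IntPtr.Zero, ownsHandle: true) { }

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle()
    {
        llama_free(handle); // single native call, guarded by the runtime
        return true;
    }
}
```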
> Text decoder (optional) p.s. not a streaming one

This has come up a few times in recent discussions but I don't think it is **possible** to have a non-streaming...
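The reason, concretely: tokens detokenize to raw bytes, and a single multi-byte UTF-8 character can be split across two tokens, so the decoder has to carry state between tokens. A minimal sketch using .NET's stateful `Decoder` (the token-to-bytes step is assumed, not the actual LLamaSharp API):

```csharp
using System.Text;

class StreamingDetokenizer
{
    // Stateful UTF-8 decoder: it buffers incomplete byte sequences
    // across calls, which is exactly why detokenization must stream.
    private readonly Decoder _decoder = Encoding.UTF8.GetDecoder();

    // `tokenBytes` is assumed to be the raw bytes of one token.
    public string Add(byte[] tokenBytes)
    {
        var chars = new char[Encoding.UTF8.GetMaxCharCount(tokenBytes.Length)];
        int written = _decoder.GetChars(tokenBytes, 0, tokenBytes.Length, chars, 0, flush: false);
        // `written` can be 0 when a token ends mid-character; the pending
        // bytes stay buffered until the next token completes them.
        return new string(chars, 0, written);
    }
}
```

An emoji, for example, is four UTF-8 bytes and frequently spans two tokens, so a "non-streaming" decoder would have to either corrupt the character boundary or secretly buffer state anyway.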
> There's two batches in this workflow

If I'm understanding this example correctly, I don't think two batches are necessary. When `DecodeAsync` is called the `LLamaBatch` is copied into buffers...
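A sketch of the pattern I'm describing (hypothetical stand-in types, not the real `LLamaBatch`/executor): once the pending tokens are copied into the executor's own buffer, the caller's batch can be cleared and refilled immediately, so a second batch object isn't needed.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the real types, just to show the pattern.
class Batch
{
    public List<int> Tokens { get; } = new();
    public void Clear() => Tokens.Clear();
}

class Executor
{
    private readonly List<int> _buffer = new();

    public void Decode(Batch batch)
    {
        // The batch contents are copied into internal buffers here,
        // so the caller is free to reuse `batch` straight away.
        _buffer.Clear();
        _buffer.AddRange(batch.Tokens);
        batch.Clear();
        // ...kick off the actual (async) decode against _buffer...
    }
}
```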
That's probably something that would be built into an app _on top of_ LLamaSharp, rather than directly into LLamaSharp itself.