Evan Jones
Yeah, I tried a version where it restored and appended to the saved prompt, but I didn't want to have to rely on the contents of the prompt cache. There's...
Got a PR up for persistent chat: #1495. Note that it depends on #1032, still open.
For the persistent chat script, I have a PR up at #1568 with docs on its usage. For `--prompt-cache` and `--prompt-cache-all`, the basic idea is to run `./main` with...
I'd also love to see this. `examples/chat-persistent.sh` invokes `main` for individual completions (rather than as a single interactive process) and resorts to hacks to extract generated text and token counts,...
Well, this is usable with `main` (as in the examples) as an input to `--grammar`. In general, I think it would be more complex to do in C++. And the...
Yeah, I thought based on the discussion that the JSON dependency meant that server had to be CMake-only and excluded from the Makefile. It does look like it's in the...
@slaren or @SlyEcho either of you interested in reviewing this?
There are two separate grammars here: `grammars/json.gbnf` is a standalone, sample grammar, while `examples/json-schema-to-grammar.py` stitches a grammar together dynamically based on a schema. I just opted to update the generic...
@ggerganov any interest in giving this a quick look?
Thanks! @SlyEcho I added support for the `integer` so that tutorial now runs up to the point that they split up the schemas: ``` % ./main -m $LLAMA2_13B_Q4_0 --grammar "$(...