Evan Jones
@ggerganov that makes sense, thanks! I did notice there was a long stream of zeroes in the files. @Priestru not sure, perhaps there's a higher cost to the initial token(s)...
LGTM! I'll hold off on the accept for now in case someone else has objections. One thought: `--prompt-cache-all` doesn't seem to make sense in conjunction with this new option; I...
Ah, you're right, sorry. This fell off my radar. Re: mmap, I think that's a reasonable direction. When I implemented the session/prompt cache I just didn't have the confidence in...
I poked at this a bit this morning and tried increasing the copy ctx size slightly, but that doesn't seem to be the issue. It does seem like the new tensor...
This seems to get the prompt cache working at least:
```
diff --git a/ggml.c b/ggml.c
index 34212b8..62ac19f 100644
--- a/ggml.c
+++ b/ggml.c
@@ -5975,12 +5975,12 @@ struct ggml_tensor * ggml_view_3d(...
```
I agree this is surprising. I believe the newline stripping happens [here](https://github.com/ggerganov/llama.cpp/blob/7d873811f31d4d8c909015c946a862c0089cda7d/examples/common.cpp#L146-L148) when handling the `--file` argument. My impression is this was done to simplify chat-style prompts stored in files,...
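For reference, a rough sketch of the kind of strip I mean (the exact code in `common.cpp` may differ; this is only to illustrate the behavior):

```cpp
// Illustrative sketch only -- the exact code in examples/common.cpp may differ.
// When a prompt is loaded via --file, a single trailing newline (which most
// editors append) is dropped so it doesn't leak into the prompt text.
#include <fstream>
#include <iterator>
#include <string>

std::string load_prompt_file(const std::string & path) {
    std::ifstream file(path);
    std::string prompt((std::istreambuf_iterator<char>(file)),
                       std::istreambuf_iterator<char>());
    if (!prompt.empty() && prompt.back() == '\n') {
        prompt.pop_back();  // drop the trailing newline
    }
    return prompt;
}
```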
This is super cool. As maybe a future direction, I've been wondering if things like repetition penalty, logit bias, this, and reverse prompt/stop words can all be generalized as something...
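To make the idea concrete, here's a very rough sketch (the names are made up; nothing like this exists in the tree today) of what a generalized hook could look like, with repetition penalty and logit bias both expressed against the same interface:

```cpp
// Hypothetical sketch -- none of these types exist in llama.cpp today.
// The idea: every sampling-time constraint is a function that inspects the
// tokens generated so far and adjusts the candidate logits before sampling.
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using llama_token = int32_t;

// A constraint sees the history and may modify the logits in place.
using logit_processor = std::function<void(const std::vector<llama_token> & history,
                                           std::vector<float> & logits)>;

// Repetition penalty expressed as one such processor.
logit_processor make_repetition_penalty(float penalty, size_t last_n) {
    return [=](const std::vector<llama_token> & history, std::vector<float> & logits) {
        size_t start = history.size() > last_n ? history.size() - last_n : 0;
        for (size_t i = start; i < history.size(); ++i) {
            float & l = logits[history[i]];
            l = l > 0 ? l / penalty : l * penalty;
        }
    };
}

// Static logit bias expressed the same way.
logit_processor make_logit_bias(std::vector<std::pair<llama_token, float>> bias) {
    return [bias = std::move(bias)](const std::vector<llama_token> &, std::vector<float> & logits) {
        for (const auto & [tok, b] : bias) {
            logits[tok] += b;
        }
    };
}

// The sampler would then run every registered processor before picking a token.
void apply_processors(const std::vector<logit_processor> & procs,
                      const std::vector<llama_token> & history,
                      std::vector<float> & logits) {
    for (const auto & p : procs) {
        p(history, logits);
    }
}
```

Reverse prompts / stop words and grammar-style constraints could plug into the same shape of interface, which is what makes the generalization appealing.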
@deep-pipeline LMQL looks really cool! Their [model serving process](https://docs.lmql.ai/en/stable/language/models.html#running-lmql-with-transformers) approach for local models would likely translate to `llama.cpp`.
Yeah, we punted on `--prompt-cache-all` in interactive mode because of the complexities of properly saving the session file on various exit paths. But it does support input in the sense...
At a basic level, the way to leverage this is to feed back the output of one call to `./main` as the prompt to the next call, optionally appending additional...
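Something along these lines (model path, file names, and prompt text are placeholders; the point is just the shape of the workflow):

```bash
# Sketch only -- paths and prompts are placeholders.
# First call: evaluate the initial prompt, generate some text, and save the
# evaluated state to a session file. --prompt-cache-all also stores the
# generated tokens so they can be reused as part of the next prompt.
./main -m models/7B/ggml-model.bin \
       --prompt-cache session.bin --prompt-cache-all \
       -f prompt.txt -n 64 > out.txt

# Next call: feed the previous output back as the new prompt, optionally with
# extra text appended. The shared prefix is loaded from session.bin instead of
# being re-evaluated.
cp out.txt prompt2.txt
echo "And then?" >> prompt2.txt
./main -m models/7B/ggml-model.bin \
       --prompt-cache session.bin \
       -f prompt2.txt -n 64
```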