Enable Prompt Caching by Default
Prompt caching had to be disabled because requests were getting stuck: #1994

We should re-enable it by default once there is a mitigation for the underlying llama.cpp inference issue: https://github.com/ggerganov/llama.cpp/issues/4989
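For context, the llama.cpp server exposes prompt caching per request via the `cache_prompt` field of the `/completion` payload. A minimal sketch of what re-enabling it could look like; the struct here is illustrative and is not ollama's actual request type:

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// completionRequest is an illustrative subset of the JSON payload sent to a
// llama.cpp server's /completion endpoint. The field names mirror the
// server's JSON keys, but this is not ollama's real request struct.
type completionRequest struct {
	Prompt      string `json:"prompt"`
	CachePrompt bool   `json:"cache_prompt"` // the default this issue proposes restoring
}

func main() {
	req := completionRequest{
		Prompt:      "Why is the sky blue?",
		CachePrompt: true, // re-enable prompt caching for this request
	}
	body, _ := json.Marshal(req)

	// Assumes a llama.cpp server listening locally; the address is illustrative.
	resp, err := http.Post("http://127.0.0.1:8080/completion", "application/json", bytes.NewReader(body))
	if err == nil {
		resp.Body.Close()
	}
}
```

With caching on, the server can reuse the KV cache for a shared prompt prefix across requests instead of re-evaluating it, which is the speedup this issue wants back once the upstream bug is mitigated.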