
update gguf backend to use Chat-completion API

Open · falkbene opened this issue 7 months ago

The response structure for the logprobs of the /completion API was changed here: https://github.com/ggml-org/llama.cpp/commit/57bb2c40cd94c5a09f5210ed8264cc93b21c4b7e. Furthermore, the completions API is now considered legacy (https://platform.openai.com/docs/guides/completions). This commit adapts the gguf backend to use the /chat/completions API and handles the new logprobs response structure correctly. It also resolves https://github.com/ggml-org/llama.cpp/issues/12591, where llama-server no longer recognized the echo parameter, since that parameter is not needed anymore with the chat endpoint.
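For reference, a minimal sketch of a chat-completions request with per-token logprobs (not the actual lm-eval code; the base URL, model name, and prompt are placeholders, and it assumes the server exposes an OpenAI-compatible /v1/chat/completions endpoint with the OpenAI chat logprobs schema):

```python
# Minimal sketch, not the harness implementation: query an OpenAI-compatible
# chat endpoint served by llama-server and read per-token logprobs.
import requests

BASE_URL = "http://localhost:8080/v1/chat/completions"  # placeholder URL

payload = {
    "model": "gguf-model",  # placeholder model name
    "messages": [{"role": "user", "content": "The capital of France is"}],
    "max_tokens": 1,
    "logprobs": True,
    "top_logprobs": 5,
    "temperature": 0.0,
}

resp = requests.post(BASE_URL, json=payload, timeout=60).json()

# In the chat schema, logprobs live under choices[0]["logprobs"]["content"]:
# one entry per generated token, each with its own top_logprobs list.
for tok in resp["choices"][0]["logprobs"]["content"]:
    print(tok["token"], tok["logprob"], [t["token"] for t in tok["top_logprobs"]])
```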

falkbene · Mar 28 '25

CLA assistant check
All committers have signed the CLA.

CLAassistant · Mar 28 '25

Hi! We should still keep the completions API as long as GGUF supports it. Otherwise we would have to chat-format the prompt for base models as well.
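To illustrate the concern (a hedged sketch, not code from either project; the prompt text and parameter values are placeholders): with the legacy completions endpoint the prompt string is sent verbatim, while the chat endpoint wraps it in messages and lets the server apply a chat template first, which changes what actually gets scored for a base model.

```python
# Illustrative only: the two request shapes under discussion.

# Legacy completions-style request: the prompt is passed through verbatim,
# which is what base-model (non-chat) loglikelihood evaluation relies on.
completions_payload = {
    "prompt": "The quick brown fox jumps over the lazy dog",
    "logprobs": 5,
    "temperature": 0.0,
}

# Chat-completions request: the text is wrapped in messages, and the server
# applies its chat template before tokenization, so the scored text differs.
chat_payload = {
    "messages": [
        {"role": "user", "content": "The quick brown fox jumps over the lazy dog"}
    ],
    "logprobs": True,
    "top_logprobs": 5,
    "temperature": 0.0,
}
```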

baberabb · Apr 04 '25

There is an issue with the current implementation, as I pointed out in the llama.cpp issue linked above. A llama-server built from the newest llama.cpp no longer supports the echo parameter, which the lm-eval gguf model file I modified still relies on. Furthermore, the logprobs response structure expected in lm_eval/models/gguf.py was also changed by a llama.cpp update (see the commit linked above). So the current gguf implementation of LM-Evaluation-Harness throws errors when I use it. My edits fix that, at least for the gguf file. We could also keep using the completions API, but we would need to adapt the expected response structure.
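One way to cope with either layout (a sketch only; it is not taken from lm_eval/models/gguf.py, and the fallback key names are assumptions based on the OpenAI legacy completions format rather than quoted from llama.cpp):

```python
# Illustrative helper: tolerate both an older flat logprobs layout and an
# OpenAI chat-style per-token layout when reading a server response.
def extract_token_logprobs(choice: dict) -> list[float]:
    lp = choice.get("logprobs") or {}
    # Chat-style layout: {"content": [{"token": ..., "logprob": ...}, ...]}
    if isinstance(lp.get("content"), list):
        return [t["logprob"] for t in lp["content"]]
    # Legacy completions-style layout: {"token_logprobs": [...]}
    if isinstance(lp.get("token_logprobs"), list):
        return lp["token_logprobs"]
    raise ValueError("Unrecognized logprobs layout in server response")
```

Something along these lines would let the harness keep the completions endpoint while tolerating the newer response shape.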

falkbene · Apr 06 '25

I think we could still use the completions API, but we would have to adapt to the response coming from the server, since its structure has changed.

falkbene · Apr 11 '25