llama.cpp
Study how LM Evaluation Harness works and try to implement it
It would be great to start doing this kind of quantitative analysis of ggml-based inference:
https://bellard.org/ts_server/
It looks like Fabrice evaluates the models using something called LM Evaluation Harness:
https://github.com/EleutherAI/lm-evaluation-harness
I have no idea what this is yet, but it would be nice to study it and try to integrate it here and in other ggml-based projects.
This will be a very important step toward estimating the quality of the generated output and seeing if we are on the right track.
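For context, a rough sketch of the kind of evaluation the harness performs: its core abstraction is a "loglikelihood" request, where the model scores a (context, continuation) pair, and multiple-choice tasks are graded by picking the option with the highest log-likelihood. The snippet below is only an illustrative toy, not the harness's actual code; `toy_loglikelihood` and its hard-coded score table are made up for demonstration, and a real ggml-based backend would instead run the model and sum the log-probs of the continuation tokens.

```python
def toy_loglikelihood(context: str, continuation: str) -> float:
    """Stand-in for a real model scorer (hypothetical).

    A real backend would tokenize context + continuation, run the
    model, and sum the log-probabilities of the continuation tokens.
    Here we just look up a fake score table for demonstration.
    """
    table = {"Paris": -0.5, "London": -2.0, "Berlin": -2.5}
    return table.get(continuation, -10.0)  # unknown options score low


def score_multiple_choice(context: str, options: list[str]) -> str:
    """Pick the answer option with the highest log-likelihood,
    mirroring how multiple-choice benchmarks are typically scored."""
    scores = [toy_loglikelihood(context, opt) for opt in options]
    return options[scores.index(max(scores))]


answer = score_multiple_choice(
    "The capital of France is", ["Paris", "London", "Berlin"])
print(answer)  # -> Paris
```

Integrating with the harness would then mostly mean implementing this scoring interface on top of ggml inference, so that existing benchmark tasks can run against it unchanged.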