llama.cpp

Study how LM Evaluation Harness works and try to implement it

Open ggerganov opened this issue 1 year ago • 2 comments

It would be great to start doing this kind of quantitative analysis of ggml-based inference:

https://bellard.org/ts_server/

It looks like Fabrice evaluates the models using something called LM Evaluation Harness:

https://github.com/EleutherAI/lm-evaluation-harness

I have no idea what this is yet, but it would be nice to study it and try to integrate it here and in other ggml-based projects. This will be a very important step for estimating the quality of the generated output and seeing whether we are on the right track.
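
For reference, here is a minimal sketch of the kind of measurement involved, assuming the common multiple-choice setup in which each candidate answer is scored by the log-likelihood the model assigns to it given the prompt, and the highest-scoring candidate is taken as the prediction. Everything named here (`pick_choice`, `accuracy`, `toy_loglik`, the example data) is hypothetical and is not part of llama.cpp or of LM Evaluation Harness; `toy_loglik` is a stand-in that would have to be replaced by a real ggml-backed scorer.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical scorer signature: returns log P(continuation | context).
# A real integration would compute this from the model's token logits.
LogLikelihoodFn = Callable[[str, str], float]


def pick_choice(loglik: LogLikelihoodFn, context: str, choices: Sequence[str]) -> int:
    """Return the index of the choice with the highest log-likelihood."""
    scores = [loglik(context, choice) for choice in choices]
    # max() keeps the first index on ties.
    return max(range(len(choices)), key=scores.__getitem__)


def accuracy(loglik: LogLikelihoodFn, examples: List[Dict]) -> float:
    """Fraction of examples where the top-scoring choice is the labeled answer."""
    correct = sum(
        pick_choice(loglik, ex["context"], ex["choices"]) == ex["answer"]
        for ex in examples
    )
    return correct / len(examples)


def toy_loglik(context: str, continuation: str) -> float:
    """Stand-in scorer (NOT a language model): favors continuations whose
    words already appear in the context."""
    seen = set(context.lower().split())
    words = continuation.lower().split()
    overlap = sum(word in seen for word in words)
    return math.log((1 + overlap) / (1 + len(words)))


# Made-up examples, only to exercise the functions above.
EXAMPLES = [
    {"context": "the sky is blue so the color of the sky is",
     "choices": ["blue", "green"], "answer": 0},
    {"context": "two plus two is four so two plus two equals",
     "choices": ["five", "four"], "answer": 1},
    {"context": "ice is cold and fire is",
     "choices": ["cold", "hot"], "answer": 1},
]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(toy_loglik, EXAMPLES):.3f}")
```

The scorer is passed in as a plain function so that the same evaluation loop could be reused across different ggml-based projects; only the function that produces log-likelihoods would differ.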

ggerganov, Mar 17 '23 08:03