
epic: Benchmarking existing good models

Open hiro-v opened this issue 1 year ago • 0 comments

Problem

  • As a day-to-day model user, I find it hard to explain and share with my friends which models are good to use, especially when running them with Nitro

Success Criteria

  • A public markdown comparison on the Nitro page. It can follow this leaderboard as a reference but can be a lot simpler: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
  • The performance metrics should be generated using https://github.com/ray-project/llmperf as the de facto measurement tool and reported as in the table below (image: Screenshot 2023-11-23 at 01 40 03)
  • The perplexity metrics should be measured with this tool and reported as in the table below: https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#perplexity-measuring-model-quality (image: Screenshot 2023-11-23 at 01 41 14)
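
For reference when reading the perplexity numbers: perplexity is the exponential of the mean negative log-likelihood per token, so lower is better. The sketch below only illustrates that definition. It assumes per-token natural-log probabilities are already available, and the function name `perplexity` is hypothetical, not part of llama.cpp or Nitro.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    observed token (assumed input, not produced by this sketch).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)
```

A model that assigns probability 0.25 to every token has a perplexity of 4, i.e. it is as uncertain as a uniform choice among four tokens.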

Sub Issues

  • To be updated

Additional context

  • Each result should come with the OS, CPU architecture, RAM, model name, and GPU (Metal, NVIDIA GPU, etc.)
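
One possible way to capture these fields alongside each result is sketched below. It is a rough illustration using only the Python standard library. The helper names (`total_ram_gib`, `environment_row`, `to_markdown`) and the column names are hypothetical, the RAM lookup is best effort and returns `None` on platforms without `os.sysconf`, and the GPU and model name are passed in by hand rather than detected.

```python
import os
import platform


def total_ram_gib():
    """Best-effort physical RAM in GiB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. Windows) or the name is unsupported.
        return None
    if pages < 0 or page_size < 0:
        return None
    return round(pages * page_size / 2**30, 1)


def environment_row(model_name, gpu):
    """Collect the environment fields for one benchmark result."""
    return {
        "os": f"{platform.system()} {platform.release()}",
        "cpu_arch": platform.machine(),
        "ram_gib": total_ram_gib(),
        "model": model_name,  # supplied by the person running the benchmark
        "gpu": gpu,           # supplied by hand, e.g. "Metal" or "NVIDIA GPU"
    }


def to_markdown(rows):
    """Render a list of row dicts as a markdown table."""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)
```

Calling `to_markdown([environment_row("example-model", "Metal")])` would produce a one-row table that could be pasted next to the performance and perplexity numbers. `"example-model"` is a placeholder name.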

hiro-v, Nov 22 '23 18:11