can-ai-code
Guide on how to evaluate models
I'm willing to test a few models and share the results. I've looked at the README, but couldn't wrap my head around how to benchmark a model. Any help would be appreciated!
The docs definitely need a rewrite; my apologies.
The general flow is:
- `prepare.py`
- `interview*.py`
- `eval.py`
In the dark days we had to deal with dozens of prompt formats, but these days `prepare.py` can be run with `--chat <hf-model>` and it will sort out the prompt format for you.
Note that there are two interviews, junior-v2 and senior; I usually only run senior on strong models that score >90% on junior.
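If it helps, here is a rough end-to-end sketch of a junior-v2 run. Apart from `--chat`, which is mentioned above, the specific script name, flags, and file paths below are assumptions, so check each script's `--help` (and the README) before running:

```bash
# Hypothetical walkthrough; flag names and paths are assumptions, verify with --help.

# 1. Render the junior-v2 prompts using the model's own chat template
python3 prepare.py --chat <hf-model-id>

# 2. Run the interview; pick whichever interview*.py matches your backend
#    (e.g. a local GPU, llama.cpp, or a hosted API)
python3 interview_cuda.py --input <prepared-prompts-file> --model <hf-model-id>

# 3. Grade the transcripts
python3 eval.py --input <interview-results-file>
```

Once a model clears ~90% on junior-v2, repeat the same three steps with the senior interview.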