alpaca_eval
Possibility of adding Llama3-70b as the evaluator?
GPT-4 gets very expensive when we have to run hundreds of experiments for scientific studies. Have you tried using Llama3-70b, which performs comparably to older versions of GPT-4, as the evaluator, and measured how well its judgments correlate with human ones? It would be a more affordable replacement for GPT-4, given that Llama3-70b on Together AI costs only 1/12 the price of GPT-4.
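To make the cost argument concrete, here is a rough sketch of the arithmetic. Only the 1/12 price ratio comes from the numbers above. The function name `evaluation_budget`, the experiment count, and the per-run GPT-4 cost are hypothetical placeholders, and the sketch assumes both evaluators process the same number of tokens per run.

```python
from fractions import Fraction

# Price ratio quoted above: Llama3-70b on Together AI at 1/12 the price of GPT-4.
PRICE_RATIO = Fraction(1, 12)


def evaluation_budget(num_experiments, gpt4_cost_per_run):
    """Return (GPT-4 total, Llama3-70b total, savings) for a batch of runs.

    Assumes the same token usage per run for both evaluators, so the
    Llama3-70b total is simply the GPT-4 total scaled by PRICE_RATIO.
    """
    gpt4_total = num_experiments * gpt4_cost_per_run
    llama_total = gpt4_total * PRICE_RATIO
    return gpt4_total, llama_total, gpt4_total - llama_total


# Hypothetical inputs: 300 experiments at an assumed $12 per GPT-4 evaluation run.
gpt4_total, llama_total, savings = evaluation_budget(300, 12)
print(f"GPT-4 evaluator:      ${float(gpt4_total):,.2f}")
print(f"Llama3-70b evaluator: ${float(llama_total):,.2f}")
print(f"Savings:              ${float(savings):,.2f}")
```

Plugging in your own per-run cost and experiment count gives a quick estimate of what switching evaluators would save, though it says nothing about whether the cheaper evaluator agrees with humans as well as GPT-4 does.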