
Add Llama 2 as model evaluated?

Open tiansiyuan opened this issue 1 year ago • 3 comments

tiansiyuan avatar Aug 10 '23 08:08 tiansiyuan

Could you please be more specific? Where should we add this model?

jindongwang avatar Aug 29 '23 03:08 jindongwang

In the paper, Llama is mentioned twice, both on page 6.

The first one is from a paper (Saparov et al., 2023), so just keep it.

The second one,

"Moreover, LLaMA-65B is the most robust open-source LLMs to date, which performs closely to code-davinci-002."

could be replaced by

"Moreover, LLAMA 2 70B is the most robust open-source LLM to date, performing very closely to GPT-3.5 and PaLM. However, there is still a large gap in performance between LLAMA 2 70B and GPT-4 or PaLM-2-L (Touvron et al., 2023)."

As code-davinci-002 is a code-generation model derived from GPT-3, I think it is not appropriate to compare it with a pretrained model such as LLaMA. Just for your consideration.

Also, I'd suggest adding the following paper as a reference:

Llama 2: Open Foundation and Fine-Tuned Chat Models

tiansiyuan avatar Sep 01 '23 02:09 tiansiyuan

Thanks for the detailed suggestion! We'll update the paper accordingly.

jindongwang avatar Sep 01 '23 02:09 jindongwang