
Why is this better than llama in some instances?

rick2047 opened this issue 2 years ago · 1 comment

I was going through the README and noticed that this model performs better than the 7B LLaMA on many benchmarks, even though it's trained on a fifth of the tokens (200B vs. 1T). Does anyone understand how this happened?

rick2047 · May 04 '23

Probably GIGO (Garbage In, Garbage Out): the two models are trained on different datasets, so differences in data quality could explain the gap.

ClaudeCoulombe · May 04 '23