llama2.c

Evolution of tinystories. Open sourced.

Open · xefoci7612 opened this issue on Sep 12, 2023 · 3 comments

Textbooks Are All You Need II: phi-1.5 technical report

We follow the “Textbooks Are All You Need” approach, focusing this time on common sense reasoning in natural language, and create a new 1.3 billion parameter model named phi-1.5, with performance on natural language tasks comparable to models 5x larger, and surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding. More generally, phi-1.5 exhibits many of the traits of much larger LLMs, both good (such as the ability to “think step by step” or perform some rudimentary in-context learning) and bad, including hallucinations and the potential for toxic and biased generations (encouragingly though, we are seeing improvement on that front thanks to the absence of web data). We open-source phi-1.5 to promote further research on these urgent topics.

"We hope that phi-1.5’s size will make experimentation easier than with larger open-source models such as the Llama family"

xefoci7612 · Sep 12, 2023

What timing... I just read about Toolformer yesterday: https://arxiv.org/pdf/2302.04761.pdf

What is really interesting is Figure 4: it suggests that somewhere between 750M and 1B parameters, models become capable enough to figure out how to use tools (a toy sketch of the mechanism is below).
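
To make that concrete, here is a toy sketch of the mechanism Toolformer trains for: the model emits inline API calls such as `[Calculator(400 / 1400)]` in its text, and a harness executes them and splices the results back in. The regex, the executor, and the result formatting here are my own illustrative assumptions, not the paper's actual code:

```python
import re

# Toy harness for Toolformer-style inline tool calls.
# The model is assumed to emit markers like "[Calculator(400 / 1400)]";
# this helper evaluates them and splices the result back into the text.
# Regex and result formatting are illustrative, not from the paper.

def run_calculator(expr: str) -> str:
    # Only allow simple arithmetic; a real system would use a proper parser.
    if not set(expr) <= set("0123456789+-*/(). "):
        raise ValueError(f"unsupported expression: {expr!r}")
    return str(round(eval(expr), 2))

def fill_tool_calls(text: str) -> str:
    # Rewrite "[Calculator(expr)]" as "[Calculator(expr) -> result]".
    pattern = re.compile(r"\[Calculator\(([^)]*)\)\]")
    return pattern.sub(
        lambda m: f"[Calculator({m.group(1)}) -> {run_calculator(m.group(1))}]",
        text,
    )

print(fill_tool_calls("Out of 1400 participants, 400 passed: [Calculator(400 / 1400)]."))
# -> Out of 1400 participants, 400 passed: [Calculator(400 / 1400) -> 0.29].
```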

I wonder what happens when you combine the lessons of both papers, and how far we can push ~1B-parameter models.

SpaceCowboy850 · Sep 12, 2023

What are you trying to say? How is this an issue?

dfurrer · Sep 17, 2023

Btw.

  1. The model + weights are open source, but not the dataset, which makes it a lot less interesting as a learning example
  2. There are actually some questions about data contamination (https://twitter.com/suchenzang/status/1701615026648605095)

dfurrer · Sep 17, 2023