llama2.c
Inference Llama 2 in one file of pure C
Thanks for the amazing, clean library. I had a few issues running it on MPS, so I thought I'd share how I got it to work. May not be suitable for...
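(The post is truncated, so the actual fix is missing. As a rough illustration only, getting the PyTorch side running on Apple silicon usually starts with device selection along these lines; the dtype choice and fallbacks below are my assumptions, not the poster's.)

```
# Hypothetical sketch, NOT the poster's actual fix (their post is truncated):
# select the MPS backend when available and fall back otherwise.
import torch

if torch.backends.mps.is_available():
    device = "mps"
    dtype = torch.float32  # float16/bfloat16 autocast support on MPS is limited
elif torch.cuda.is_available():
    device, dtype = "cuda", torch.bfloat16
else:
    device, dtype = "cpu", torch.float32

x = torch.randn(4, 4, device=device, dtype=dtype)
print(x.device, x.dtype)
```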
[Textbooks Are All You Need II: phi-1.5 technical report](https://arxiv.org/abs/2309.05463) **We follow the “Textbooks Are All You Need” approach, focusing this time on common sense reasoning in natural language**, and create...
In the training script, after forwarding the data to the model, there's [a line of code](https://github.com/karpathy/llama2.c/blob/766a30bc6e9a1c69ce007bb69caabf4c6062f0e9/train.py#L308) that says: `# immediately async prefetch next batch while model is doing the...`
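For context, here is a minimal sketch of that prefetch pattern (the toy model and shapes are my assumptions, not the repo's exact code): the batch loader pins host memory and copies with `non_blocking=True`, so requesting the next batch right after the forward pass overlaps the host-to-device transfer with GPU work.

```
# Minimal sketch of the prefetch pattern (toy model and shapes are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
vocab_size, block_size, batch_size = 256, 64, 8
data = torch.randint(vocab_size, (10_000,))  # stand-in token stream

def get_batch():
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    if device == "cuda":
        # pinned memory + non_blocking=True makes the copy asynchronous,
        # so it can overlap with kernels already queued on the GPU
        return (x.pin_memory().to(device, non_blocking=True),
                y.pin_memory().to(device, non_blocking=True))
    return x.to(device), y.to(device)

model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size)).to(device)
opt = torch.optim.AdamW(model.parameters())

X, Y = get_batch()
for step in range(100):
    loss = F.cross_entropy(model(X).view(-1, vocab_size), Y.view(-1))
    # fetch the next batch now: the CPU indexing and async H2D copy run
    # while the GPU is still busy with the backward pass below
    X, Y = get_batch()
    loss.backward()
    opt.step()
    opt.zero_grad(set_to_none=True)
```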
https://github.com/tairov/llama2.mojo Imagine a 250x speedup over the original...
I trained and LoRA-fine-tuned the models (inspired by wlamond's PR) to follow instructions and write tiny stories accordingly, with the prompt data available. [Repo](https://github.com/cindysridykhan/instruct_storyteller_tinyllama2) [blogpost](https://medium.com/@cindy.sridykhan/how-to-train-and-fine-tune-an-instruct-llama2-model-in-pytorch-6cbe11de2b34) Demo:  Any feedback...
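For readers unfamiliar with the technique, here is a minimal LoRA adapter sketch (class and hyperparameter names are mine, not the linked repo's): the pretrained weight is frozen and only a low-rank update B·A is trained.

```
# Minimal LoRA sketch (illustrative; the linked repo's implementation differs).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        # frozen base projection plus the trainable low-rank update B @ A
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

Typically only a few projections (e.g. the attention query/value layers) get wrapped this way, which keeps the trainable parameter count tiny compared to full fine-tuning.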
I started looking into whether the small models could be a target for interpretability. I'm putting it out here for lack of a better space to find people wanting...
Dear llama2.c developer, Greetings! I am vansinhu, a community developer and volunteer at InternLM. [InternLM](https://github.com/InternLM/InternLM) is a large language model similar to llama2, and we look forward to InternLM being...
Hi, thank you for the fantastic repo. I recently picked up an interest in FSDP. Is it possible to adapt this model to FSDP? If so, what are the things that...
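Not an answer from the author, but a rough sketch of what adapting train.py to FSDP could look like. `Transformer`, `ModelArgs`, and `TransformerBlock` are the repo's module names as I understand them; everything else (launch setup, wrap policy, learning rate) is an assumption.

```
# Rough sketch only; assumes a torchrun launch and the repo's model.py classes.
import functools
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

from model import Transformer, ModelArgs, TransformerBlock  # assumed: repo's model.py

dist.init_process_group("nccl")  # launch with: torchrun --nproc_per_node=N train.py
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = Transformer(ModelArgs())
# shard parameters, gradients, and optimizer state at per-layer granularity
policy = functools.partial(transformer_auto_wrap_policy,
                           transformer_layer_cls={TransformerBlock})
model = FSDP(model, auto_wrap_policy=policy, device_id=torch.cuda.current_device())
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
```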
The new export code instantiates a Transformer() model, so it needs double the memory (float32 weights vs. the bfloat16 weights in the Meta checkpoints). The old Llama-only exporter code can easily...
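To make the doubling concrete: a 7B checkpoint is roughly 14 GB in bfloat16 (2 bytes/param), while a float32 Transformer() instance of it is roughly 28 GB (4 bytes/param). Below is a hedged sketch of the tensor-at-a-time alternative; the output format and iteration order are placeholders, not the repo's actual exporter.

```
# Illustrative workaround, not the repo's exporter: convert one tensor at a
# time so peak memory stays near the checkpoint's bfloat16 size.
import torch

ckpt = torch.load("consolidated.00.pth", map_location="cpu")  # Meta weights, bfloat16

with open("model.bin", "wb") as f:
    for name, t in ckpt.items():
        # upcast a single tensor, write it, and let it be freed before the next
        t.to(torch.float32).numpy().tofile(f)
```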
I rewrote the code of the most critical function, `matmul()`, and it runs about 3.5-4x faster than the original (on a Mac with an ARM M1 Max). Maybe it...