llama2.c

Inference Llama 2 in one file of pure C

Results 146 llama2.c issues

Clone of llama2.c but updated to work with Llama 3.2 1B/3B base and instruct

The weights are natively bfloat16. Rather than convert them into float, you could just keep them as bfloat16 and convert between float and bfloat16 on the fly using a union...
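
The on-the-fly conversion suggested above can be sketched in C as follows. This is a minimal illustration, not code from the repository: the `bf16_t` alias and both function names are hypothetical, and it assumes 32-bit IEEE 754 `float`. A bfloat16 value is the upper 16 bits of a float32, so widening is a shift; narrowing here uses round-to-nearest-even, which is one common choice among several.

```c
#include <stdint.h>

/* Hypothetical alias: raw bfloat16 bits stored in a 16-bit integer. */
typedef uint16_t bf16_t;

/* Widen bfloat16 to float: the 16 bits become the high half of a float32. */
static float bf16_to_float(bf16_t h) {
    union { uint32_t u; float f; } v;
    v.u = (uint32_t)h << 16;
    return v.f;
}

/* Narrow float to bfloat16 with round-to-nearest-even (an assumed policy). */
static bf16_t float_to_bf16(float f) {
    union { uint32_t u; float f; } v;
    v.f = f;
    /* NaN: keep the top bits and force a quiet-NaN mantissa bit so that
       rounding cannot turn it into infinity. */
    if ((v.u & 0x7fffffffu) > 0x7f800000u) {
        return (bf16_t)((v.u >> 16) | 0x0040u);
    }
    /* Add 0x7fff plus the lowest kept bit, then truncate. */
    uint32_t rounding = 0x7fffu + ((v.u >> 16) & 1u);
    return (bf16_t)((v.u + rounding) >> 16);
}
```

Values that are exactly representable in bfloat16, such as 1.0f or -2.5f, round-trip unchanged; other values lose their low 16 mantissa bits.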

Why is the termination condition of the `generate` function `next = 1` (BOS) instead of `next = 2` (EOS)?

Hi, I believe that the bias is not removed in the quantize() function. This would be necessary to have a symmetric Q8_0 quantization of activations. Is that not needed? ...
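
For context on what "symmetric" means in the question above, here is a minimal sketch of group-wise absmax int8 quantization. It is an illustration under stated assumptions, not the repository's `quantize()`: the function name, the signature, and the rounding helper are hypothetical. In this scheme each group is scaled by its largest absolute value and no offset (zero point) is subtracted, so 0.0 always maps to 0.

```c
#include <stdint.h>

/* Hypothetical helper: round half away from zero without needing libm. */
static int8_t round_to_int8(float v) {
    return (int8_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Sketch of symmetric quantization: x[n] -> q[n] plus one scale per group.
   Assumes n is a multiple of group_size. No mean or bias is removed. */
static void quantize_q8_sym(const float *x, int8_t *q, float *scale,
                            int n, int group_size) {
    int num_groups = n / group_size;
    for (int g = 0; g < num_groups; g++) {
        const float *xg = x + g * group_size;
        int8_t *qg = q + g * group_size;

        /* Largest absolute value in this group. */
        float amax = 0.0f;
        for (int i = 0; i < group_size; i++) {
            float a = xg[i] < 0.0f ? -xg[i] : xg[i];
            if (a > amax) amax = a;
        }

        /* Map [-amax, amax] onto [-127, 127]. */
        float s = amax / 127.0f;
        scale[g] = s;
        for (int i = 0; i < group_size; i++) {
            qg[i] = (s > 0.0f) ? round_to_int8(xg[i] / s) : 0;
        }
    }
}
```

Dequantization in this sketch is `q[i] * scale[g]`. Whether subtracting the group mean first would help depends on how skewed the activations are, which is the point the issue raises.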

Is export.py only intended for models used by run.c? I use it to export an HF model to a model.bin, but it doesn't work when I use it in train.py,...