llama2.c
Inference Llama 2 in one file of pure C
Added `--version 3` to the parameters; it behaves like `legacy_export` but exports the data in fp16 format. ### Warning! Needs testing. I have too little memory (64GB are needed for...
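As a rough sketch of what an fp16 export of a weight tensor involves (the function name `serialize_fp16` is illustrative, not the actual export code), the fp32 weights are narrowed to IEEE half precision before being written out:

```python
import numpy as np

def serialize_fp16(w) -> bytes:
    """Illustrative sketch: narrow fp32 weights to fp16 and serialize.

    Note: values outside the fp16 range (|x| > ~65504) overflow to inf,
    which is one reason such an export needs testing on real checkpoints.
    """
    f32 = np.asarray(w, dtype=np.float32)
    return f32.astype(np.float16).tobytes()
```

The resulting buffer is half the size of the fp32 serialization, at the cost of reduced precision and range.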
[This is the same optimization proposed by PR#253 on the new codebase.] I believe the gain is substantial, especially when the user prompt is quite large, like...
### Extract dataset functionality for easy extensibility. Summary of changes: 1. Added `dataset.py` with a `Dataset` base class that encapsulates downloading files and iterating over the examples in them. There are 3 methods...
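A minimal sketch of what such a base class could look like (the method names, the `JsonlDataset` subclass, and the file layout here are assumptions for illustration, not the actual `dataset.py` API):

```python
import glob
import json
import os
from abc import ABC, abstractmethod

class Dataset(ABC):
    """Sketch of a dataset base class: subclasses define how to download
    raw files and how to iterate over the examples inside one file."""

    def __init__(self, cache_dir: str):
        self.cache_dir = cache_dir

    @abstractmethod
    def download(self) -> None:
        """Fetch the raw data files into self.cache_dir."""

    @abstractmethod
    def iter_examples(self, filepath: str):
        """Yield one training example at a time from a single file."""

    def __iter__(self):
        # Iterate every example across every cached shard, in order.
        for fp in sorted(glob.glob(os.path.join(self.cache_dir, "*.jsonl"))):
            yield from self.iter_examples(fp)

class JsonlDataset(Dataset):
    """Example subclass: one JSON object per line, with a "text" field."""

    def download(self) -> None:
        pass  # assume the shards are already on disk

    def iter_examples(self, filepath: str):
        with open(filepath) as f:
            for line in f:
                yield json.loads(line)["text"]
```

With this split, adding a new corpus only requires a new subclass that knows its own download URL and file format; the iteration and caching logic is shared.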
See https://github.com/karpathy/llama2.c/issues/332
Added an export method that saves the matrix weights as bfloat16 while saving the rest as fp32. The `--version` value must be set to -1 (preliminary). Only the legacy export...
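Since bfloat16 is simply the upper 16 bits of an IEEE fp32 value (same exponent width, shorter mantissa), a conversion can be sketched as a bit-level truncation. This is an illustrative sketch, not the repository's export code, and it truncates rather than rounding to nearest even, which real exporters may prefer:

```python
import numpy as np

def to_bfloat16_bytes(w) -> bytes:
    """Illustrative sketch: keep the upper 16 bits of each fp32 value.

    bfloat16 shares fp32's 8-bit exponent, so dropping the low 16
    mantissa bits preserves the full fp32 dynamic range while halving
    the storage. Truncation (no rounding) is used here for simplicity.
    """
    f32 = np.ascontiguousarray(w, dtype=np.float32)
    u16 = (f32.view(np.uint32) >> 16).astype(np.uint16)
    return u16.tobytes()
```

Keeping only the matrix weights in bfloat16 (with norms and other small tensors in fp32) roughly halves the checkpoint size while limiting precision loss to the large matrices.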
Add support for tinyllama-1.1B. Add support for converting GQA models (learned from https://github.com/ggerganov/llama.cpp/pull/3364). Better run.c: save a little memory (same as #400), make the RoPE part a function, hardcode...
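The core of grouped-query attention (GQA) support is that several query heads share a single key/value head, so the inference loop must map each query-head index to its KV-head index. A minimal sketch of that mapping (the function name is illustrative; tinyllama-1.1B uses 32 query heads over 4 KV heads):

```python
def gqa_kv_head_map(n_heads: int, n_kv_heads: int) -> list[int]:
    """For each query head, return the index of the KV head it reads.

    In GQA, consecutive groups of (n_heads // n_kv_heads) query heads
    share one KV head; n_kv_heads == n_heads recovers standard
    multi-head attention, n_kv_heads == 1 recovers multi-query attention.
    """
    assert n_heads % n_kv_heads == 0, "query heads must divide evenly"
    group_size = n_heads // n_kv_heads
    return [h // group_size for h in range(n_heads)]
```

Because only `n_kv_heads` K/V projections are stored, the KV cache shrinks by a factor of `n_heads / n_kv_heads` (8x for tinyllama-1.1B), which is what makes GQA models attractive for small-memory inference.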
Is it possible to use Orca2 with this code? Thanks.