llama2.c
Inference Llama 2 in one file of pure C
Added `--version 3` to the parameters; it behaves like `legacy_export` but exports the data in fp16 format. ### Warning! Needs testing. I have too little memory (64GB are needed for...
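As a rough sketch of what an fp16 export of a weight tensor involves (the function name `serialize_fp16` is illustrative, not the actual export code), the fp32 weights are narrowed to IEEE half precision before being written out:

```python
import numpy as np

def serialize_fp16(w) -> bytes:
    """Illustrative sketch: narrow fp32 weights to fp16 and serialize.

    Note: values outside the fp16 range (|x| > ~65504) overflow to inf,
    which is one reason such an export needs testing on real checkpoints.
    """
    f32 = np.asarray(w, dtype=np.float32)
    return f32.astype(np.float16).tobytes()
```

The resulting buffer is half the size of the fp32 serialization, at the cost of reduced precision and range.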
[This is the same optimization proposed by PR#253 on the new codebase.] I believe the gain is substantial, especially when the user prompt is quite large, like...
### Extract dataset functionality for easy extensibility. Summary of changes: 1. Added `dataset.py` with a `Dataset` base class that encapsulates downloading files and iterating over the examples in them. There are 3 methods...
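A minimal sketch of what such a base class could look like (the method names, the `JsonlDataset` subclass, and the file layout here are assumptions for illustration, not the actual `dataset.py` API):

```python
import glob
import json
import os
from abc import ABC, abstractmethod

class Dataset(ABC):
    """Sketch of a dataset base class: subclasses define how to download
    raw files and how to iterate over the examples inside one file."""

    def __init__(self, cache_dir: str):
        self.cache_dir = cache_dir

    @abstractmethod
    def download(self) -> None:
        """Fetch the raw data files into self.cache_dir."""

    @abstractmethod
    def iter_examples(self, filepath: str):
        """Yield one training example at a time from a single file."""

    def __iter__(self):
        # Iterate every example across every cached shard, in order.
        for fp in sorted(glob.glob(os.path.join(self.cache_dir, "*.jsonl"))):
            yield from self.iter_examples(fp)

class JsonlDataset(Dataset):
    """Example subclass: one JSON object per line, with a "text" field."""

    def download(self) -> None:
        pass  # assume the shards are already on disk

    def iter_examples(self, filepath: str):
        with open(filepath) as f:
            for line in f:
                yield json.loads(line)["text"]
```

With this split, adding a new corpus only requires a new subclass that knows its own download URL and file format; the iteration and caching logic is shared.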
See https://github.com/karpathy/llama2.c/issues/332
Added an export method that saves the matrix weights as bfloat16 while saving the rest as fp32. The `--version` value must be set to -1 (preliminary). Only the legacy export...
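Since bfloat16 is simply the upper 16 bits of an IEEE fp32 value (same exponent width, shorter mantissa), a conversion can be sketched as a bit-level truncation. This is an illustrative sketch, not the repository's export code, and it truncates rather than rounding to nearest even, which real exporters may prefer:

```python
import numpy as np

def to_bfloat16_bytes(w) -> bytes:
    """Illustrative sketch: keep the upper 16 bits of each fp32 value.

    bfloat16 shares fp32's 8-bit exponent, so dropping the low 16
    mantissa bits preserves the full fp32 dynamic range while halving
    the storage. Truncation (no rounding) is used here for simplicity.
    """
    f32 = np.ascontiguousarray(w, dtype=np.float32)
    u16 = (f32.view(np.uint32) >> 16).astype(np.uint16)
    return u16.tobytes()
```

Keeping only the matrix weights in bfloat16 (with norms and other small tensors in fp32) roughly halves the checkpoint size while limiting precision loss to the large matrices.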
Add support for tinyllama-1.1B. Add support for converting GQA models (learned from https://github.com/ggerganov/llama.cpp/pull/3364). Better run.c: save a little memory (same as #400), make the RoPE part a function, hardcode...
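The core of grouped-query attention (GQA) support is that several query heads share a single key/value head, so the inference loop must map each query-head index to its KV-head index. A minimal sketch of that mapping (the function name is illustrative; tinyllama-1.1B uses 32 query heads over 4 KV heads):

```python
def gqa_kv_head_map(n_heads: int, n_kv_heads: int) -> list[int]:
    """For each query head, return the index of the KV head it reads.

    In GQA, consecutive groups of (n_heads // n_kv_heads) query heads
    share one KV head; n_kv_heads == n_heads recovers standard
    multi-head attention, n_kv_heads == 1 recovers multi-query attention.
    """
    assert n_heads % n_kv_heads == 0, "query heads must divide evenly"
    group_size = n_heads // n_kv_heads
    return [h // group_size for h in range(n_heads)]
```

Because only `n_kv_heads` K/V projections are stored, the KV cache shrinks by a factor of `n_heads / n_kv_heads` (8x for tinyllama-1.1B), which is what makes GQA models attractive for small-memory inference.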
Is it possible to use Orca2 with this code? Thanks.