Andrej comments

Results: 373 comments of Andrej

Need train_multi.py example to show use with multiple input files

When you're training you don't care about special tokens usually, to create training data you'd directly insert the token ids as integers in between documents instead of changing the text...
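A minimal sketch of that idea, assuming a stand-in tokenizer (`toy_encode`) and a separator id (`BOS_ID = 1`). Both are hypothetical and chosen only for illustration, not taken from the repo. The separator is written into the token stream as an integer, so the document text itself is never modified:

```python
from typing import Callable, Iterable, List

BOS_ID = 1  # hypothetical separator id; use whatever id your tokenizer reserves


def toy_encode(text: str) -> List[int]:
    """Stand-in tokenizer for illustration: one id per UTF-8 byte, offset past the special ids."""
    return [b + 3 for b in text.encode("utf-8")]


def build_token_stream(
    docs: Iterable[str],
    encode: Callable[[str], List[int]] = toy_encode,
    sep_id: int = BOS_ID,
) -> List[int]:
    """Concatenate tokenized documents, inserting sep_id as a raw integer before each one."""
    stream: List[int] = []
    for doc in docs:
        stream.append(sep_id)       # special token goes in as an id, not as text
        stream.extend(encode(doc))  # the document text is tokenized unchanged
    return stream
```

Because the separator never appears as a string, the tokenizer does not need to parse special-token markup during training data preparation.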

Simple GPU support, implemented by OpenGL

This is really awesome but I don't think I can take on its maintenance in this repo. I'm very happy to link to your work from the README file in...

export model to fp16

Question: what is the benefit of fp16? - As the Llama 2 models were trained in bf16 I find fp16 conversion sketchy. For newly trained models this is less of...
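The comment is truncated, so the exact concern is not stated. One commonly cited difference between the two formats is exponent range, and the sketch below illustrates that under this assumption. It simulates bf16 by truncating a float32 to its top 16 bits (a simplification, since real conversions usually round) and uses the standard library's half-precision `struct` format for fp16. The helper names are hypothetical.

```python
import struct


def to_bf16(x: float) -> float:
    """Simulate bf16 by keeping only the top 16 bits of the float32 encoding (truncation)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def to_fp16(x: float) -> float:
    """Round-trip through IEEE half precision; out-of-range values are mapped to +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return float("inf") if x > 0 else float("-inf")


if __name__ == "__main__":
    for value in (1.0, 1e5, 1e-10):
        print(value, "bf16:", to_bf16(value), "fp16:", to_fp16(value))
```

In this sketch a value such as `1e5` stays finite in bf16 but overflows in fp16, and `1e-10` stays nonzero in bf16 but underflows to zero in fp16, while bf16 carries fewer mantissa bits than fp16.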

Adding LoRA fine tuning

I like where this is going, but this looks like multiple PRs in one, and a little bit of sus code. I'll inline comment

Is this project still active?

Yeah I don't have too much time right now for this repo. Please link to any PRs that you consider no-brainers, happy to take a look. Maybe I should merge...

Is this project still active?

I see. It is currently a whole separate file runq.c. Which I don't love, but also don't really see any real way around. Let me re-load my RAM again with...

Is this project still active?

@KangkangStu did you follow instructions here? https://github.com/karpathy/llama2.c#int8-quantization

convert ckpt.pt to huggingface model

In principle absolutely. In practice no one has submitted a PR to export it and I don't personally care as much, so I've been ignoring it :) Would probably accept the...

Code Llama rope_theta parameter

yep exactly, the v1+ header is large enough to incorporate additional hyperparameters like this.
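To illustrate why a padded, fixed-size header leaves room for new hyperparameters, here is a sketch under stated assumptions: the 256-byte size, the field order, and the `pack_header` helper are all hypothetical and are not claimed to match the repo's actual export format.

```python
import struct
from typing import Sequence

HEADER_SIZE = 256  # assumed fixed header size for this sketch


def pack_header(version: int, ints: Sequence[int], extras: bytes = b"") -> bytes:
    """Pack a version, integer hyperparameters, and optional extra fields, zero-padded to HEADER_SIZE."""
    body = struct.pack("<i", version) + struct.pack(f"<{len(ints)}i", *ints) + extras
    if len(body) > HEADER_SIZE:
        raise ValueError("header fields exceed the reserved size")
    return body + b"\x00" * (HEADER_SIZE - len(body))


if __name__ == "__main__":
    # Seven illustrative integer hyperparameters plus one extra float (e.g. a rope_theta value).
    header = pack_header(1, [288, 768, 6, 6, 6, 32000, 256], struct.pack("<f", 1000000.0))
    print(len(header))
```

Since readers always consume the same number of header bytes, a new field takes up space that was previously padding and the offset of the weights that follow does not move.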

Suggestion: Is it possible to reorganize the file structure

you're not wrong...