Awni Hannun
> Hmm, the output doesn't seem to be any different. Am I supposed to look for something?

Nope, it would have been obvious if it threw a validation error. Thanks...
Good catch regarding the system prompt! The model tokenizer in the HF repo seems to have been misconfigured for some unknown reason. I've fixed it and it should update automatically...
@aturker1 the tests are failing on this PR. Are you planning to come back to it?
Would you mind adding a test that would have exposed the issue in the first place?
@chimezie I think we can allow a scheduler config, but let's keep it simpler for now. What you've built is very flexible, but I think for most cases people don't...
> It would be useful to also be able to specify the warmup minimum and have it default to 0 like so:

What about:

```yaml
warmup: 100 1e-1
```

If...
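For reference, a minimal sketch of what this kind of warmup-plus-decay schedule looks like when composed directly with `mlx.optimizers` in Python (the exact YAML keys are still being discussed above, so the peak rate `1e-5` and the `1000` decay steps here are purely illustrative values):

```python
import mlx.optimizers as optim

# Warm up linearly from 0 (the "warmup minimum" discussed above)
# to the peak rate over 100 steps, then decay with cosine.
warmup = optim.linear_schedule(0.0, 1e-5, steps=100)
decay = optim.cosine_decay(1e-5, 1000)

# Switch from the warmup schedule to the decay schedule at step 100.
schedule = optim.join_schedules([warmup, decay], [100])

optimizer = optim.Adam(learning_rate=schedule)
```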
> both of them cannot get the results as expected

Either approach is quite reasonable and should work. What happens when you try?
@southkorea2013 did you ever try mixing old data with new data? I think that should probably solve your issue. I'm closing this as it's not really a bug / specific...
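In case it helps, a minimal sketch of the data-mixing idea, assuming JSONL training files (all file names here are placeholders):

```python
import json
import random

# Blend the original training data with the new data so fine-tuning
# doesn't drift too far from the base behavior.
with open("old_train.jsonl") as f:
    old = [json.loads(line) for line in f]
with open("new_train.jsonl") as f:
    new = [json.loads(line) for line in f]

mixed = old + new
random.shuffle(mixed)

with open("train.jsonl", "w") as f:
    for example in mixed:
        f.write(json.dumps(example) + "\n")
```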
> I wonder if PEFT produces the same structure of weights for the adaptors?

No, it doesn't.

> You should be able to do LORA training with either mlx-lm (python)...
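A quick way to see the difference is to dump the tensor names from each adapter file. A minimal sketch — the file names are the defaults I'd expect (`adapters.safetensors` from mlx-lm, `adapter_model.safetensors` from PEFT), so adjust the paths to your setup:

```python
from safetensors import safe_open

def print_keys(path):
    # List every tensor name stored in a safetensors file.
    with safe_open(path, framework="np") as f:
        for key in f.keys():
            print(key)

print_keys("adapters/adapters.safetensors")       # mlx-lm adapter layout
print_keys("peft_out/adapter_model.safetensors")  # PEFT adapter layout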
Probably the simplest way is to make sure all the outstanding work on any active stream is done, which you can do with [synchronize](https://swiftpackageindex.com/ml-explore/mlx-swift/main/documentation/mlx/stream/synchronize()). Then it should be safe to...
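The linked doc is the Swift API; in the Python API the analogous call is `mx.synchronize()`. A minimal sketch of the pattern:

```python
import mlx.core as mx

a = mx.random.normal((1024, 1024))
b = a @ a  # work is enqueued lazily on the default stream

# Block until everything queued on the default stream has finished.
mx.synchronize()

# At this point no kernels are in flight, so it's safe to hand the
# buffers to another thread, free resources, etc.
```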