Awni Hannun
> Hmm, the output doesn't seem to be any different. Am I supposed to look for something?

Nope, it would have been obvious if it threw a validation error. Thanks...
Good catch regarding the system prompt! The model tokenizer in the HF repo seems to have been misconfigured for some unknown reason. I've fixed it and it should update automatically...
@aturker1 the tests are failing on this PR. Are you planning to come back to it?
Would you mind adding a test that would have exposed the issue in the first place?
@chimezie I think we can allow a scheduler config, but let's keep it simpler for now. What you've built is very flexible, but I think for most cases people don't...
> It would be useful to also be able to specify the warmup minimum and have it default to 0 like so:

What about:

```yaml
warmup: 100 1e-1
```

If...
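For reference, a minimal sketch of what this kind of warmup-plus-decay schedule looks like when composed directly with `mlx.optimizers` in Python (the exact YAML keys are still being discussed above, so the peak rate `1e-5` and the `1000` decay steps here are purely illustrative values):

```python
import mlx.optimizers as optim

# Warm up linearly from 0 (the "warmup minimum" discussed above)
# to the peak rate over 100 steps, then decay with cosine.
warmup = optim.linear_schedule(0.0, 1e-5, steps=100)
decay = optim.cosine_decay(1e-5, 1000)

# Switch from the warmup schedule to the decay schedule at step 100.
schedule = optim.join_schedules([warmup, decay], [100])

optimizer = optim.Adam(learning_rate=schedule)
```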
> both of them cannot get the results as expected

Either approach is quite reasonable and should work. What happens when you try?
@southkorea2013 did you ever try mixing old data with new data? I think that should probably solve your issue. I'm closing this as it's not really a bug / specific...
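In case it helps, a minimal sketch of the data-mixing idea, assuming JSONL training files (all file names here are placeholders):

```python
import json
import random

# Blend the original training data with the new data so fine-tuning
# doesn't drift too far from the base behavior.
with open("old_train.jsonl") as f:
    old = [json.loads(line) for line in f]
with open("new_train.jsonl") as f:
    new = [json.loads(line) for line in f]

mixed = old + new
random.shuffle(mixed)

with open("train.jsonl", "w") as f:
    for example in mixed:
        f.write(json.dumps(example) + "\n")
```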
> I wonder if PEFT produces the same structure of weights for the adaptors?

No, it doesn't.

> You should be able to do LORA training with either mlx-lm (python)...
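A quick way to see the difference is to dump the tensor names from each adapter file. A minimal sketch — the file names are the defaults I'd expect (`adapters.safetensors` from mlx-lm, `adapter_model.safetensors` from PEFT), so adjust the paths to your setup:

```python
from safetensors import safe_open

def print_keys(path):
    # List every tensor name stored in a safetensors file.
    with safe_open(path, framework="np") as f:
        for key in f.keys():
            print(key)

print_keys("adapters/adapters.safetensors")       # mlx-lm adapter layout
print_keys("peft_out/adapter_model.safetensors")  # PEFT adapter layout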
Probably the simplest way is to make sure all the outstanding work on any active stream is done, which you can do with [synchronize](https://swiftpackageindex.com/ml-explore/mlx-swift/main/documentation/mlx/stream/synchronize()). Then it should be safe to...
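The linked doc is the Swift API; in the Python API the analogous call is `mx.synchronize()`. A minimal sketch of the pattern:

```python
import mlx.core as mx

a = mx.random.normal((1024, 1024))
b = a @ a  # work is enqueued lazily on the default stream

# Block until everything queued on the default stream has finished.
mx.synchronize()

# At this point no kernels are in flight, so it's safe to hand the
# buffers to another thread, free resources, etc.
```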