Awni Hannun
Yes, that's been on our list to add for a while. Though, do you use `mlx_lm.lora` or just the original `lora.py` script? My preference is to update the package since...
Agreed, I think we may need to start using a YAML config.
Yes indeed. Safe to close, thank you!
I actually just tried to use convert on the model and I got this issue with `python -m mlx_lm.convert --hf-path bigcode/starcoder2-3b`:
```
File "/Users/awni/mlx-examples/llms/mlx_lm/utils.py", line 413, in fetch_from_hub
    config = AutoConfig.from_pretrained(model_path)
...
```
Fix is here https://github.com/ml-explore/mlx-examples/pull/574
@danilopeixoto I've been thinking about having this in MLX LM recently. Any interest in sending a PR? It might make sense to do it after we have a more manageable config...
To be more concrete, I'm envisioning you just set the loss in the config, e.g. `cross_entropy` or `dpo`.
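As a rough sketch of what that could look like (the keys below are hypothetical and not a finalized config format for `mlx_lm.lora`):

```yaml
# Hypothetical sketch of a YAML config for mlx_lm.lora; key names are illustrative only.
model: mlx-community/Mistral-7B-v0.1-4bit
train: true
data: ./data
loss: dpo            # or cross_entropy; selects the training objective
batch_size: 4
iters: 1000
learning_rate: 1.0e-5
```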
Should be ready soon https://github.com/ml-explore/mlx/pull/809, although that will only run on the CPU so it may be too slow depending on how often you use it.
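For reference, a minimal sketch of how the op could be used once that PR lands, assuming it is exposed as `mx.linalg.svd` and runs on the CPU only:

```python
import mlx.core as mx

# Assumes the SVD from the PR above is exposed as mx.linalg.svd with no GPU kernel,
# so the call is pinned to the CPU stream.
a = mx.random.normal((128, 128))

U, S, Vt = mx.linalg.svd(a, stream=mx.cpu)
mx.eval(U, S, Vt)

# Sanity check: reconstruct A from its factors.
recon = (U * S) @ Vt
print(mx.abs(a - recon).max())
```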
I'm not opposed to supporting bfloat but I would not want to make it the default:
- float16 is still considerably faster given it has native support. The benchmarks in...
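If you want to check the gap on your own machine, a rough timing sketch along these lines should show it (the shapes and iteration count here are arbitrary):

```python
import time
import mlx.core as mx

def bench(dtype, n=2048, iters=50):
    # Time repeated matmuls in the given dtype.
    a = mx.random.normal((n, n)).astype(dtype)
    b = mx.random.normal((n, n)).astype(dtype)
    mx.eval(a, b)
    tic = time.perf_counter()
    for _ in range(iters):
        c = a @ b
    mx.eval(c)
    return time.perf_counter() - tic

print("float16: ", bench(mx.float16))
print("bfloat16:", bench(mx.bfloat16))
```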
> I am wondering if there are any documents about how to accelerate SVD by GPU?

I would start by learning about parallel implementations of SVD in general (maybe try...