Awni Hannun
Sure, we can add Phi to our LoRA example! @mzbac already did some great work to merge Phi into the generation example. So from there it should be pretty straightforward....
> I would love this. I tried unfreezing the model but that just leads to NaN loss.

@mzbac makes good points. LoRA fine-tuning is much more stable for these large...
You can do something like:

```python
module.update(tree_map(lambda p: p.astype(mx.float32), module.parameters()))
```
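Spelled out with the imports, and narrowed to specific submodules (e.g. the MoE gate/router layers that seem to be the trouble here), that could look like the sketch below; the `"gate"` substring match is just a guess at how the modules are named in your model:

```python
from mlx.utils import tree_map
import mlx.core as mx
import mlx.nn as nn

# Upcast only the router/gate linears to float32 and leave the rest of the
# model untouched. The "gate" name match is hypothetical; adjust it to however
# the modules are actually named in your model.
for name, m in model.named_modules():
    if "gate" in name and isinstance(m, nn.Linear):
        m.update(tree_map(lambda p: p.astype(mx.float32), m.parameters()))
```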
It shouldn't be 1/10th.. that probably means it's swapping :\. Unfortunately, fine-tuning in 32-bit precision is very memory hungry.. it's uncommon to use 32-bit even for pre-training with such large...
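(Back-of-envelope, assuming plain Adam: about 4 bytes/parameter for the weights, 4 for the gradients, and 8 for the two moment buffers, so roughly 16 bytes per parameter before activations. Even a 7B-parameter model is then on the order of 100 GB in full 32-bit.)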
People do float16 and bfloat16, but both cases (typically) require modifications to actually make full training work. bfloat16 is easier than float16, but it still often won't work with a...
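The cast itself is the same pattern as the float32 snippet above; the "modifications" are mostly about keeping the numerics stable (e.g. loss scaling if you go with float16). A minimal sketch, assuming `model` is the MLX module being trained:

```python
from mlx.utils import tree_map
import mlx.core as mx

# Cast every parameter to bfloat16. Note the cast alone doesn't guarantee
# stable training; float16 in particular usually also needs loss scaling.
model.update(tree_map(lambda p: p.astype(mx.bfloat16), model.parameters()))
```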
> I tried converting all the weights after the model is created to float16, but that didn't work.

What exactly do you mean by "didn't work"? In general that should...
@danilopeixoto does this command:

```
python -m mlx_lm.lora --train --model models/mixtral-8x7b-v0.1-8bit-64g/ --data datasets/chat-instruct/ --steps-per-report 1 --steps-per-eval 15 --save-every 15 --iters 500 --lora-layers 16 --batch-size 2
```

still produce NaN for...
> In addition, the experiment was using `lambda m: isinstance(m, nn.Linear)` as `linear_class_predicate`.

That's another good tidbit. I never tried with quantized gates. It may not work.
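One way to keep the gates unquantized is a stricter predicate along these lines (a sketch; the `!= 8` check assumes the Mixtral router, which has one output feature per expert):

```python
import mlx.nn as nn

# Quantize ordinary linear layers but skip the MoE gate/router, which has only
# 8 output features (one per Mixtral expert) and may not play well with
# quantization.
linear_class_predicate = (
    lambda m: isinstance(m, nn.Linear) and m.weight.shape[0] != 8
)
```

This gets passed to the quantization step in place of the default predicate; the exact call it plugs into depends on the MLX version.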
I see.. we recently fixed a bug in our quantized kernels (https://github.com/ml-explore/mlx/pull/677) which may be related to this, so maybe it will work in `0.3.0`. Just to be sure it...
@mzbac this is ready for review now, right?