Awni Hannun comments

Results: 1014 comments of Awni Hannun

Support bfloat16 for quantization convert

> I have tried the build mlx from mlx last master branch, but I didn't notice significant performance improvement as stated in #663. I was thinking that it might be...

Support bfloat16 for quantization convert

Can you share the command you ran? I didn't realize there was a scan in the LLM code..

Support bfloat16 for quantization convert

It does look like our [bfloat16 scans are commented out](https://github.com/ml-explore/mlx/blob/main/mlx/backend/metal/kernels/scan.metal#L454). Not sure why that is.

Support bfloat16 for quantization convert

Oh I see, that is the `cumsum` from the topk sampling. That will be an issue for `bfloat`, one workaround until we figure out why there is no scan for...
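
For context, a cumulative sum of this kind typically appears in nucleus (top-p) style sampling, where sorted token probabilities are accumulated until a threshold is reached. The following is a minimal pure-Python sketch of that step, not the MLX or LLM example code; the function name `top_p_filter` and its parameters are hypothetical, and the arithmetic runs in ordinary Python floats rather than `bfloat16`.

```python
from itertools import accumulate


def top_p_filter(probs, top_p):
    """Return the token indices kept by top-p filtering, most probable first.

    Hypothetical helper for illustration only; not part of MLX.
    """
    # Sort token indices by descending probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Running total over the sorted probabilities: this is the scan (cumsum) step.
    cumulative = accumulate(probs[i] for i in order)
    kept = []
    for idx, total in zip(order, cumulative):
        kept.append(idx)
        # Stop once the kept tokens cover at least top_p of the probability mass.
        if total >= top_p:
            break
    return kept


kept = top_p_filter([0.1, 0.5, 0.3, 0.1], 0.7)
print(kept)
```

The scan is the only part of this routine that needs a prefix-sum kernel, which is why a missing `bfloat16` scan would surface during sampling rather than in the model's forward pass.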

Support bfloat16 for quantization convert

@angeloskath or @jagrit06 do you know why bfloat is not supported in the scans? Is that because there are no simd reductions for bfloat?

Enable the Mixtral-like Moe model without the quantized gate layer

Actually, if the size is not 8 won't the gating layer just be quantized? I feel like it would be better to simply quantize the gate layers so we don't...

Enable the Mixtral-like Moe model without the quantized gate layer

> @awni Would you be able to give me some hints as to why we have to use stop_gradient indices

Well it's a good question. We could have a gradient...

Enable the Mixtral-like Moe model without the quantized gate layer

Btw, there were some issues with eval'ing in grad graphs introduced in `0.1.0` 😢. (Fixed in https://github.com/ml-explore/mlx/pull/612) Are you using the latest MLX for these tests? If so...

Enable the Mixtral-like Moe model without the quantized gate layer

- Fine tuning should still work as it did before, we haven't made any changes (after the bug fix) that would affect that.
- No we didn't add gradients for...

Enable the Mixtral-like Moe model without the quantized gate layer

@mzbac a couple requests from you:
- Could you point me to a command that tries to train a MOE (maybe with just 4 smaller models so I can iterate quickly)?...