Awni Hannun
> The tests seem fast enough for the normal CI as well, don't they?

I think you're right. I'll add them here instead.
> I could just cast the fp8 tensors to f32, but am I right in thinking that if I do that followed by an 8-bit quant I would lose a...
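For illustration, a minimal sketch of the cast-then-requantize path the quote describes. MLX has no fp8 dtype yet, so a float32 array stands in for the upcast fp8 tensor; `mx.quantize`/`mx.dequantize` then show the round-trip error an 8-bit affine quant introduces (the fp8 grid generally won't land exactly on the per-group affine grid, so some rounding error is expected):

```python
import mlx.core as mx

# Stand-in for an fp8 tensor already cast up to float32
# (MLX has no fp8 dtype, so this is only a simulation).
w = mx.random.normal(shape=(256, 256))

# Affine 8-bit quantization with per-group scales/biases,
# then dequantize to inspect the round-trip error.
w_q, scales, biases = mx.quantize(w, group_size=64, bits=8)
w_hat = mx.dequantize(w_q, scales, biases, group_size=64, bits=8)

print(mx.abs(w - w_hat).max())  # small but nonzero: the quant is lossy
```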
> Do you think q8 activations could be interesting for performance - maybe even integration into the sdpa functions?

With KV cache quantization, absolutely! We already have that in mlx-lm...
> I was wondering what you thought about keeping the activations quantized, though. For instance, a quantized rms norm, gelu, and so on. Not sure if these ops using quantized...
> Isn't this useful for the CUDA backend?

Yes, very much so. We will likely add fp8... just waiting for the right time. For Apple silicon it's still not that...
Typically it should be possible to treat the logprobs as if they were logits, unless you are doing something that relies on the normalization term (which is not so common)...
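To make that concrete, a small sketch (assuming the MLX Python API) of why this works: logprobs differ from logits only by a per-row additive constant, so anything invariant to such a shift, like softmax or argmax, gives identical results for both:

```python
import mlx.core as mx

logits = mx.random.normal((4, 32))
# Logprobs = logits minus the log normalization term,
# a constant along the vocabulary axis.
logprobs = logits - mx.logsumexp(logits, axis=-1, keepdims=True)

# Shift-invariant operations can't tell the two apart.
assert mx.allclose(mx.softmax(logits, axis=-1), mx.softmax(logprobs, axis=-1))
assert mx.array_equal(mx.argmax(logits, axis=-1), mx.argmax(logprobs, axis=-1))
```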
I'm not certain about including this as `WeightNorm`, as I thought `WeightNorm` is not used so much anymore... thoughts? Either way, we should not make free functions in C++ and...
> regarding layer of course you're right, if you agree regarding usefulness I will happily refactor it fully into mlx.nn as it should have been from the start

That would...
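For reference, a minimal sketch of what a weight-norm layer in mlx.nn might look like, reparameterizing the weight as w = g * v / ||v|| per output row. The class name and initialization are hypothetical, not an existing MLX API:

```python
import mlx.core as mx
import mlx.nn as nn

class WeightNormLinear(nn.Module):
    """Hypothetical linear layer with weight norm: w = g * v / ||v||."""

    def __init__(self, in_dims: int, out_dims: int):
        super().__init__()
        scale = 1.0 / in_dims**0.5
        self.v = mx.random.uniform(-scale, scale, (out_dims, in_dims))
        # Initialize g = ||v|| so that the initial w equals v.
        self.g = mx.linalg.norm(self.v, axis=-1, keepdims=True)
        self.bias = mx.zeros((out_dims,))

    def __call__(self, x):
        # Recompute the normalized weight on every call; g and v are
        # the trainable parameters.
        w = self.g * self.v / mx.linalg.norm(self.v, axis=-1, keepdims=True)
        return x @ w.T + self.bias
```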
@cavit99 are you planning on coming back to this PR?
I'm going to close this as inactive. We're open to revisiting the addition of weight norm in the future.