Awni Hannun

1,014 comments by Awni Hannun

Yeah, these numbers don't make sense. It seems likely there is some unexpected bottleneck in the code you tested. Would you be up for running [this example](https://github.com/ml-explore/mlx-examples/tree/main/transformer_lm)? In previous benchmarks...

Ok, this is probably good news. It strongly suggests there is some strange performance bottleneck in the two examples. It could be the convolutions (but that seems unlikely for the...

@jagrit06 for the speech kwt example, this size matmul comes up and we are really slow compared to MPS on it (about 3x, I think):
```
compare_filtered("matmul --size 64x25344 --size...
```

@SarthakYadav we also found a pretty severe performance cliff with one of our reduction kernels. I think fixing the matmul and the reduction for those cases should make the MLX...

> I suppose the performance issue with the reduction kernel explains the ResNet slowdown as well?

I think it's part of it. I haven't looked at that benchmark as carefully...

The numbers you have for MLX are way too slow; something looks off there. For those parameters on my machine (M1 Max, 32GB), MLX is twice as fast: MLX: ```...

That would be awesome, please add them (+ tests / docs) if you can. We'd love to take a PR for that. We mostly follow the PyTorch nn API so...

Cool! I'm not opposed to adding some of these, but it's also not a big priority as they are not used that much. If you are interested in contributing them...

This is interesting:
```
x = mx.array(1.0)
y = mx.arcsin(mx.sin(x))
print(y > 1.0)  # Evaluates to True
```
In infinite precision it should give exactly 1.0, but it does not....
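The same kind of round-trip error can be reproduced without MLX, using plain double-precision floats from the standard `math` module (a minimal sketch; whether the result lands slightly above or below the original value depends on the precision and the library's rounding):

```python
import math

x = 1.0
# sin(x) is rounded to the nearest representable double, and asin
# then amplifies that rounding error by roughly 1/cos(x), so the
# round-trip does not recover x exactly in general.
y = math.asin(math.sin(x))

# y is extremely close to x but need not equal it bit-for-bit,
# and it can even overshoot (as the MLX snippet above shows).
print(y == x, abs(y - x))
assert abs(y - x) < 1e-12
```

The takeaway is the same as in the MLX example: a comparison like `y > 1.0` after a trig round-trip is really a comparison against accumulated rounding error, not a mathematical identity.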

Dup of #16 (I'll close 16 in favor of discussing here). @dc-dc-dc yes I think onnx support would be great! Regarding support directly in mlx vs a separate repo /...