Awni Hannun
Thanks for the benchmarks everyone! There is clearly an unexpected performance cliff on M1 machines here as MLX is substantially faster on M2+. We'll need to take a deeper look...
@arnold-yan I took a look at this benchmark. The performance issue turns out to be from the gradient of the second call to `nn.Upsample`. It uses nearest neighbor interpolation by...
@arnold-yan you're right, the benchmark is slower with linear 😓. I made a mistake. Let me keep digging.
Hi @arnold-yan https://github.com/ml-explore/mlx/pull/1541 should improve your benchmark a lot. I ran it on an M1 Max and M3 Max and the numbers are now: Machine | MLX | PT ----...
Can you give an example of what you mean / how that would look? As far as I understand, the Python buffer protocol does not support bfloat16. E.g. `memoryview(jnp.ones((2, 2),...
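To illustrate the limitation (a sketch using NumPy, since the restriction lives in the buffer protocol itself rather than any one library): the protocol describes element types with struct-style format codes, and there simply is no code for bfloat16.

```python
import numpy as np

# Dtypes the buffer protocol knows about expose a memoryview with a
# struct-style format code: "f" for float32, "e" for float16, etc.
a = np.ones((2, 2), dtype=np.float32)
print(memoryview(a).format)  # "f"

# float16 ("e") is the only 16-bit float the protocol defines;
# bfloat16 has no format code, so a bfloat16 array cannot describe
# itself through a memoryview.
b = np.ones(2, dtype=np.float16)
print(memoryview(b).format)  # "e"
```

This is why `memoryview(jnp.ones((2, 2), dtype=jnp.bfloat16))` fails: there is no format character the array could report.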
@Narsil I'm still not fully understanding what API you are looking for / what's missing? Right now you can create an array from a Python memoryview object which should be...
I kind of like this design: it's simple, easy to follow, and gives us a lot of control over how to shard the model...
> The sharding functions now also take a groups argument. This assumes the linear layer is a fused one and splits it according to the groups argument (evenly or percentage...
> Otherwise one node will get all the queries and no keys and so on.

Ah that makes sense now. Some suggestions on alternative names:
- shards
- segments
- ...
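A rough sketch of the even-split case being discussed (the function name and layout here are my own illustration, not the PR's actual API): to keep each node from getting all the queries and no keys, a fused weight is first split into its groups, each group is sharded evenly, and the per-shard slices are re-concatenated.

```python
import numpy as np

def shard_fused_weight(w, n_shards, groups):
    """Split a fused linear weight row-wise so each shard receives an
    equal slice of every group (e.g. Q, K, V), rather than one node
    ending up with a whole group.

    `w` has shape (out_features, in_features); `groups` is the number
    of sub-layers fused along the output dimension.
    """
    # Split into the fused groups first, then shard each group evenly,
    # then rebuild one balanced weight per shard.
    per_group = np.split(w, groups, axis=0)
    sharded = [np.split(g, n_shards, axis=0) for g in per_group]
    return [
        np.concatenate([sharded[g][s] for g in range(groups)], axis=0)
        for s in range(n_shards)
    ]

# A fused QKV weight: 3 groups of 4 output rows each, split over 2 shards.
w = np.arange(12 * 5).reshape(12, 5)
parts = shard_fused_weight(w, n_shards=2, groups=3)
# Each shard holds 2 rows from each of Q, K, and V: shape (6, 5).
print([p.shape for p in parts])  # [(6, 5), (6, 5)]
```

The percentage-split variant would replace the even `np.split` inside each group with index-based slicing, but the group-then-shard ordering is the same.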
@NripeshN are you planning to come back to this?