Awni Hannun
> Hmm I see, one workaround would require running a first kernel to count the number of values that verify the condition, then allocate the appropriate memory and then running...
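For reference, the count-then-allocate pattern described in the quote can be sketched in plain Python. This is only an illustration on the CPU, not MLX or Metal code: `compact` and `predicate` are hypothetical names, and since the quote is cut off, the second pass (writing the matching values into the allocated buffer) is an assumption about what it goes on to describe.

```python
def compact(values, predicate):
    """Two-pass filter: count the matches, allocate, then fill."""
    # Pass 1: count how many values satisfy the condition.
    count = sum(1 for v in values if predicate(v))

    # Allocate an output buffer of exactly that size.
    out = [None] * count

    # Pass 2 (assumed): write the matching values in their original order.
    write = 0
    for v in values:
        if predicate(v):
            out[write] = v
            write += 1
    return out


result = compact([3, -1, 4, -1, 5, -9], lambda v: v > 0)
print(result)
```

The point of the first pass is that the output size depends on the data, so it has to be known before the buffer can be allocated.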
I think this is because we only display a few digits. If you want to see the full precision output: ``` c = mx.array([-2.5, -1.5, -0.5, 0.5,...
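As a generic illustration of how a short display format can hide differences, here is a plain-Python sketch. It uses ordinary Python floats and format strings, not MLX's printing code, and the values are made up for the example.

```python
# Two values that differ only in the seventh decimal place.
a = 1.0000001
b = 1.0000002

# With four displayed digits both print identically.
short_a = f"{a:.4f}"
short_b = f"{b:.4f}"

# Asking for more digits shows that they are different numbers.
long_a = f"{a:.7f}"
long_b = f"{b:.7f}"

print(short_a, short_b)
print(long_a, long_b)
```

So identical printed output does not by itself mean the stored values are identical.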
PS: if higher-precision printing is important to you, feel free to open an issue. We have a PR already to work on formatting of outputs, so that is...
I will reopen it and change the title.
This isn't a bug. The C++ API is a little hard to use and undocumented, so sorry you ran into that issue. For the C++ scatter API, the following must...
It would be great for both cases if you could share the timing code you used, so we can be sure we are all measuring the same thing!
@sck-at-ucy could you say a bit more about what it was doing when it didn't work (i.e. going beyond 5k)? I notice you added: ``` for step in range(max_steps): if...
We'd like to get better at small sizes too. Thanks for the detailed benchmark! Perf is a high priority right now and more benchmarks to examine are very helpful! I'll...
Actually, this one is not so easy to fix, as we use the `simd` instructions in Metal. The `NaN`s are being suppressed there.
Since we switched to nanobind, the stubs in `site-packages/mlx/core/__init__.pyi` should have the right type information. There is also an open issue (#1240) about getting complete typing info for MLX python...
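If you want to check whether that stub file is present in your environment, a small sketch like the following may help. `find_stub` is a hypothetical helper, not part of MLX; it only assumes the stub lives at `<package dir>/core/__init__.pyi`, as in the path above.

```python
import importlib.util
from pathlib import Path


def find_stub(package, subpackage):
    """Return the path to <package>/<subpackage>/__init__.pyi, or None."""
    # Locate the installed package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None

    # Check each directory the package is installed in for the stub file.
    for location in spec.submodule_search_locations:
        candidate = Path(location) / subpackage / "__init__.pyi"
        if candidate.is_file():
            return candidate
    return None


# Prints a path if the stub exists, or None if MLX or the stub is missing.
print(find_stub("mlx", "core"))
```

If this prints `None` even though MLX is installed, the stubs were probably not shipped with that build.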