MLX: An array framework for Apple silicon
```python
import mlx.core as mx

a = mx.random.uniform(shape=(1024, 1024, 1024, 3))
mx.eval(a)
```
Fails with:
```
RuntimeError: cudaGraphAddKernelNode(&node, graph_, NULL, 0, &params) failed: invalid argument
```
```
python python/tests/test_conv.py TestConv.test_torch_conv_2D
```
It hangs in one of the grouped conv tests.
```
python python/tests/test_blas.py -v
```
A bunch of failures:
```
test_matmul_shapes (__main__.TestBlas.test_matmul_shapes) ...
test_matmul_shapes (__main__.TestBlas.test_matmul_shapes) (dtype='float32', shape_a=(1, 2, 1), shape_b=(1, 1, 1), transpose='nn') ... FAIL
test_matmul_shapes (__main__.TestBlas.test_matmul_shapes) (dtype='float32', shape_a=(1, 2,...
```
**Describe the bug**
When profiling gpt-oss models ([add_profiling_suppport](https://github.com/ml-explore/mlx-lm/pull/601)), profiling the prefill becomes extremely slow and eventually throws a timed...
Hi MLX Team, Thank you for developing such an outstanding package! I’ve been using **MLX** recently and noticed that the function **`mlx.linalg.svd`** currently supports `float32` and `float64`, while **`mlx.linalg.eig`** only...
```python
import mlx.core as mx

def fun():
    for _ in range(1000):
        mx.random.randint(1, 10)

fun()

print(mx.random.randint(0, 10, shape=(32, 32)))
```
Evaluating the last line causes 1k split kernels to run since...
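A minimal sketch of a workaround under MLX's lazy-evaluation model (assuming the intermediate draws are actually wanted): forcing evaluation inside the loop keeps the pending graph, and thus the number of queued key-split kernels, bounded.

```python
import mlx.core as mx

def fun():
    for _ in range(1000):
        # Evaluating each draw as it is made keeps the lazy graph small,
        # so the global PRNG key does not accumulate 1000 pending splits.
        mx.eval(mx.random.randint(1, 10))

fun()
print(mx.random.randint(0, 10, shape=(32, 32)))
```

Another option is to sidestep the global state entirely by pre-splitting an explicit key with `mx.random.split` and passing each subkey via the `key=` argument of `mx.random.randint`.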
Hi, I'm wondering if anyone is working on implementing Metal kernels for sparse matrix multiplication. I'd like to try implementing this myself, but want to make sure the community would...
## Proposed changes

Extended dtype support for `mlx.linalg.svd` and `mlx.linalg.eig` as requested in #2708.

**Changes:**
- Added `float64` support for `mlx.linalg.eig` (CPU)
- Added `complex64` support for `mlx.linalg.svd` (CPU)
- ...
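If the PR lands as described, usage might look like the sketch below. This is a hedged illustration, not code from the PR: the shapes are arbitrary, and it relies on MLX's existing convention that linear-algebra ops run on the CPU stream.

```python
import mlx.core as mx

# complex64 SVD on CPU -- assumes this PR's extended dtype support.
a = mx.random.normal((4, 4)).astype(mx.complex64)
u, s, vt = mx.linalg.svd(a, stream=mx.cpu)

# float64 eigendecomposition on CPU -- likewise assumes this PR.
b = mx.random.normal((4, 4)).astype(mx.float64)
vals, vecs = mx.linalg.eig(b, stream=mx.cpu)
```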
## Proposed changes

- `array` class prefers `int64_t` instead of `size_t`
- `SmallVector` is inherently small -- sizes are now `int`
- propagate the signedness through the codebase and fix...
Hi @awni, I've been working on a quantizable Conv2D layer that is a drop-in replacement for Conv2D (for a large conv UNet, ~5GB, with self attention, cross attention and...
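Not the author's code, but for readers wondering what such a layer could look like: below is a minimal sketch of a drop-in wrapper that stores the kernel with `mx.quantize` and dequantizes it on the fly in `__call__`. The class name, the kernel reshape, and the divisibility assumption on `group_size` are all hypothetical.

```python
import mlx.core as mx
import mlx.nn as nn

class QuantizedConv2d(nn.Module):
    """Hypothetical drop-in for nn.Conv2d: the kernel is stored quantized
    and dequantized on each forward pass. Assumes kH * kW * C_in is
    divisible by group_size, as mx.quantize requires on the last axis."""

    def __init__(self, conv: nn.Conv2d, group_size: int = 64, bits: int = 4):
        super().__init__()
        self.stride, self.padding = conv.stride, conv.padding
        self.group_size, self.bits = group_size, bits
        w = conv.weight  # MLX layout: (C_out, kH, kW, C_in)
        self.w_shape = w.shape
        # Flatten to 2D so quantization groups run along the last axis.
        self.weight, self.scales, self.biases = mx.quantize(
            w.reshape(w.shape[0], -1), group_size=group_size, bits=bits
        )
        if "bias" in conv:
            self.bias = conv.bias

    def __call__(self, x):
        w = mx.dequantize(
            self.weight, self.scales, self.biases,
            group_size=self.group_size, bits=self.bits,
        ).reshape(self.w_shape)
        y = mx.conv2d(x, w, stride=self.stride, padding=self.padding)
        if "bias" in self:
            y = y + self.bias
        return y
```

A version aimed at speed rather than just memory would presumably avoid dequantizing the full kernel each call, e.g. via an im2col unfolding followed by `mx.quantized_matmul`.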