mlx
MLX: An array framework for Apple silicon
Some of the most popular models provide weights in bfloat16, which unfortunately cannot be loaded on the CPU because `Matmul::eval_cpu` only supports float32. I know CPU support is not a priority,...
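The usual workaround for this class of problem is to upcast reduced-precision weights to float32 before any CPU matmul. A minimal sketch (NumPy has no bfloat16, so float16 stands in for the reduced-precision checkpoint dtype):

```python
import numpy as np

# float16 stands in for bfloat16, which NumPy does not provide.
weights = np.random.randn(4, 4).astype(np.float16)
x = np.random.randn(2, 4).astype(np.float32)

# Upcast before the matmul so the CPU kernel only ever sees float32.
y = x @ weights.astype(np.float32)
```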
**Describe the bug** SDPA currently does not support head dimensions other than 64, 96, and 128. **To Reproduce** Fused attention falls back to the regular operation when the head dimension is not...
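The fallback the report refers to is unfused scaled dot-product attention composed from plain array ops, which works for any head dimension. A hedged sketch of that computation (illustrative only, not MLX internals):

```python
import numpy as np

def naive_sdpa(q, k, v):
    """Unfused scaled dot-product attention; valid for any head dim."""
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (B, Lq, Lk)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)              # row-wise softmax
    return w @ v

# Head dim 80 is outside the fused kernel's {64, 96, 128} set.
q = np.random.randn(1, 5, 80)
k = np.random.randn(1, 7, 80)
v = np.random.randn(1, 7, 80)
out = naive_sdpa(q, k, v)
```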
In PyTorch, the following is easily possible:

```python
logits = ...
probs = Categorical(logits=logits)
log_prob = probs.log_prob(value)
entropy = probs.entropy()
```

but when I want to achieve something similar in...
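Both quantities reduce to log-softmax algebra, so an equivalent can be sketched with plain array ops (NumPy here; the same recipe should port to `mlx.core` array operations):

```python
import numpy as np

def log_softmax(logits):
    # Subtract the max first for numerical stability.
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

logits = np.array([[2.0, 0.5, -1.0]])
value = np.array([0])

logp = log_softmax(logits)                       # (batch, num_classes)
log_prob = np.take_along_axis(logp, value[:, None], axis=-1)[:, 0]
entropy = -(np.exp(logp) * logp).sum(axis=-1)    # H = -sum p * log p
```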
@awni We were going through the pooling layers available in the MLX framework. I think we can add pooling techniques for 3D data, such as AvgPool3d and MaxPool3d, and also implement adaptive pooling for...
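For reference, a minimal sketch of what a non-overlapping AvgPool3d computes (illustrative helper, not the MLX API):

```python
import numpy as np

def avg_pool3d(x, k):
    """x: (D, H, W); k: cubic kernel size, used as the stride too."""
    D, H, W = x.shape
    x = x[: D // k * k, : H // k * k, : W // k * k]  # drop ragged edges
    x = x.reshape(D // k, k, H // k, k, W // k, k)
    return x.mean(axis=(1, 3, 5))                    # average each k^3 block

x = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
out = avg_pool3d(x, 2)
```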
I was thinking to add https://pytorch.org/docs/stable/_modules/torch/nn/init.html#dirac_ initialization method. It can be a useful feature for MLX.
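PyTorch's `dirac_` fills a conv weight so the layer initially passes each channel through unchanged. A 1-D sketch of the idea (hypothetical helper, not an MLX or PyTorch API):

```python
import numpy as np

def dirac_init(out_channels, in_channels, kernel_size):
    """Identity-preserving Conv1d weight: each output channel copies
    the matching input channel via a centered delta kernel."""
    w = np.zeros((out_channels, in_channels, kernel_size), dtype=np.float32)
    for i in range(min(out_channels, in_channels)):
        w[i, i, kernel_size // 2] = 1.0
    return w

w = dirac_init(3, 3, 5)
```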
**Describe the bug** This is related to #1499: the GRUCell is not implemented, and the GRU version is also not optimized on MPS compared to PyTorch and TensorFlow...
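For context, the single-step GRUCell the issue asks for follows the standard gate equations; a hedged NumPy sketch (parameter names illustrative, biases omitted for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: update/reset gates, candidate state, convex blend."""
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate hidden state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
din, dh = 4, 3
x = rng.standard_normal((1, din))
h = np.zeros((1, dh))
params = [rng.standard_normal(s) for s in [(din, dh), (dh, dh)] * 3]
h_next = gru_cell(x, h, *params)
```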
This is somewhat related to https://github.com/ml-explore/mlx/issues/12, although I can see how `mlx` improves significantly upon `torch`. My question is why reinvent the wheel with `mlx`, when the core of `mlx`...
It seems the default MLX weight initialization is incorrect — specifically in `mlx/python/mlx/nn/layers/linear.py` and other similar places. Generally the starting point for weight init is a normal distribution with mean zero...
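The conventional schemes the comment alludes to scale a zero-mean distribution by the layer's fan-in. A sketch of one such scheme, He-style normal init, shown only for illustration:

```python
import numpy as np

def he_normal(fan_in, fan_out, rng=None):
    """Zero-mean normal init with std = sqrt(2 / fan_in)."""
    rng = rng or np.random.default_rng()
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_out, fan_in))

w = he_normal(1024, 256, np.random.default_rng(0))
```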
This PR is split from https://github.com/ml-explore/mlx/pull/1983. It implements unary ops for the CUDA backend.

* A `cucomplex_math.cuh` file is added to implement arithmetic operators for `cuComplex`.
* In `fp16_math.cuh` there...
## Proposed changes Fixes #2155. ## Checklist...