Awni Hannun comments

Results: 1014 comments of Awni Hannun

chore(mlx-lm): enable to apply default chat template

Of course, sounds good to me! Thank you!

chore(mlx-lm): enable to apply default chat template

@Blaizzy just to understand your concern a bit: this PR has changed to give the user another option, basically to force the use of the default template if the model doesn't...

Enable more BERT models

Really nice! When you say RoBERTa is not working, is it giving bad output or just crashing?

apple: support for MLX quantized linear in diffusers

Diffusers MLX back-end? 👀 That would be very cool

[BUG] ValueError: [quantize] The last dimension of the matrix needs to be divisible by the quantization group size 64.

It's not a bug. At the risk of being redundant, the last dimension of the matrix has to be divisible by the quantization group size. For the size 4304 there...
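The divisibility requirement is easy to check by hand. The sketch below is only an illustration: the helper name `remainder_by_group_size` and the candidate group sizes 32, 64 and 128 are assumptions made for the example, not something stated in the comment.

```python
def remainder_by_group_size(last_dim, group_sizes=(32, 64, 128)):
    """Map each candidate group size to last_dim % group_size.

    The candidate sizes are illustrative assumptions for this sketch.
    """
    return {g: last_dim % g for g in group_sizes}


# 4304 is the last dimension mentioned in the comment above.
remainders = remainder_by_group_size(4304)

# A group size is usable only when the remainder is zero.
divisible = [g for g, r in remainders.items() if r == 0]

print(remainders)
print(divisible)
```

With the error's group size of 64, the remainder is nonzero, which is exactly the condition the `ValueError` reports.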

[BUG] ValueError: [quantize] The last dimension of the matrix needs to be divisible by the quantization group size 64.

You can use `class_predicate` for that. Just put the condition you want in the predicate. For example if you are trying to skip weights of a certain shape: `class_predicate =...
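A minimal sketch of what such a predicate might look like, assuming it is called with a module path and the module itself and returns `True` when the module should be quantized. The stand-in objects built with `SimpleNamespace` are hypothetical and exist only so the example runs without MLX; the shape-based condition is one possible choice, not necessarily the one the comment goes on to give.

```python
from types import SimpleNamespace

GROUP_SIZE = 64  # assumed group size, matching the error message


def class_predicate(path, module):
    """Return True only for modules that are safe to quantize.

    Skips modules that cannot be quantized at all and modules whose
    weight's last dimension is not divisible by GROUP_SIZE.
    """
    if not hasattr(module, "to_quantized"):
        return False
    weight = getattr(module, "weight", None)
    if weight is None:
        return False
    return weight.shape[-1] % GROUP_SIZE == 0


# Hypothetical stand-ins for real layers, used only for illustration.
ok_layer = SimpleNamespace(
    weight=SimpleNamespace(shape=(1024, 4096)), to_quantized=lambda: None
)
bad_layer = SimpleNamespace(
    weight=SimpleNamespace(shape=(1024, 4304)), to_quantized=lambda: None
)

print(class_predicate("layers.0.proj", ok_layer))
print(class_predicate("layers.1.proj", bad_layer))

# With MLX installed, the predicate would presumably be passed along the
# lines of (an assumption, not run here):
#   mlx.nn.quantize(model, group_size=GROUP_SIZE, class_predicate=class_predicate)
```

Keeping the condition in a plain function makes it easy to test against a few shapes before quantizing a whole model.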

Experiment with medium machines for CI

@madrob can we close this or are you still investigating?

Add GPU implementation of QR factorization [wip]

Did you manually make a command encoder from the command buffer? MLX manages an active command encoder, so you should not make one directly. Instead, call `device.get_command_encoder()` to get...

Add GPU implementation of QR factorization [wip]

@nicolov are you planning to come back to this?

[Feature] Cholesky decomposition

That's pretty awesome that it's faster. A rare example of mixing CPU / GPU speeding things up! I'm not sure what to do with it. On the one hand, it's a...