Awni Hannun

Results: 1014 comments by Awni Hannun

@Blucknote There are pre-converted quantized models in the [MLX Hugging Face community](https://huggingface.co/mlx-community). Also, all of the conversion scripts in the [LLM examples](https://github.com/ml-explore/mlx-examples/tree/main/llms) can produce quantized models.

That is not expected; it sounds like a bug. Thanks for reporting, I will take a look.

Yes, we are very much aware of this issue. Working with @angeloskath on a fix.

Thanks @singhaki for the input. Regarding the rope_traditional flag, I wouldn't remove it as the model won't work as well. We can add the config param to our model so...

It’s probably using too much memory. Read the [section in the readme](https://github.com/ml-explore/mlx-examples/tree/main/lora#memory-issues) on how to reduce memory use. If it’s still super slow, your machine may not have enough memory...

Wow, that's odd. Did you happen to do anything on your computer around the time that it slowed down? Is it possible the GPU got used by something else? Also you...