Awni Hannun

1014 comments by Awni Hannun

Hi @sugatoray, sorry I just saw your question: > Are you at all in favor of breaking the function into two? The update config seems less useful to me since...

@sugatoray I'm still waiting for you to let me know when to check this again, right?

We have a more fully featured version of LoRA in mlx-lm: https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/LORA.md. Assuming it doesn’t add much code complexity, I think it would be cool to update it to support alternative...

I will wait for this to land and then adopt it here https://github.com/ml-explore/mlx/pull/788

@madroidmaq I think that is accounted for by a race condition that we recently fixed. It should already be fixed on main in MLX.

It should stop at the end-of-sentence (EOS) token ID: https://github.com/ml-explore/mlx-swift-examples/blob/main/Libraries/LLM/Evaluate.swift#L199. The fact that it's not stopping likely means it doesn't have the right EOS token ID set. Which model did...

Looks like this is the EOS token for that model: https://huggingface.co/mlx-community/Phi-3-mini-4k-instruct-4bit-no-q-embed/blob/main/tokenizer_config.json#L340. We'll need to check that the IDs match and that the tokenizer is reading it correctly.
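
As a quick way to verify, one can load the tokenizer with Hugging Face transformers and print the EOS token it reports (a minimal sketch; the model id is just the one linked above):

```python
# Minimal sketch: inspect the EOS token this tokenizer reports, using
# Hugging Face transformers (assumed installed). If generation never
# stops, compare these against the ID the generation loop checks for.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "mlx-community/Phi-3-mini-4k-instruct-4bit-no-q-embed"
)
print("eos_token:   ", tok.eos_token)
print("eos_token_id:", tok.eos_token_id)
```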

Right, we changed quantization in MLX core, so the embedding layer is now quantized. We'll need to update the Swift side to do the same.

Those are the commits. Sorry that broke more stuff than I was expecting. Basically, the embeddings are quantized by default now, so when we quantize for MLX in Python the model...
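
To make the Python side of that concrete, here is a minimal sketch (toy model, recent MLX assumed) of what "embeddings are quantized by default" looks like when quantizing with mlx.nn:

```python
import mlx.nn as nn

# Toy model, just for illustration.
class ToyModel(nn.Module):
    def __init__(self, vocab_size: int = 100, dims: int = 64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dims)
        self.proj = nn.Linear(dims, vocab_size)

model = ToyModel()

# nn.quantize swaps quantizable layers in place; with recent MLX this
# includes nn.Embedding (replaced by nn.QuantizedEmbedding), which is
# the behavior the Swift side needs to mirror.
nn.quantize(model, group_size=64, bits=4)

print(type(model.embedding))  # expected: QuantizedEmbedding
print(type(model.proj))       # expected: QuantizedLinear
```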

Right, so this is what solves that problem in MLX: https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/utils.py#L336-L346. It's actually really useful because it handles heterogeneously quantized models very cleanly, which is a problem we've had in...
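
The pattern at those lines is, roughly, to pass a class_predicate to nn.quantize so that only the submodules whose quantized weights actually appear in the checkpoint get swapped out. A hedged sketch of that idea (the names `weights` and `quantization` here are illustrative, not the exact mlx-lm variables):

```python
import mlx.nn as nn

def quantize_from_checkpoint(model, weights, quantization):
    """Illustrative helper: quantize only the layers that were quantized
    at save time, so heterogeneously quantized checkpoints load cleanly.

    `weights` is the flat dict of saved arrays and `quantization` holds
    e.g. {"group_size": 64, "bits": 4}.
    """

    def class_predicate(path, module):
        # A layer was quantized when the model was saved iff its scales
        # are present in the checkpoint.
        return f"{path}.scales" in weights

    nn.quantize(model, **quantization, class_predicate=class_predicate)
    return model
```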