mlx-examples

Architecture Requests for Mamba

Open hg0428 opened this issue 1 year ago • 3 comments

I would like support for the following architectures:

  • Mamba
  • MambaByte
  • Mamba-2
  • Mamba-hybrid (mamba + transformer)
  • Mamba-2-hybrid (mamba2 + transformer)

These architectures are becoming quite common now and are supported by most major LLM libraries.

hg0428, Oct 10 '24 14:10

We have Mamba in MLX LM already and there is a PR for Mamba 2 (#1009).

As for the others, it would be helpful if you could point to Hugging Face repos for each model type. We can consider adding them on an ongoing basis.

awni, Oct 10 '24 14:10

  • Mamba: https://huggingface.co/tiiuae/falcon-mamba-7b
  • Mamba-2: https://huggingface.co/state-spaces/mamba2-2.7b
  • MambaByte: https://huggingface.co/JunxiongWang/MambaByte_Books
  • Mamba-Hybrid: https://huggingface.co/Zyphra/Zamba-7B-v1
  • Mamba2-Hybrid: https://huggingface.co/Zyphra/Zamba2-2.7B-instruct

hg0428, Oct 10 '24 14:10

Zamba2 7B was just released. It is a Mamba2-hybrid model and one of the best models of its size, outperforming Llama3.2 11B and Mistral 7B in almost every benchmark: https://www.zyphra.com/post/zamba2-7b

hg0428, Oct 15 '24 12:10