Phi-3 mini support?
Not the most powerful, but a useful model:
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
Hi @vackosar, I am open to PRs from the community. For now, I will not have time to include it.
@casper-hansen thank you. Is there any guidance on how to do that for an architecture like Phi-3?
I checked its architecture, and basic quantization shouldn't be very hard to implement. But its position encoding is special (LongRoPE), so implementing the fused layers may need more work.
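To make the LongRoPE concern concrete: standard RoPE derives one inverse frequency per rotary dimension from the base alone, while LongRoPE-style schemes additionally rescale each frequency with a per-dimension factor. This is an illustration only, not Phi-3's exact formula; the `rescale` factors here are made up:

```python
def rope_inv_freq(dim, base=10000.0, rescale=None):
    """Inverse frequencies for rotary embeddings.

    Standard RoPE uses base**(-2i/dim). LongRoPE-style scaling
    (illustrative, not Phi-3's exact scheme) divides each frequency
    by a per-dimension factor so long contexts interpolate smoothly.
    """
    inv_freq = [base ** (-(2 * i) / dim) for i in range(dim // 2)]
    if rescale is not None:  # hypothetical per-dimension factors
        inv_freq = [f / r for f, r in zip(inv_freq, rescale)]
    return inv_freq

standard = rope_inv_freq(8)
scaled = rope_inv_freq(8, rescale=[1.0, 1.0, 2.0, 4.0])
```

Fusing the attention layer would have to reproduce whatever scaling the model actually applies, which is why the fused path is more work than the per-layer quantization itself.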
It seems that the 4k version is the base model and may use standard RoPE, which would reduce the effort needed. Is there any way to test that?
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/tree/main
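One quick check, a sketch only: LongRoPE variants typically declare their scaling in the model's `config.json` under `rope_scaling`, so a model whose config has no such entry is likely using plain RoPE. The config excerpt below is a hypothetical stand-in; fetch the real `config.json` from the repo above to check the actual values:

```python
import json

# Hypothetical excerpt of a config.json -- NOT copied from the repo;
# download the real file to verify the actual fields.
config_4k = json.loads("""
{
  "model_type": "phi3",
  "max_position_embeddings": 4096,
  "rope_theta": 10000.0,
  "rope_scaling": null
}
""")

def uses_standard_rope(config: dict) -> bool:
    # Standard RoPE carries no rope_scaling entry; LongRoPE models
    # typically ship a rope_scaling dict (e.g. with long/short factors).
    return config.get("rope_scaling") is None

print(uses_standard_rope(config_4k))
```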
This can easily be ported to AutoAWQ if someone has the time. https://github.com/mit-han-lab/llm-awq/pull/183
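Most of the port boils down to telling AutoAWQ which linear submodules of each decoder layer to scale and quantize. A minimal sketch, assuming the fused `qkv_proj`/`gate_up_proj` naming from the Hugging Face Phi-3 modeling code (verify against the actual model); the helper function itself is hypothetical, not AutoAWQ API:

```python
def phi3_scalable_modules(layer_idx: int) -> list:
    """Names of the linear submodules AWQ would quantize in one
    Phi-3 decoder layer (names assumed from the HF modeling code,
    where attention and MLP projections are fused)."""
    prefix = f"model.layers.{layer_idx}"
    return [
        f"{prefix}.self_attn.qkv_proj",   # fused Q/K/V projection
        f"{prefix}.self_attn.o_proj",
        f"{prefix}.mlp.gate_up_proj",     # fused gate + up projection
        f"{prefix}.mlp.down_proj",
    ]
```

The linked llm-awq PR shows the scaling hookup for these modules; the AutoAWQ side would mirror it in a model-support class.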
Somebody has opened a pull request here: https://github.com/casper-hansen/AutoAWQ/pull/481, but it has not been merged yet.