AutoAWQ

Phi-3 mini support?

Open vackosar opened this issue 1 year ago • 6 comments

Not the most powerful, but a useful model:

https://huggingface.co/microsoft/Phi-3-mini-128k-instruct

vackosar avatar Apr 26 '24 04:04 vackosar

Hi @vackosar, I am open to PRs from the community. For now, I will not have time to include it.

casper-hansen avatar Apr 29 '24 10:04 casper-hansen

@casper-hansen thank you. Is there any guidance on how to do that for an architecture like Phi-3?

vackosar avatar Apr 29 '24 11:04 vackosar

I checked the architecture, and basic quantization shouldn't be very hard to implement. However, its position encoding is special (LongRoPE), so implementing the fusion layer might need more work.

TechxGenus avatar Apr 29 '24 13:04 TechxGenus

It seems that the 4k version is the base and maybe uses standard RoPE. That would reduce the effort needed. Is there any way to test this?

https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/tree/main
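One rough way to check might be to look at the `rope_scaling` entry in each model's `config.json`. The sketch below assumes the usual Hugging Face convention that a missing or null `rope_scaling` means plain RoPE, and that a scaled variant names itself under a `type` or `rope_type` key. The helper name `describe_rope` and the sample dictionaries are hypothetical, made up for illustration; they are not copied from the actual Phi-3 configs.

```python
import json


def describe_rope(config: dict) -> str:
    """Return "standard" if no RoPE scaling is configured, else the scaling type.

    Assumes the Hugging Face convention of a `rope_scaling` entry that is
    either absent/null or a dict carrying a `type` (or `rope_type`) key.
    """
    scaling = config.get("rope_scaling")
    if scaling is None:
        return "standard"
    return str(scaling.get("type") or scaling.get("rope_type") or "unknown")


# Hypothetical config fragments for illustration only; in practice you would
# load the real file, e.g. config = json.load(open("config.json")).
plain_cfg = json.loads('{"rope_scaling": null}')
scaled_cfg = json.loads('{"rope_scaling": {"type": "longrope"}}')

print(describe_rope(plain_cfg))
print(describe_rope(scaled_cfg))
```

If the 4k checkpoint reports no scaling while the 128k one reports a scaled type, that would support the idea that the 4k version only needs standard RoPE handling.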

vackosar avatar Apr 29 '24 16:04 vackosar

This can easily be ported to AutoAWQ if someone has the time. https://github.com/mit-han-lab/llm-awq/pull/183

casper-hansen avatar May 08 '24 13:05 casper-hansen

Someone opened a pull request for this: https://github.com/casper-hansen/AutoAWQ/pull/481

It has not been merged yet.

vackosar avatar Jun 03 '24 04:06 vackosar