Phi-3 mini support?
Not the most powerful, but a useful model:
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
Hi @vackosar, I am open to PRs from the community. For now, I will not have time to include it.
@casper-hansen thank you. Is there any guidance on how to do that for an architecture like Phi-3?
I checked its architecture, and basic quantization shouldn't be very hard to implement. But its position encoding is special (LongRoPE), so implementing the fused layers may need more work.
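To make the LongRoPE concern concrete: standard RoPE derives one inverse frequency per rotary dimension from the base alone, while LongRoPE-style schemes additionally rescale each frequency with a per-dimension factor. This is an illustration only, not Phi-3's exact formula; the `rescale` factors here are made up:

```python
def rope_inv_freq(dim, base=10000.0, rescale=None):
    """Inverse frequencies for rotary embeddings.

    Standard RoPE uses base**(-2i/dim). LongRoPE-style scaling
    (illustrative, not Phi-3's exact scheme) divides each frequency
    by a per-dimension factor so long contexts interpolate smoothly.
    """
    inv_freq = [base ** (-(2 * i) / dim) for i in range(dim // 2)]
    if rescale is not None:  # hypothetical per-dimension factors
        inv_freq = [f / r for f, r in zip(inv_freq, rescale)]
    return inv_freq

standard = rope_inv_freq(8)
scaled = rope_inv_freq(8, rescale=[1.0, 1.0, 2.0, 4.0])
```

Fusing the attention layer would have to reproduce whatever scaling the model actually applies, which is why the fused path is more work than the per-layer quantization itself.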
It seems that the 4k version is the base model and may use standard RoPE, which would reduce the effort needed. Is there any way to test that?
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/tree/main
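One quick check, a sketch only: LongRoPE variants typically declare their scaling in the model's `config.json` under `rope_scaling`, so a model whose config has no such entry is likely using plain RoPE. The config excerpt below is a hypothetical stand-in; fetch the real `config.json` from the repo above to check the actual values:

```python
import json

# Hypothetical excerpt of a config.json -- NOT copied from the repo;
# download the real file to verify the actual fields.
config_4k = json.loads("""
{
  "model_type": "phi3",
  "max_position_embeddings": 4096,
  "rope_theta": 10000.0,
  "rope_scaling": null
}
""")

def uses_standard_rope(config: dict) -> bool:
    # Standard RoPE carries no rope_scaling entry; LongRoPE models
    # typically ship a rope_scaling dict (e.g. with long/short factors).
    return config.get("rope_scaling") is None

print(uses_standard_rope(config_4k))
```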
This can easily be ported to AutoAWQ if someone has the time. https://github.com/mit-han-lab/llm-awq/pull/183
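Most of the port boils down to telling AutoAWQ which linear submodules of each decoder layer to scale and quantize. A minimal sketch, assuming the fused `qkv_proj`/`gate_up_proj` naming from the Hugging Face Phi-3 modeling code (verify against the actual model); the helper function itself is hypothetical, not AutoAWQ API:

```python
def phi3_scalable_modules(layer_idx: int) -> list:
    """Names of the linear submodules AWQ would quantize in one
    Phi-3 decoder layer (names assumed from the HF modeling code,
    where attention and MLP projections are fused)."""
    prefix = f"model.layers.{layer_idx}"
    return [
        f"{prefix}.self_attn.qkv_proj",   # fused Q/K/V projection
        f"{prefix}.self_attn.o_proj",
        f"{prefix}.mlp.gate_up_proj",     # fused gate + up projection
        f"{prefix}.mlp.down_proj",
    ]
```

The linked llm-awq PR shows the scaling hookup for these modules; the AutoAWQ side would mirror it in a model-support class.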
Somebody has opened a pull request here: https://github.com/casper-hansen/AutoAWQ/pull/481, but it has not been merged yet.