llama.cpp
[Review] Merge PowerInfer with llama.cpp mainline
Writing a review of PowerInfer with a view to merging it into llama.cpp.
References:
- https://github.com/SJTU-IPADS/PowerInfer
- https://ipads.se.sjtu.edu.cn/_media/publications/powerinfer-20231219.pdf
Other discussions:
- https://news.ycombinator.com/item?id=38701822
- https://www.reddit.com/r/LocalLLaMA/comments/18luk10/wait_llama_and_falcon_are_also_moe/
- https://twitter.com/omarsar0/status/1737168751668187229
Very nice. Just 26 commits. Of course there are some conflicts.
That makes a lot of sense. Let me know which results from the paper we need to verify before working on merging it. Pointers to specific sections/figures would help.
I think it'll be great if we can ensure llama.cpp remains mainline for inference research work.
I've written about it here: https://github.com/ggerganov/llama.cpp/discussions/4534#discussioncomment-7900305
Got it. Read through it. I'll let you guys discuss; let me know if it's worth working on this.
MUST, HAVE, SPEEEEEEED
Any news on this?