[Draft] Tensor Parallel support to llama.cpp

Open ClarkChin08 opened this issue 1 year ago • 2 comments

  • [x] I have read the contributing guidelines
  • Self-reported review complexity:
    • [ ] Low
    • [x] Medium
    • [ ] High

Add tensor parallel support to llama.cpp. This is still draft code.

ClarkChin08 • Sep 26 '24 02:09

Refer to https://github.com/ggerganov/llama.cpp/issues/9086 for the detailed design.

ClarkChin08 • Sep 26 '24 02:09

@ClarkChin08 It's great to see this feature implemented.

Could you update the guide/docs to explain how to use this feature:

  1. How to enable it.
  2. What the benefit is.
  3. Which cases should use this feature.
  4. How to install the dependent packages (oneCCL, MPI) in oneAPI.

Thank you!

NeoZhangJianyu • Sep 26 '24 10:09

hello - was this feature completed?

ehartford • Jan 07 '25 23:01

@ClarkChin08, hello - was this feature completed?

lexasub • Jan 12 '25 00:01

Hi, thanks, I appreciate the work. It would be great to have this feature completed and merged; it would bring a big performance gain for multi-GPU setups, similar to what vLLM already has.

Neko-Box-Coder • Apr 19 '25 18:04

This looks really interesting! Having TP support like vLLM does would bring some great speedups!

AbdullahMPrograms • Apr 30 '25 22:04

Looking forward to having this feature.

zacksiri • May 14 '25 12:05

Just a bump, this feature would be really great for the community.

zacksiri • May 31 '25 13:05

I suspect the OP has abandoned development and this feature is incomplete.

aidendle94 • Jun 27 '25 22:06

There's actually much more recent progress on this in https://github.com/ggml-org/llama.cpp/pull/13818 and https://github.com/ggml-org/llama.cpp/pull/13776, but it's not ready yet.

netrunnereve • Jun 28 '25 01:06