TensorRT-LLM
Unnecessary assertion in cpp implementation of worldConfig.cpp
https://github.com/NVIDIA/TensorRT-LLM/blob/3d56a445e8ebf888e78be638faf6beec0a78f3c2/cpp/tensorrt_llm/runtime/worldConfig.cpp#L74
Hi,
I've run into a small bug in the C++ implementation of the runtime code. I am running multi-node inference on Llama 2 with pipeline parallelism 2 and tensor parallelism 8; each node on my system has 8 GPUs. The run fails because of the assertion at the line hyperlinked above, which requires PP size * TP size <= num_GPUs_per_node. I believe the check should instead be PP size * TP size <= world_size, though perhaps I am misunderstanding something. Also, there seems to be no way to specify the number of GPUs per node in this code.
Hi, is there any update on this?
@MartinMarciniszyn @byshiue Could you please help answer this query?
The assertion is already gone in the main branch.