
Unnecessary assertion in cpp implementation of worldConfig.cpp

Open noahnisbet opened this issue 1 year ago • 1 comments

https://github.com/NVIDIA/TensorRT-LLM/blob/3d56a445e8ebf888e78be638faf6beec0a78f3c2/cpp/tensorrt_llm/runtime/worldConfig.cpp#L74

Hi,

I've run into a small bug in the C++ implementation of the runtime code. I am running multi-node inference on Llama2 with pipeline parallelism 2 and tensor parallelism 8. Each node in my system has 8 GPUs. It will not run because the line linked above asserts that PP size * TP size <= num_GPUs_per_node. I believe the check should be PP size * TP size <= world_size. Maybe I am misunderstanding something. Also, there seems to be no logic that lets you specify the number of GPUs per node in this code.
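To make the arithmetic concrete, here is a minimal sketch of the two checks being compared. It is not the actual TensorRT-LLM code: the struct and function names are hypothetical, the node count of 2 is an assumption (16 required ranks spread over nodes with 8 GPUs each), and the world size is assumed to be one rank per GPU.

```cpp
#include <cstdint>

// Hypothetical types and names for illustration only;
// these are NOT taken from worldConfig.cpp.
struct ParallelLayout
{
    std::int32_t tensorParallelism;
    std::int32_t pipelineParallelism;
    std::int32_t gpusPerNode;
    std::int32_t numNodes; // assumed: not stated explicitly in the report
};

// Number of ranks the TP x PP layout needs.
constexpr std::int32_t requiredRanks(ParallelLayout const& layout)
{
    return layout.tensorParallelism * layout.pipelineParallelism;
}

// Assumed world size: one rank per GPU across all nodes.
constexpr std::int32_t worldSize(ParallelLayout const& layout)
{
    return layout.gpusPerNode * layout.numNodes;
}

// The check as described in the report: TP * PP <= GPUs per node.
constexpr bool passesPerNodeCheck(ParallelLayout const& layout)
{
    return requiredRanks(layout) <= layout.gpusPerNode;
}

// The check proposed in the report: TP * PP <= world size.
constexpr bool passesWorldSizeCheck(ParallelLayout const& layout)
{
    return requiredRanks(layout) <= worldSize(layout);
}

// Reported setup: TP = 8, PP = 2, 8 GPUs per node, 2 nodes (assumed).
constexpr ParallelLayout kReported{8, 2, 8, 2};

static_assert(!passesPerNodeCheck(kReported), "16 ranks do not fit on one 8-GPU node");
static_assert(passesWorldSizeCheck(kReported), "16 ranks fit in a 16-rank world");
```

With these numbers the per-node check rejects the configuration (16 > 8), while the world-size check accepts it (16 <= 16), which matches the behavior described above.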

noahnisbet avatar Feb 01 '24 23:02 noahnisbet

Hi, is there any update on this?

noahnisbet avatar Feb 16 '24 18:02 noahnisbet

@MartinMarciniszyn @byshiue Could you please help answer this query?

rgandikota avatar Feb 20 '24 19:02 rgandikota

The assertion is already gone in the main branch.

MartinMarciniszyn avatar Feb 21 '24 10:02 MartinMarciniszyn