
fix: correct cudaSetDevice error when GPUs per node are fewer than their ranks in inter-node inference

Open · littlefatfat opened this issue 1 year ago · 1 comment

#1494

littlefatfat, Apr 24 '24 10:04

I am not in favor of function parameter defaults that change depending on the environment. Defaults should be compile-time constants. I suggest changing run.py instead so that it passes the correct number of devices per node into the bindings.

MartinMarciniszyn, May 16 '24 07:05
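
To make the suggestion above concrete, here is a minimal sketch of the caller-side approach, not the actual run.py or bindings code. All names (`DEFAULT_GPUS_PER_NODE`, `resolve_gpus_per_node`, `device_for_rank`) are hypothetical, and the rank-to-device mapping `rank % gpus_per_node` is an assumption made for illustration. The default stays a fixed constant, and the caller supplies the real per-node device count explicitly.

```python
from typing import Optional

# Hypothetical fixed default: a constant, never derived from the environment.
DEFAULT_GPUS_PER_NODE = 8


def resolve_gpus_per_node(visible_device_count: int,
                          cli_value: Optional[int] = None) -> int:
    """Decide, in the caller, how many devices per node to pass downstream.

    visible_device_count is whatever the caller detected on this node
    (for example via a CUDA device-count query); cli_value is an optional
    explicit override supplied by the user.
    """
    value = cli_value if cli_value is not None else visible_device_count
    if value <= 0:
        raise ValueError("gpus_per_node must be positive")
    if value > visible_device_count:
        raise ValueError("gpus_per_node exceeds the devices visible on this node")
    return value


def device_for_rank(rank: int,
                    gpus_per_node: int = DEFAULT_GPUS_PER_NODE) -> int:
    """Map a global rank to a local device index (assumed mapping)."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    return rank % gpus_per_node


if __name__ == "__main__":
    # Two nodes with 4 GPUs each, 8 ranks in total.
    gpus_per_node = resolve_gpus_per_node(visible_device_count=4)
    for rank in range(8):
        print(rank, device_for_rank(rank, gpus_per_node))
```

With the constant default of 8, rank 5 would map to device 5, which does not exist on a 4-GPU node. Passing the resolved value of 4 maps it to device 1 instead, while the default itself never changes with the environment.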