docs: Clarify multi-gpu usage

Open tgaddair opened this issue 1 year ago • 1 comments

Passing --gpus all to docker run is not enough on its own: LoRAX also needs --sharded or --gpus N to be set, but the docs don't make this clear. We should add a section to the docs covering GPU selection and multi-GPU usage.

tgaddair • Feb 15 '24 05:02

We should also add docs explaining tensor parallelism and when multi-GPU makes sense. Specifically, at least one of the following should hold:

  • Model is too big for one GPU
  • GPUs are connected via NVLink

Otherwise, the overhead of GPU-to-GPU communication will be the main bottleneck.
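For the docs, the rule of thumb above could be sketched as a small check. This is only an illustration, not part of LoRAX: the function name and parameters are hypothetical, and comparing model size against GPU memory is a simplification that ignores KV cache and activation memory.

```python
def should_use_multi_gpu(model_size_gb: float,
                         gpu_memory_gb: float,
                         has_nvlink: bool) -> bool:
    """Hypothetical helper: True if at least one condition from the list holds.

    Assumption: "too big for one GPU" is approximated as the model size
    exceeding a single GPU's memory; real deployments also need headroom
    for the KV cache and activations.
    """
    model_too_big = model_size_gb > gpu_memory_gb
    return model_too_big or has_nvlink


# A 140 GB model cannot fit on one 80 GB GPU, so sharding is needed.
print(should_use_multi_gpu(140, 80, has_nvlink=False))  # True
# A 14 GB model fits on one GPU and there is no NVLink, so stay single-GPU.
print(should_use_multi_gpu(14, 80, has_nvlink=False))   # False
```

When the check returns False, a single GPU is likely the better choice, since the communication cost would outweigh any gain from splitting the model.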

tgaddair avatar Feb 17 '24 05:02 tgaddair