lorax docs: Clarify multi-gpu usage

docs: Clarify multi-gpu usage

Open tgaddair opened this issue 1 year ago • 1 comment

Using --gpus all with docker run also requires --sharded or --gpus N to be set for LoRAX, but the docs don't make this clear. We should add a section to the docs on GPU selection and multi-GPU usage.

Feb 15 '24 05:02 tgaddair

We should also add some docs explaining tensor parallelism and when it makes sense to use multi-GPU. Specifically, at least one of the following should hold:

- The model is too big for one GPU
- The GPUs are connected via NVLink

Otherwise, the overhead of GPU-to-GPU communication will be the main bottleneck.
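
For the docs, the rule of thumb above could be written down as a small check. This is only a sketch of the heuristic, not LoRAX code: the `Deployment` class, its fields, and `should_use_multi_gpu` are hypothetical names, and the memory figures are placeholders you would estimate yourself.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    # All fields are illustrative assumptions, not LoRAX settings.
    model_mem_gb: float  # estimated memory the model needs
    gpu_mem_gb: float    # memory available on a single GPU
    has_nvlink: bool     # whether the GPUs are connected via NVLink


def should_use_multi_gpu(d: Deployment) -> bool:
    """Return True if at least one of the two conditions holds."""
    too_big_for_one_gpu = d.model_mem_gb > d.gpu_mem_gb
    return too_big_for_one_gpu or d.has_nvlink


if __name__ == "__main__":
    # A small model on GPUs without NVLink: communication would dominate.
    print(should_use_multi_gpu(Deployment(14.0, 80.0, False)))
    # A model that does not fit on one GPU: sharding is needed regardless.
    print(should_use_multi_gpu(Deployment(140.0, 80.0, False)))
```

The check is deliberately an "or": a model that does not fit has to be sharded even over a slow interconnect, while a model that fits is only worth sharding when the interconnect is fast.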

Feb 17 '24 05:02 tgaddair
