Is it possible to run 2 instances with 2 x 4090?
As the title says, is it possible? I have one Python app processing a list of JSON data, and right now I am only using one card because it already has 24 GB. I guess multi-GPU is only for smaller GPUs, am I correct?
Multi-card setups not only let smaller cards be linked and used together, they can also increase computing speed.
Won't splitting the weights decrease performance? What if I allocate 20 GB on each GPU?
For single-image input, 24 GB is enough for one instance, but for multi-image or video input you may need 2x24 GB with a multi-GPU deployment like web_demo_2.6.py#L44-L64.
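For reference, here is a minimal sketch of splitting one model instance across both 4090s by capping per-GPU memory, which also matches the "20 GB for each GPU" idea above. It uses the standard `transformers`/`accelerate` `device_map`/`max_memory` mechanism rather than the exact code in web_demo_2.6.py#L44-L64; the checkpoint name and the 20 GiB caps are assumptions.

```python
# Hedged sketch: one instance split across two 24 GB cards via max_memory caps.
# The model name and memory limits are assumptions, not the repo's exact setup.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "openbmb/MiniCPM-V-2_6"  # assumed checkpoint name

model = AutoModel.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",                    # let accelerate place layers on GPU 0 / GPU 1
    max_memory={0: "20GiB", 1: "20GiB"},  # leave headroom on each 24 GB card for activations
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```

If instead you want two fully independent instances (one per card, as in the original question), you can pin each process to its own GPU before launch, e.g. `CUDA_VISIBLE_DEVICES=0` for the first process and `CUDA_VISIBLE_DEVICES=1` for the second.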
The multi-card setup used in web_demo_2.6.py#L44-L64 would not speed up computation, because the computation is performed serially from GPU 0 to GPU 1. vLLM with tensor parallelism, on the other hand, would speed up computation.
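A minimal sketch of the tensor-parallel alternative mentioned above, assuming vLLM supports the checkpoint; the model name, context length, and sampling settings are assumptions, and image inputs would go through vLLM's multimodal input path rather than this plain text prompt.

```python
# Hedged sketch: serve one instance across both 4090s with vLLM tensor parallelism,
# so both GPUs compute in parallel instead of serially.
from vllm import LLM, SamplingParams

llm = LLM(
    model="openbmb/MiniCPM-V-2_6",  # assumed checkpoint name
    tensor_parallel_size=2,         # shard the weights across GPU 0 and GPU 1
    trust_remote_code=True,
    max_model_len=4096,             # assumed context length
)

outputs = llm.generate(
    ["Describe the scene in one sentence."],
    SamplingParams(temperature=0.7, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```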