CogVLM
Why is inference on two cards slower than on a single card?
Using the cogvlm-chat-v1.1 model on an H800 machine, why is inference on two cards slower than on a single card?