DevKiD
Did you restart the services?
Can I have the full log?
I would also add a small benchmark test to work out how to split the model among the devices, instead of splitting it equally as I understand it does now. I would like to...
Then I don't see a problem. I probably misunderstood something. With TP, one or more layers are split across devices. But how are they split across those devices? I...
Is there also a hybrid approach, where powerful devices take whole layers and less powerful devices use TP?
With Pipeline and a 4 GB model, I see RAM usage rise by 4 GB, which is strange. Why is it not about 1.3 GB per device across 3 devices?
I will create a recording tomorrow. If it happens again I will upload.
Back to the original. Say a model block has 82 neurons per layer; then with 3 devices, two should get 27 each and one should get 28. To make this easy we...
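To make the arithmetic above concrete, here is a minimal sketch of a near-equal split. The function name `split_evenly` is hypothetical and only for illustration; it is not taken from the project, and it assumes the remainder simply goes to the last devices.

```python
def split_evenly(total: int, devices: int) -> list[int]:
    """Split `total` units across `devices` so sizes differ by at most one."""
    base, extra = divmod(total, devices)
    # Assumption: the last `extra` devices each take one additional unit.
    return [base + (1 if i >= devices - extra else 0) for i in range(devices)]


# 82 neurons over 3 devices, as in the comment above.
print(split_evenly(82, 3))  # [27, 27, 28]
```

Which device receives the extra unit is arbitrary here; a real scheduler could assign it to the fastest device instead.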
I still don't understand the problem. @rltakashige, can you create a recording so that others and I can understand?
I made an error in my reasoning. An unbalanced load can lead to performance loss and a bottleneck. The smart padding might be intelligent. But instead of adding padding we can compute...