
Provide a --data_device option to put data on the CPU to save VRAM during training

Open HrsPythonix opened this issue 1 year ago • 10 comments

When there are many training images, such as in a large scene, most of the VRAM is used to store training data. Using --data_on_cpu can reduce VRAM usage and make it possible to train on GPUs with less VRAM.

HrsPythonix avatar Jul 11 '23 10:07 HrsPythonix

+1 to this! Was just about to implement this myself :)

JonathonLuiten avatar Jul 11 '23 22:07 JonathonLuiten

It's just a bit on the ugly side. Do you have time to pass an argument in the constructor giving the device, instead of the if/elses, and default to CUDA if nothing is provided?

grgkopanas avatar Jul 11 '23 22:07 grgkopanas

Also, can you provide some information on how this was tested? It's a bit of a scary change, given that we always assumed everything lies on the GPU, so I am not sure whether we pass tensors to the GPU at the appropriate times.

grgkopanas avatar Jul 11 '23 22:07 grgkopanas

It's just a bit on the ugly side. Do you have time to pass an argument in the constructor giving the device, instead of the if/elses, and default to CUDA if nothing is provided?

Do you mean an argument such as --data_device cpu or --data_device gpu? Since there are only two kinds of devices, I assume a flag would work the same way.

In my view, this PR is meant to provide an option for users to choose whether to put their data on the CPU instead of the GPU; if you do not specify the flag (the default), the behavior is the same as the previous implementation.

As for testing, I checked the process of retrieving ground-truth images during training and found that the current implementation calls cuda() after retrieval (train.py, line 79), so keeping original_image on the CPU should be fine. I have also finished two training runs with this modification on my machine; it works fine with a large number of images (1000+) and only a few GB of VRAM usage.

# Loss
gt_image = viewpoint_cam.original_image.cuda()

HrsPythonix avatar Jul 12 '23 02:07 HrsPythonix
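The pattern described in the comment above can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: the `Camera` class and `training_step` names are hypothetical, and a CPU fallback is added so the sketch runs on machines without CUDA.

```python
import torch

# Hypothetical sketch of the pattern discussed above: the ground-truth
# image lives on a configurable data_device, and the training loop moves
# it to the GPU only when the loss is computed.
class Camera:
    def __init__(self, image, data_device="cuda"):
        # Fall back to CPU when CUDA is unavailable, so the sketch runs anywhere.
        if data_device == "cuda" and not torch.cuda.is_available():
            data_device = "cpu"
        self.original_image = image.to(data_device)

def training_step(viewpoint_cam):
    # Mirrors the train.py snippet: the move to the GPU happens at loss time,
    # so storing the image on the CPU only adds one host-to-device copy
    # per iteration.
    target = "cuda" if torch.cuda.is_available() else "cpu"
    gt_image = viewpoint_cam.original_image.to(target)
    return gt_image

cam = Camera(torch.rand(3, 4, 4), data_device="cpu")
print(cam.original_image.device.type)  # cpu
```

Because `.to()` is a no-op when the tensor is already on the target device, the default `data_device="cuda"` path behaves exactly like the previous implementation.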

But you are right, a device argument would be more reasonable. For users without enough VRAM and RAM to hold their images, specifying the device as disk might be an option.

HrsPythonix avatar Jul 12 '23 02:07 HrsPythonix

I mean that we should pass an argument, arg_device, to the constructor, which is either a torch.device or the string "cpu" or "cuda", and which can later be used like this:

torch.tensor(array, device=arg_device)

If it's not clear, that's alright; I can definitely do it.

Regarding testing, it seems you are covered. Did you notice any degradation in speed?

Best, George


grgkopanas avatar Jul 12 '23 02:07 grgkopanas
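The suggestion above can be sketched as a constructor that accepts a device argument with a CUDA default. This is illustrative only (the `SceneData` name is hypothetical), and a CPU fallback is added so the sketch also runs without a GPU.

```python
import torch

# Illustrative sketch of the suggested pattern: the constructor takes
# arg_device, either a torch.device or a string, defaulting to "cuda",
# so tensor creation needs no scattered if/else branches.
class SceneData:
    def __init__(self, array, arg_device="cuda"):
        # CPU fallback so the sketch runs on machines without CUDA.
        if str(arg_device).startswith("cuda") and not torch.cuda.is_available():
            arg_device = "cpu"
        self.data = torch.tensor(array, device=arg_device)

d = SceneData([[0.0, 1.0], [2.0, 3.0]], arg_device="cpu")
print(d.data.device)  # cpu
```

Centralizing the device choice in one constructor argument keeps every `torch.tensor(..., device=arg_device)` call uniform, which is the readability win George is after.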


OK, I get it. I will make a new commit in this PR.

As for speed, training is slightly slower, but not noticeably so.

HrsPythonix avatar Jul 12 '23 02:07 HrsPythonix

Thank you, we really appreciate the effort and your help!


grgkopanas avatar Jul 12 '23 02:07 grgkopanas


Done changing --data_on_cpu to --data_device.

HrsPythonix avatar Jul 12 '23 03:07 HrsPythonix
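A minimal sketch of the renamed option, using only the standard library: a string-valued --data_device argument (default "cuda") replaces the earlier boolean --data_on_cpu flag. The exact choices and help text here are illustrative, not taken from the PR.

```python
import argparse

# Hypothetical sketch of the final command-line interface: one string
# option selects where training images are stored, defaulting to the GPU.
parser = argparse.ArgumentParser()
parser.add_argument("--data_device", default="cuda", choices=["cuda", "cpu"],
                    help="device on which to keep the training images")

args = parser.parse_args(["--data_device", "cpu"])
print(args.data_device)  # cpu

default_args = parser.parse_args([])
print(default_args.data_device)  # cuda
```

With this shape, omitting the flag reproduces the old all-on-GPU behavior, while `--data_device cpu` enables the low-VRAM path.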

Currently, on my A10 machine, training speed drops from 13.02 it/s to 11.87 it/s (roughly a 9% slowdown) when using the CPU to hold data.

HrsPythonix avatar Jul 12 '23 03:07 HrsPythonix

Seems alright to me. Merged.

grgkopanas avatar Jul 12 '23 18:07 grgkopanas