
What GPU configuration is needed for training?

Open nevermorewish opened this issue 10 months ago • 4 comments

The paper says: "We train LRM on 128 NVIDIA A100 (40G) GPUs with a batch size of 1024 (1024 distinct shapes per iteration) for 30 epochs, which takes about 3 days to complete. Each epoch covers one pass over the rendered images and multi-view data from Objaverse." The dataset used in the paper, allenai/objaverse, is several terabytes in size.

I don't have that many A100s. Would it be feasible to train on less data with a single A100?

nevermorewish avatar Apr 11 '24 02:04 nevermorewish

Hi, a single A100 is indeed probably too little. You could observe how training converges with a reduced dataset, but I suspect the results won't be very promising.
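For context, the paper's batch size of 1024 spread over 128 GPUs works out to 8 samples per GPU per step, so a single A100 would need heavy gradient accumulation to approximate the same effective batch. Below is a minimal, hypothetical sketch of that pattern in plain PyTorch; the toy model and random data are placeholders, not OpenLRM's actual training code.

```python
import torch
from torch import nn

# Toy stand-ins -- in practice this would be the LRM model and Objaverse renderings.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(256, 256).to(device)               # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=4e-4)

micro_batch = 8      # roughly the paper's per-GPU load (1024 / 128 GPUs)
accum_steps = 128    # 8 * 128 = 1024, matching the paper's effective batch size

optimizer.zero_grad()
for step in range(accum_steps * 4):                   # a few "effective" optimizer steps
    x = torch.randn(micro_batch, 256, device=device)  # placeholder micro-batch
    loss = model(x).pow(2).mean() / accum_steps       # scale so gradients average correctly
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()                              # one update per 128 micro-batches
        optimizer.zero_grad()
```

Gradient accumulation recovers the optimization statistics of a large batch but not the throughput, so even with this trick a single GPU will be far slower than 128, which is consistent with the caution above.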

ZexinHe avatar Apr 15 '24 13:04 ZexinHe

I was trying to run it on my notebook RTX 1050. Impossible to run even with a batch size of 1 and all configs set as low as possible. Always CUDA out of memory, hahaha!

juanfraherrero avatar Apr 16 '24 01:04 juanfraherrero

Hi @juanfraherrero, since I'm very new to AI, I'm not sure how to properly prepare the data and run training. I've posted my question on this issue. Could you please check it when possible? Thank you in advance!

hayoung-jeremy avatar Apr 16 '24 05:04 hayoung-jeremy

Hi @juanfraherrero,

You can try decreasing the frame_size here and see if it still throws an OOM error. If that doesn't work either, then I guess the RTX 1050 is not enough even for inference :(
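As an illustration of that suggestion, here is a hypothetical retry loop that steps the render resolution down until inference fits in GPU memory. The `run_inference` helper and the candidate frame sizes are assumptions for the sketch, not OpenLRM's actual API.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

def run_inference(frame_size: int) -> None:
    """Hypothetical stand-in for the real inference call; replace with the
    actual OpenLRM entry point configured with the given frame_size."""
    # Simulate a render buffer whose memory footprint grows with frame_size.
    buffer = torch.zeros(3, frame_size, frame_size, device=device)
    del buffer

def infer_with_fallback(candidate_sizes=(288, 192, 128, 96, 64)):
    """Try progressively smaller frame sizes until one fits in GPU memory."""
    for frame_size in candidate_sizes:
        try:
            run_inference(frame_size)
            return frame_size                      # this resolution fit
        except RuntimeError as err:
            if "out of memory" not in str(err).lower():
                raise                              # not an OOM error, re-raise it
            torch.cuda.empty_cache()               # free cached blocks, then retry smaller
            print(f"OOM at frame_size={frame_size}, trying a smaller value...")
    return None                                    # nothing fit on this GPU

if __name__ == "__main__":
    print("Usable frame_size:", infer_with_fallback())
```

In practice this would just mean editing the frame_size value in the inference config rather than looping in Python, but the fallback pattern makes the memory/resolution trade-off explicit.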

ZexinHe avatar May 06 '24 10:05 ZexinHe