articulated-object-nerf
Questions regarding the AE_ART (CLA-NeRF) training
Hi, I am trying to train CLA-NeRF with the following configuration, using the data you published:
{
"dataset_name": "sapien_multi",
"root_dir": "data/sapien_single_scene_art",
"exp_name": "sapien_single_scene_articulated",
"exp_type": "vanilla_ae_art",
"img_wh": [320, 240],
"white_back": true,
"batch_size": 1,
"num_gpus": 4
}
- Based on my understanding, when we are training the model, it should go to training_step() of the class LitNeRF_AE_ART(LitModel). However, the program goes directly to the validation_step().
- In the method render_rays of class LitNeRF_AE_ART(LitModel), it looks like you are collecting rays in chunks here:
def render_rays(self, batch, latents):
    B = batch["rays_o"].shape[0]
    ret = defaultdict(list)
    for i in range(0, B, self.hparams.chunk):
        batch_chunk = dict()
        for k, v in batch.items():
            if k == 'img_wh' or k == 'src_imgs':
                continue
            if k == 'radii':
                batch_chunk[k] = v[:, i : i + self.hparams.chunk]
            else:
                batch_chunk[k] = v[i : i + self.hparams.chunk]
But the key 'radii' doesn't exist in the batch, so the radii branch is never taken and batch_chunk is always filled in the else branch. Here's the list of keys in the batch.
All keys not mentioned in the if statements end up at this line.
Could you fix the code here?
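For reference, the chunking logic quoted above can be written as a small standalone sketch. This is not the repository's code: the helper name iter_ray_chunks and the toy batch are hypothetical, plain Python lists stand in for torch tensors, and the skipped keys are copied from the snippet. It illustrates that when radii is absent, the radii branch is simply never entered and every other per-ray key is sliced along its first dimension.

```python
# Keys that are not per-ray data and are skipped, as in the quoted snippet.
SKIP_KEYS = {"img_wh", "src_imgs"}


def iter_ray_chunks(batch, chunk):
    """Yield sub-batches of at most `chunk` rays (hypothetical helper).

    Works on any values that support slicing: torch tensors, arrays, lists.
    """
    num_rays = len(batch["rays_o"])
    for start in range(0, num_rays, chunk):
        end = start + chunk
        batch_chunk = {}
        for key, value in batch.items():
            if key in SKIP_KEYS:
                continue
            if key == "radii":
                # Only reached when a dataloader actually provides `radii`;
                # assumes rays lie along the second axis, as in the snippet.
                batch_chunk[key] = value[:, start:end]
            else:
                batch_chunk[key] = value[start:end]
        yield batch_chunk


# Toy batch without a `radii` key, mimicking the situation described above.
batch = {
    "rays_o": [[0.0, 0.0, 0.0]] * 5,
    "rays_d": [[0.0, 0.0, 1.0]] * 5,
    "img_wh": (320, 240),
}
chunks = list(iter_ray_chunks(batch, chunk=2))
chunk_sizes = [len(c["rays_o"]) for c in chunks]
```

In this toy setting the loop runs without a radii key. An error raised at the else line may therefore come from a batch value that cannot be sliced along its first dimension, rather than from the missing key itself.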
- This is how PyTorch Lightning runs its training process: it first runs a few validation steps as a sanity check before it starts training. See the docs here
- This choice was made because nerf-factory's original implementation of mipnerf360 and others uses radii from the dataloader. The current training/dataloader doesn't support mipnerf360 or the others, but we can keep it as is in case we want to bring them in later on. That would entail just adding a dataloader, etc. What do you think?
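Regarding the sanity check mentioned in the first point: PyTorch Lightning's Trainer accepts a num_sanity_val_steps argument, and setting it to 0 skips the validation steps that run before training. The fragment below is only a sketch. Where and how this repository constructs its Trainer is not shown in this thread, so the usage line is a hypothetical placement.

```python
# Sketch only: keyword arguments one could pass to pytorch_lightning.Trainer.
trainer_kwargs = dict(
    # Skip the pre-training validation sanity check (Lightning's default is 2).
    num_sanity_val_steps=0,
)

# Hypothetical usage, wherever the training script builds its trainer:
# trainer = pytorch_lightning.Trainer(**trainer_kwargs)
```

Leaving the sanity check enabled is usually preferable, since it surfaces validation bugs before a long training run starts.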
Honestly, since I haven't checked the implementation of art_autodecoder, I am not sure about the compatibility of the existing code. If you use radii everywhere, I think we could change the dataloader to provide it. Otherwise, change the model training logic. To summarize, I prefer making minimal changes so that the code works with the existing data.
Does the code not work with the existing data and do you see any errors?
Yes, it reports an error because there's no radii key in the batch. And this line is executed for all values whose keys are not mentioned in the if statement.
Could you try to fix it with the correct logic?
Were you able to look into this and resolve it? Apologies, I have not had much time lately due to the ICRA conference push. Please feel free to create a PR if you were able to resolve it locally on your end, thanks!