nerfacc

Why are results with early_stop_eps=1 better than with early_stop_eps=1e-3?

Open hengfei-wang opened this issue 1 year ago • 4 comments

Hi,

I tested on my datasets and found that the network learns better with early_stop_eps=1 than with early_stop_eps=1e-3 in the ray_marching function, which makes no sense to me. It seems the network is harder to train with early_stop_eps=1e-3 because it preserves more samples? I am confused by this result.

hengfei-wang avatar Oct 29 '22 16:10 hengfei-wang

Hi,

It is not entirely impossible actually.

early_stop_eps=1 will give you at most one non-zero-density sample per ray. Since it culls most of the samples, with the default training script you will end up with a very large batch size of rays (we keep the total number of samples roughly constant across training iterations).

The model can still optimize in this setting. It's like you are injecting the prior knowledge that there is only one valid point along each ray.
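To make the "at most one non-zero-density sample" point concrete, here is a minimal sketch of early stopping along a single ray. It is not nerfacc's actual kernel: the function `kept_samples` and the exact rule (skip zero-density samples, stop once the accumulated transmittance falls below `early_stop_eps`) are assumptions made for illustration.

```python
import math

def kept_samples(sigmas, step_size, early_stop_eps):
    """Return the indices of the samples kept while marching one ray.

    Assumed rule (a simplification, not nerfacc's real implementation):
    zero-density samples are skipped, and marching stops as soon as the
    transmittance after a sample drops below early_stop_eps.
    """
    kept = []
    transmittance = 1.0
    for i, sigma in enumerate(sigmas):
        if sigma <= 0.0:
            continue  # empty space contributes nothing to the ray
        kept.append(i)
        transmittance *= math.exp(-sigma * step_size)
        if transmittance < early_stop_eps:
            break  # everything behind this sample is treated as occluded
    return kept

# Hypothetical densities along one ray.
sigmas = [0.0, 0.0, 5.0, 40.0, 80.0, 80.0, 0.0, 10.0]
print(kept_samples(sigmas, 0.05, 1.0))   # [2]
print(kept_samples(sigmas, 0.05, 1e-3))  # [2, 3, 4, 5]
print(kept_samples(sigmas, 0.05, 0.0))   # [2, 3, 4, 5, 7]
```

With a threshold of 1, any positive density pushes the transmittance below the threshold right away, so only the first non-empty sample survives. With a threshold of 0 nothing is ever culled by early stopping.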

As for early_stop_eps=1e-3, yeah, it will keep a lot of samples, so the number of rays used during training would be much smaller. Fewer rays usually means the optimization process is noisier and slower to converge. So sometimes you have to make a tradeoff between # samples per ray and # rays per iteration to get the best performance within a certain number of training iterations.
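The tradeoff can be sketched as a rule that rescales the ray batch so the total number of samples per iteration stays near a target. This only illustrates the idea described above. The names `next_num_rays`, `n_rendering_samples` and `target_sample_batch_size` are assumptions here, not a quote from the training script.

```python
def next_num_rays(num_rays, n_rendering_samples, target_sample_batch_size):
    """Rescale the ray batch so rays * samples-per-ray stays near a target.

    num_rays: rays used in the current iteration
    n_rendering_samples: samples that survived culling in this iteration
    target_sample_batch_size: desired total number of samples per iteration
    """
    if n_rendering_samples == 0:
        return num_rays  # nothing survived culling; keep the batch unchanged
    scale = target_sample_batch_size / n_rendering_samples
    return max(1, int(num_rays * scale))

target = 1 << 16  # 65536 samples per iteration (hypothetical target)

# About 1 sample per ray survives (e.g. early_stop_eps=1): many rays fit.
print(next_num_rays(4096, 4096 * 1, target))   # 65536

# About 64 samples per ray survive (small early_stop_eps): few rays fit.
print(next_num_rays(4096, 4096 * 64, target))  # 1024
```

Under this rule, aggressive culling trades samples per ray for more rays per iteration. If the number of rays is fixed instead, this effect does not apply.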

liruilong940607 avatar Oct 30 '22 00:10 liruilong940607

Hi,

Thank you for your quick reply. Actually, I didn't use the default training script. In my setting, I only sample 64x64 rays, so the number of sampled rays is constant. I expected to get better performance with a smaller early_stop_eps since it will sample more points. But it doesn't work. That's why I feel confused.

hengfei-wang avatar Oct 30 '22 14:10 hengfei-wang

I inserted the ray_marching and rendering functions into my code.

hengfei-wang avatar Oct 30 '22 15:10 hengfei-wang

Not quite sure why you are getting results like that. If you set early_stop_eps=0 and disable the occupancy grid, you should get results identical to not using nerfacc, as all samples are preserved.

And starting from that, if you change to early_stop_eps=1e-3, you should observe only minor differences.
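One way to see why the differences should be minor is a toy front-to-back compositing loop for a single ray. This is a sketch under assumptions, not nerfacc's rendering code: `render_ray` is a hypothetical helper, and the per-sample values are assumed to lie in [0, 1]. Under that assumption, whatever is skipped after early stopping can contribute at most the remaining transmittance, which is below `early_stop_eps`.

```python
import math

def render_ray(sigmas, values, step_size, early_stop_eps):
    """Composite per-sample values along one ray, front to back.

    Stops once the accumulated transmittance is below early_stop_eps.
    Returns the composited value and the number of samples used.
    """
    out = 0.0
    transmittance = 1.0
    n_used = 0
    for sigma, value in zip(sigmas, values):
        if transmittance < early_stop_eps:
            break  # remaining samples can add less than early_stop_eps
        alpha = 1.0 - math.exp(-sigma * step_size)
        out += transmittance * alpha * value
        transmittance *= 1.0 - alpha
        n_used += 1
    return out, n_used

# Hypothetical ray: 64 samples of constant density and constant value.
sigmas = [20.0] * 64
values = [0.5] * 64

full, n_full = render_ray(sigmas, values, 0.05, 0.0)     # no early stop
early, n_early = render_ray(sigmas, values, 0.05, 1e-3)  # early stop

print(n_full, n_early)            # 64 7
print(abs(full - early) < 1e-3)   # True
```

If a real pipeline shows a large gap between these two settings, that points to something other than early stopping itself.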

Btw, the default early_stop_eps in the rendering function is 1e-4, not 1e-3. Maybe you are mixing it up with render_step_size=1e-3?

Also, if the sigma_fn is not functionally correct, that might cause this weird behavior too.

liruilong940607 avatar Nov 01 '22 07:11 liruilong940607