Train on 360-degrees video?

Open neronicolo opened this issue 1 year ago • 8 comments

Hi,

Thanks for your work. Can we use equirectangular images as a dataset?

neronicolo avatar Jul 07 '23 19:07 neronicolo

Hi, Yes, but without depth and flow losses for now. From initial tests, it also seems to benefit from a higher translation learning rate and skipping frames (if the video is slow-paced):

python localTensoRF/train.py --datadir ${SCENE_DIR} --logdir ${LOG_DIR} --fov 360 --lr_t_init 0.001 --frame_step 4 --loss_depth_weight_inital 0 --loss_flow_weight_inital 0

Please let me know how it goes.

ameuleman avatar Jul 07 '23 20:07 ameuleman

Amazing, thanks!

neronicolo avatar Jul 07 '23 20:07 neronicolo

Hi, the results could be better. The camera path looks off. It should be a straight line since it's a straight street, but it looks like a winding road.

neronicolo avatar Jul 19 '23 19:07 neronicolo

Hi, Initial experiments on 360 videos seemed to work well. Would you mind sharing the video or a frame? A potential issue that comes to mind is that we often get dynamic elements in 360 videos that require masking: we do not handle dynamic objects.

ameuleman avatar Jul 21 '23 02:07 ameuleman

Hi, Sure, here is the link. I've uploaded the original video, the synthesized video, and the camera pose video. This is the command I used:

python localTensoRF/train.py --datadir ${SCENE_DIR} --logdir ${LOG_DIR} --fov 360 --lr_t_init 0.001 --frame_step 10 --loss_depth_weight_inital 0 --loss_flow_weight_inital 0

There are no dynamic objects. Thanks!

neronicolo avatar Jul 21 '23 17:07 neronicolo

Hi, Thanks. The car is a dynamic element that needs to be masked out. Since it covers a large portion of the frame, it hurts pose estimation severely. Luckily, it is always at the same location in the image, which will make masking easy. Putting the following image in ${SCENE_DIR}/masks should improve results.

[Image: all]

ameuleman avatar Jul 23 '23 16:07 ameuleman
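Editorial sketch, not from the thread: since the car always occupies the same region of the frame, a static mask can be generated once and reused. The snippet below builds a grayscale mask that blanks out a bottom band of a full-size equirectangular frame and encodes it as a PNG using only the standard library. The function names, the band fraction, and the polarity (0 for ignored pixels, 255 for valid pixels) are assumptions for illustration; the file name and mask convention that localrf actually expects should be checked against the repository before use.

```python
import struct
import zlib


def build_bottom_mask(width, height, masked_fraction, masked_value=0, valid_value=255):
    """Return rows of pixel values with the bottom band set to masked_value.

    The polarity (0 = ignored, 255 = valid) is an assumption, not confirmed
    by the thread; swap the two values if the training code expects the opposite.
    """
    if not 0.0 <= masked_fraction <= 1.0:
        raise ValueError("masked_fraction must be between 0 and 1")
    first_masked_row = height - int(round(height * masked_fraction))
    return [
        [masked_value if y >= first_masked_row else valid_value] * width
        for y in range(height)
    ]


def _png_chunk(tag, data):
    """Assemble one PNG chunk: length, tag, payload, CRC over tag + payload."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_gray_png(rows):
    """Encode rows of 0..255 integers as an 8-bit grayscale PNG byte string."""
    height = len(rows)
    width = len(rows[0])
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", zlib.compress(raw))
        + _png_chunk(b"IEND", b"")
    )


def save_bottom_mask(path, width, height, masked_fraction):
    """Write the mask to disk; the mask must match the frame resolution."""
    with open(path, "wb") as f:
        f.write(encode_gray_png(build_bottom_mask(width, height, masked_fraction)))
```

For example, `save_bottom_mask("mask.png", 3840, 1920, 0.2)` would write a mask covering the bottom 20% of a 3840x1920 frame; the output name and the 20% band are placeholders to adapt to the actual footage.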

Hi Andreas, I cropped the images from the bottom before I started training. If you look at the other video I uploaded, you will see that the car is not visible in the synthesized video. Thanks for the mask tip.

neronicolo avatar Jul 23 '23 17:07 neronicolo

Hi, Cropping the image breaks the model as we are expecting full equirectangular images.

ameuleman avatar Jul 24 '23 00:07 ameuleman
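Editorial sketch, not from the thread: one way to see why cropping hurts is to look at the usual equirectangular convention, in which the image rows span latitudes from +90 to -90 degrees. If the bottom rows are removed but the frame is still interpreted as a full sphere, every row is assigned the wrong latitude. The function below uses that standard convention with hypothetical frame sizes; it is not taken from the localrf code, whose exact pixel-to-ray mapping may differ in detail.

```python
def row_to_latitude_deg(v, height):
    """Latitude of the center of pixel row v, assuming the image rows
    span +90 (top) to -90 (bottom) degrees, i.e. a full equirectangular frame."""
    return 90.0 - (v + 0.5) / height * 180.0


# Hypothetical sizes: a 1000-row frame with 200 rows cropped from the bottom.
full_height = 1000
cropped_height = 800
row = 400

true_latitude = row_to_latitude_deg(row, full_height)        # about 17.91 degrees
assumed_latitude = row_to_latitude_deg(row, cropped_height)  # about -0.11 degrees

print(f"true latitude of row {row}: {true_latitude:.2f} deg")
print(f"latitude assumed after cropping: {assumed_latitude:.2f} deg")
print(f"angular error: {true_latitude - assumed_latitude:.2f} deg")
```

Under these assumptions the same pixel row is misplaced by roughly 18 degrees, which is consistent with the advice to keep the full frame and mask the car instead of cropping it away.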