
Cannot train on DAVIS car-turn

Open VisonSpace opened this issue 3 years ago • 12 comments

I cannot train on car-turn: training hits a pdb trace. How can I solve it?

VisonSpace avatar Jan 11 '22 02:01 VisonSpace

Why train on a subset of the video (id 22 to id 42 in breakdance-flare)? Why not train on the whole video?

VisonSpace avatar Jan 11 '22 02:01 VisonSpace

How is the data pre-processed? The pre-processing script hasn't been added yet.

Why train on a subset of the video (id 22 to id 42 in breakdance-flare)?

We start from random root poses (equivalent to camera poses). Training on all videos does not always produce the correct root pose of the dancer, possibly due to the limited batch size.

gengshan-y avatar Jan 11 '22 03:01 gengshan-y

Does the model only handle the situation when the camera is not moving?

VisonSpace avatar Jan 11 '22 03:01 VisonSpace

In this work, a moving camera is treated as a static camera plus a moving root body.

gengshan-y avatar Jan 11 '22 03:01 gengshan-y
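
As a side note on the reply above, the equivalence between a moving camera and a static camera with a moving root body follows from composing rigid transforms. The snippet below is a minimal sketch of that identity, not code from the viser repository; `compose_root_pose` and `rot_z` are hypothetical helpers introduced only for illustration, and the poses are made-up numbers.

```python
import numpy as np

def compose_root_pose(R_cam, t_cam, R_obj, t_obj):
    """Fold a camera pose into the object pose (hypothetical helper).

    A point X in object space lands at R_cam @ (R_obj @ X + t_obj) + t_cam
    in camera coordinates. An identity (static) camera with the root pose
    returned here gives exactly the same coordinates.
    """
    R_root = R_cam @ R_obj
    t_root = R_cam @ t_obj + t_cam
    return R_root, t_root

def rot_z(theta):
    """Rotation about the z axis by theta radians (illustration only)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Made-up camera and object poses for one frame.
R_cam, t_cam = rot_z(0.3), np.array([0.1, -0.2, 2.0])
R_obj, t_obj = rot_z(-0.7), np.array([0.5, 0.0, 0.3])
X = np.array([[0.2, 0.4, -0.1], [1.0, 0.0, 0.5]])  # two points, one per row

# Moving camera applied after the object motion.
moving_cam = (R_cam @ (R_obj @ X.T + t_obj[:, None]) + t_cam[:, None]).T

# Static camera, with all motion absorbed into the root body pose.
R_root, t_root = compose_root_pose(R_cam, t_cam, R_obj, t_obj)
static_cam = (R_root @ X.T + t_root[:, None]).T

same = np.allclose(moving_cam, static_cam)
```

Because the two formulations agree point for point, a single per-frame root pose is enough to account for both camera and object motion.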

A batch size of 4 only uses 4 GB of GPU memory, so you can increase the batch size. Are you saying that the larger the batch size, the better the performance?

VisonSpace avatar Jan 11 '22 03:01 VisonSpace

I used your preprocessing code (https://github.com/gengshan-y/viser-release/blob/main/preprocess/README.md) but got stuck at the pdb breakpoint you left in. With it commented out, training runs, but the program seems to have bugs. You can try it with 'car-turn'.

VisonSpace avatar Jan 11 '22 03:01 VisonSpace

Isn't this the data preprocessing code? (https://github.com/gengshan-y/viser-release/blob/main/preprocess/README.md)

VisonSpace avatar Jan 11 '22 03:01 VisonSpace

Can you point me where the break happens?

Unfortunately, I have limited capacity to do further experiments. In my experience, a larger batch size stabilizes training and improves performance. But note that solving for dynamic shapes with large deformation from 2D is very under-constrained. I cannot guarantee it works even if a 2x batch size is used.

You are welcome to test it and let me know if that works.

gengshan-y avatar Jan 11 '22 03:01 gengshan-y

Re

Isn't this the data preprocessing code? (https://github.com/gengshan-y/viser-release/blob/main/preprocess/README.md)

I need to make it clear that the codebase's preprocessing code is complete

VisonSpace avatar Jan 11 '22 04:01 VisonSpace

The result on car-turn is far from useful.

VisonSpace avatar Jan 11 '22 12:01 VisonSpace

At the beginning of training, the train/flowobs map is black (in TensorBoard).

VisonSpace avatar Jan 11 '22 12:01 VisonSpace

I remembered it wrong; the pre-processing code was tested. I would suggest doing the following:

  1. Check whether the flow is correctly computed in the FlowFW/Full-Resolution/$sequence-name/ folder.
  2. Check whether the code also loads "black" flow images for the breakdance sequence.
  3. Find the differences in data format between your sequence and the breakdance sequence.

gengshan-y avatar Jan 11 '22 15:01 gengshan-y
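
The first two checks suggested above come down to asking whether a loaded flow field is (almost) entirely zero. The snippet below is a minimal sketch of such a test, assuming the flow has already been loaded into a NumPy array of shape (H, W, 2) or larger in the last dimension, with the first two channels holding the x and y displacement. How the files under FlowFW/Full-Resolution/ are stored and read is not covered here, and `flow_summary` and `looks_black` are hypothetical helpers, not functions from the viser codebase.

```python
import numpy as np

def flow_summary(flow, eps=1e-6):
    """Summarize a flow field given as an (H, W, C>=2) array.

    Assumes channels 0 and 1 are the x and y displacement; any extra
    channels (for example a validity mask) are ignored.
    """
    flow = np.asarray(flow, dtype=np.float64)
    if flow.ndim != 3 or flow.shape[2] < 2:
        raise ValueError("expected an array of shape (H, W, C) with C >= 2")
    mag = np.hypot(flow[..., 0], flow[..., 1])  # per-pixel flow magnitude
    finite = np.isfinite(mag)
    return {
        "max_magnitude": float(mag[finite].max()) if finite.any() else 0.0,
        "zero_fraction": float(np.mean(mag <= eps)),
        "nonfinite_fraction": float(np.mean(~finite)),
    }

def looks_black(flow, eps=1e-6, zero_fraction_threshold=0.99):
    """Return True if nearly every pixel has (almost) zero flow."""
    return flow_summary(flow, eps)["zero_fraction"] >= zero_fraction_threshold

# Synthetic examples: an all-zero field and a field with uniform motion.
black_flow = np.zeros((4, 6, 2))
moving_flow = np.ones((4, 6, 2))
```

Running the same summary on a flow file from your own sequence and on one from the breakdance sequence gives a quick way to compare value ranges, which also helps with the third check.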