Gen6D How to improve the results

How to improve the results

Open JaouadROS opened this issue 1 year ago • 1 comment

I've tried Gen6D a couple of times, first with the mouse example and then with custom objects, but I still can't figure out how to get everything right on the first attempt. So I decided to use a relatively long reference video and the same video as the query video, but I still don't get it right.

In this example, I have a reference video with 6885 frames. Only 689 of them were used with COLMAP to build the 3D model, which took 6 hours, BTW. Then I estimated the meta info using these instructions and got these values:

0.984891 -0.219880 -0.156923
-0.0717203 0.377376 -0.923279
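
A quick sanity check on these two rows can be sketched as follows. It assumes that each row is a 3D direction vector and that the two directions are meant to be roughly perpendicular and of unit length; that reading is an assumption, not something stated in this thread. The helper names are hypothetical and not part of Gen6D.

```python
import math

# Values copied from the post above. Treating them as two direction
# vectors is an assumption made for this sketch.
first = (0.984891, -0.219880, -0.156923)
second = (-0.0717203, 0.377376, -0.923279)

def norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in v))

def normalized_dot(a, b):
    """Cosine of the angle between a and b (0 means perpendicular)."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

print(f"|first|  = {norm(first):.4f}")
print(f"|second| = {norm(second):.4f}")
print(f"cos(angle) = {normalized_dot(first, second):.4f}")
```

A length far from 1 or a cosine far from 0 would hint at a copy or picking mistake when the values were written down.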

Here is my 3D model (image: Screenshot from 2023-09-18 17-37-21). I expected better results with 689 frames.

After that, I ran the prediction on the same reference video, and the detection fails.

1- So how can I guarantee detection on the first attempt? Here I use 689 images for reconstruction and 6840 for detection (the reference images are a subset of the query images), but the detection still fails. If the reference and query datasets are the same and the detection still fails, does that mean the meta data isn't correct? I don't see any reason why it would fail in that case. It also seems strange to me because there is almost no scale difference between the reference and the query images.

But after some frames, the tracking works correctly. Here are the frames (images): Frame 1/6885, Frame 624/6885, Frame 640/6885. Then it works until the end of the video.

2- When I compute the meta data with CloudCompare (CC) a couple of times for the same model (without closing CC), I get slightly different results. Is this normal behavior, and does it affect the performance of Gen6D?
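
One way to put a number on "slightly different" is the angle between two estimates of the same direction. The sketch below is only an illustration: `run_a` is the second row posted above, while `run_b` is a made-up value standing in for a repeated measurement. It is not real data, and `angle_deg` is a hypothetical helper, not a Gen6D function.

```python
import math

def angle_deg(a, b):
    """Angle in degrees between two 3D vectors of any length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    cos = max(-1.0, min(1.0, dot / (na * nb)))  # clamp against rounding error
    return math.degrees(math.acos(cos))

run_a = (-0.0717203, 0.377376, -0.923279)  # from the post
run_b = (-0.0700, 0.3800, -0.9220)         # HYPOTHETICAL repeat, made up for illustration

print(f"difference between runs: {angle_deg(run_a, run_b):.2f} degrees")
```

Reporting the difference as an angle makes it easier to judge whether two runs really differ only slightly.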

Sep 18 '23 16:09 JaouadROS

Hi, the detection is indeed not very stable, especially when there is a large scale difference. About the scale difference, you may refer to https://github.com/liuyuan-pal/Gen6D/issues/29#issuecomment-1308379909. We always resize the reference images, so there is still a large scale difference even when the original reference sequence is used as the query. The results suddenly get better because we crop the image according to the current pose: once the refiner happens to find a pose near the object, the subsequent steps are correct. You may check the intermediate results as described at https://github.com/liuyuan-pal/Gen6D#qualitative-results.
Slightly changing the meta data will not change the final results. This is normal behavior for Gen6D.
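
To make the scale point concrete, here is a rough sketch of how a scale gap appears when only the reference side is resized. It is not Gen6D's code. The function name and both pixel sizes are hypothetical numbers chosen for illustration; the actual resize target used by Gen6D is not stated in this thread.

```python
import math

def scale_gap(query_object_px, reference_object_px):
    """Return (ratio, log2 ratio) between the object's apparent size
    in the query frame and in the resized reference image."""
    ratio = query_object_px / reference_object_px
    return ratio, math.log2(ratio)

# HYPOTHETICAL sizes: the object spans 600 px in the full-resolution query
# frame, but 128 px in the reference image after resizing.
ratio, octaves = scale_gap(600.0, 128.0)
print(f"scale ratio: {ratio:.2f}x ({octaves:.2f} octaves)")

# The same frame used as reference and query still shows a gap, because
# only the reference side was resized.
```

Under these assumed numbers the detector would have to bridge a gap of more than 4x even though the underlying frames are identical.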

Sep 22 '23 01:09 liuyuan-pal
