How should I interpret negative Z values?

Open tonytu16 opened this issue 4 years ago • 7 comments

Hello, thank you for the great work! I have a question regarding using the following equation to calculate the Z values:

[Image: Screen Shot 2021-01-05 at 11 35 41 AM]

I used your code to calculate the tau for the following image:

frame000400

and visualized the Z output on a rainbow colormap:

absolute400

Screen Shot 2021-01-02 at 5 16 27 PM

I also ran the time-to-collision calculation on the same frame:

Screen Shot 2021-01-01 at 7 07 43 PM

ttc400

Screen Shot 2021-01-02 at 3 32 01 PM

The black color represents negative values. From my understanding, both the absolute depth and the time to collision should be non-negative. However, a significant portion of the frame has negative values. The pfm files output tau values that range from roughly -0.02 to 0.01, which, after taking the exponential, range from roughly 0.98 to 1.01. When I subtract these from a matrix of ones, the entries above 1 become negative. Am I misunderstanding something?
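The arithmetic can be reproduced with a minimal Python sketch. It assumes the pfm values are log(tau) with tau = Z'/Z, as the exponentiation step above implies; the helper name `one_minus_tau` and the sample inputs are illustrative, not taken from the repository.

```python
import math

def one_minus_tau(log_tau):
    """Return 1 - tau, where tau = exp(log_tau) is assumed to be Z'/Z."""
    return 1.0 - math.exp(log_tau)

# Sample values spanning the rough range reported above (assumed, not real output).
for log_tau in (-0.02, 0.0, 0.01):
    tau = math.exp(log_tau)
    print(f"log_tau={log_tau:+.2f}  tau={tau:.4f}  1-tau={one_minus_tau(log_tau):+.4f}")
```

Any pixel with log(tau) > 0 has tau > 1, so 1 - tau is negative there and the sign carries through to whatever is divided by it.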

Thank you

tonytu16 avatar Jan 05 '21 19:01 tonytu16

Hi, the negative time-to-collision in your results is expected for points that appear to be moving away from the camera.

To get a reasonable TTC, one assumption in this work is that points are moving towards the camera, such that (Z-Z')>0. Another assumption is that points move at constant velocity, such that velocity = (Z-Z')/T_sampling and TTC = Z/velocity.

That said, the absolute value of the TTC can be interpreted as the "time to doubling the distance" in the negative-TTC case, when points appear to be moving away.
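As a rough illustration of the two assumptions and the "doubling" interpretation, here is a small Python sketch. The function name, the sample depths, and the choice to return infinity for zero velocity are assumptions made for this example only.

```python
def time_to_collision(z, z_next, t_sampling):
    """TTC = Z / velocity, with velocity = (Z - Z') / T_sampling (constant velocity assumed)."""
    velocity = (z - z_next) / t_sampling
    if velocity == 0:
        # Choice made for this sketch: a point with no motion in depth never collides.
        return float("inf")
    return z / velocity

# Approaching point: depth goes from 10 to 9 over a sampling interval of 0.5.
ttc_toward = time_to_collision(10.0, 9.0, 0.5)

# Receding point: depth goes from 10 to 11, so the TTC comes out negative.
ttc_away = time_to_collision(10.0, 11.0, 0.5)

# After |TTC| at the same receding speed, the depth has doubled.
receding_speed = (11.0 - 10.0) / 0.5
distance_after = 10.0 + receding_speed * abs(ttc_away)
print(ttc_toward, ttc_away, distance_after)
```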

gengshan-y avatar Jan 06 '21 02:01 gengshan-y

So, to obtain the time and absolute depth, should I take the absolute value of the negative values and divide by 2? Does this apply to both the TTC and Z?

tonytu16 avatar Jan 06 '21 18:01 tonytu16

Negative TTC implies the points will never collide with the imaging plane so the time would be infinity.

For depth I think taking the absolute value of TTC without dividing by 2 will work.
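One way to read this suggestion is sketched below in Python. Both helper names are hypothetical, and applying the absolute value to the signed depth is an interpretation of the comment above rather than code from the repository.

```python
import math

def usable_ttc(ttc):
    """Map a negative TTC (receding point) to infinity: it never reaches the image plane."""
    return ttc if ttc >= 0 else math.inf

def usable_depth(signed_depth):
    """Keep only the magnitude of a signed depth value, with no division by 2."""
    return abs(signed_depth)

print(usable_ttc(-5.0), usable_ttc(5.0), usable_depth(-12.5))
```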

gengshan-y avatar Jan 07 '21 07:01 gengshan-y

Hello, I think that in order to use this formula to calculate the depth accurately, you need to ensure that the target is static. How can this depth estimation formula be used in a dynamic situation? @gengshan-y

Liyiwei12138 avatar Apr 23 '24 10:04 Liyiwei12138

Hi, you are right about the rigidity assumption. For dynamic scenes, one idea is to break the scene into locally rigid pieces and apply the same algorithm to each piece. But note that there is a scale ambiguity between pieces. You may want to look at superpixel soup and rigidmask for a deeper analysis.

gengshan-y avatar Apr 23 '24 15:04 gengshan-y

Thank you for your response. I would like to confirm the rationale behind the formula [ Z = (1/(1 - tau)) * t_cz ] in the paper for predicting dynamic objects. In my opinion, when both the camera and the object are in motion, it should be [ Z = (1/(1 - tau)) * (t_cz - t_mo) ], where t_cz represents the camera motion and t_mo represents the object motion. This formulation would be more reasonable, as neglecting t_mo could lead to significant errors. Additionally, I believe that the errors you mentioned in the paper when obtaining depth through triangulation on the object are also due to the dynamic nature of the object, which makes accurate triangulation difficult. @gengshan-y

Liyiwei12138 avatar Apr 23 '24 15:04 Liyiwei12138

Hi, after breaking the dynamic scene into rigid pieces, t_cz would become the relative motion between the camera and the rigid piece. Such relative motion can be computed as t_cz - t_mo.

If I understand your second point correctly, our analysis in Sec 4.5 is making a different point than dynamic objects. When the scene or piece is rigid, triangulating correspondences near the epipole leads to large depth error, because of the small "baseline".
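A small Python sketch of this substitution follows. It assumes tau = Z'/Z and that both translations are measured along the camera's z-axis with the same sign convention; the function name and the numbers are made up for illustration.

```python
def depth_from_expansion(tau, t_cz, t_mo=0.0):
    """Z = (t_cz - t_mo) / (1 - tau), using the camera-to-piece relative motion."""
    return (t_cz - t_mo) / (1.0 - tau)

# Hypothetical rigid piece at depth 10: the camera advances by 1.0 while the
# piece moves 0.5 in the same direction, so the relative motion is 0.5.
z_true, t_cz, t_mo = 10.0, 1.0, 0.5
z_next = z_true - (t_cz - t_mo)
tau = z_next / z_true

z_relative = depth_from_expansion(tau, t_cz, t_mo)  # uses the relative motion
z_camera_only = depth_from_expansion(tau, t_cz)     # neglects the object motion
print(z_relative, z_camera_only)
```

With these numbers, using the relative motion recovers the depth of about 10, while neglecting the object motion gives roughly twice that.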

Please feel free to follow up if you have additional questions.

gengshan-y avatar May 04 '24 21:05 gengshan-y