How should I interpret negative Z values?
Hello, thank you for the great work! I have a question regarding using the following equation to calculate the Z values:
I used your code to calculate the tau for the following image:
and visualized the Z output on a rainbow colormap:
I also ran the time-to-collision calculation on the same frame:
The black color represents negative values. From my understanding, both the absolute depth and the time to collision should be non-negative. However, a significant portion of the frame has negative values. The pfm files contain tau values that range from roughly -0.02 to 0.01, which, after taking the exponential, range from roughly 0.98 to 1.01. When I subtract this from a matrix of ones, the result contains negative values. Am I misunderstanding something?
Thank you
Hi, the negative time-to-collision in your results is expected for points that appear to be moving away from the camera.
To get a reasonable TTC, one assumption in this work is that points are moving towards the camera, such that (Z-Z')>0. Another assumption is that points are moving at constant velocity, such that velocity = (Z-Z')/T_sampling and TTC = Z/velocity.
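The two assumptions above can be sketched in a few lines. This is only an illustration, assuming the exponentiated network output is the depth ratio tau = Z'/Z (so that TTC = Z/velocity simplifies to T_sampling / (1 - tau)); the function name `ttc_from_log_tau` is hypothetical and not part of the released code.

```python
import math

def ttc_from_log_tau(log_tau, t_sampling=1.0):
    """Signed time-to-collision under the constant-velocity assumption.

    Assumption: log_tau = log(Z'/Z), the log depth ratio between two frames.
    """
    tau = math.exp(log_tau)        # tau = Z'/Z
    closing = 1.0 - tau            # (Z - Z')/Z, positive when approaching
    if closing == 0.0:
        return math.inf            # no motion in depth: the point never collides
    # TTC = Z / velocity = Z * T_sampling / (Z - Z') = T_sampling / (1 - tau)
    return t_sampling / closing

print(ttc_from_log_tau(-0.02))     # approaching point: positive TTC
print(ttc_from_log_tau(0.01))      # receding point: negative TTC
```

With the value range reported in the question, negative log-tau values give a positive TTC and positive log-tau values give a negative one, which matches the black regions in the visualization.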
With that being said, the absolute value of the TTC can be interpreted as the "time to doubling the distance" in the negative-TTC case, when points appear to be moving away.
So to obtain the time & absolute depth I should take the absolute value of the negative values and divide by 2? Does this apply to both the TTC and Z?
Negative TTC implies the points will never collide with the imaging plane, so the time to collision would be infinite.
For depth I think taking the absolute value of TTC without dividing by 2 will work.
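To make the two readings of a negative TTC concrete, here is a minimal sketch. The helper names and the numbers in the worked check are made up for illustration; only the constant-velocity relation TTC = Z/velocity comes from the discussion above.

```python
import math

def time_to_collision(ttc):
    """Receding points (negative TTC) never reach the image plane."""
    return ttc if ttc > 0 else math.inf

def time_to_double_distance(ttc):
    """For receding points, |TTC| is the time until the depth doubles.

    For approaching points the depth never doubles, so inf is returned.
    """
    return -ttc if ttc < 0 else math.inf

# Worked check with made-up numbers: a point recedes from Z = 10 to Z' = 12
# over one sampling interval (T_sampling = 1).
z, z_next, t_sampling = 10.0, 12.0, 1.0
velocity = (z - z_next) / t_sampling       # negative: moving away
ttc = z / velocity                         # negative TTC
receding_speed = (z_next - z) / t_sampling
depth_after = z + receding_speed * time_to_double_distance(ttc)

print(time_to_collision(ttc))              # inf
print(depth_after == 2 * z)                # True: the depth has doubled
```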
Hello, I think that in order to use this formula to accurately calculate the depth, you need to ensure that the target is static. How should this depth estimation formula be used in a dynamic situation? @gengshan-y
Hi, you are right about the rigidity assumption. For dynamic scenes, one thought is to break it into locally rigid pieces, and apply the same algorithm for each piece. But note that there is a scale ambiguity between pieces. You may want to look at superpixel soup and rigidmask for a deeper analysis.
Thank you for your response. I would like to confirm the rationale behind the formula [Z = (1/(1-tau)) * tcz] in the paper when it is applied to dynamic objects. In my opinion, when both the camera and the object are in motion, it should be [Z = (1/(1-tau)) * (tcz - tmo)], where tcz represents the camera motion and tmo represents the object motion. This formulation would be more reasonable, as neglecting tmo could lead to significant errors. Additionally, I believe that the errors in obtaining depth through triangulation on the object, which you mention in the paper, are also due to the dynamic nature of the object, which makes accurate triangulation difficult. @gengshan-y
Hi, after breaking the dynamic scene into rigid pieces, t_cz would become the relative motion between the camera and the rigid piece. Such relative motion can be computed as t_cz - tmo.
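As a rough illustration of the per-piece computation, the sketch below applies Z = (t_cz - t_mo) / (1 - tau) to one rigid piece. It assumes tau = Z'/Z and that both translations are measured along the camera's z-axis in the same units; the function name `depth_of_rigid_piece` and all numbers are hypothetical.

```python
def depth_of_rigid_piece(tau, t_cz, t_mo=0.0):
    """Depth of a rigid piece from its depth ratio and relative z-translation.

    Assumptions: tau = Z'/Z; t_cz is the camera's forward translation and
    t_mo is the piece's own translation along the same axis.
    """
    relative = t_cz - t_mo                 # relative motion between camera and piece
    denom = 1.0 - tau
    if denom == 0.0:
        raise ValueError("tau == 1: no relative motion in depth, depth is unobservable")
    return relative / denom

# Made-up example: true depth 10, camera moves forward by 1.
z_static = depth_of_rigid_piece(tau=0.9, t_cz=1.0)               # static piece: Z' = 9
z_moving = depth_of_rigid_piece(tau=0.95, t_cz=1.0, t_mo=0.5)    # piece moves 0.5: Z' = 9.5
z_ignoring = depth_of_rigid_piece(tau=0.95, t_cz=1.0)            # same piece, t_mo neglected

print(z_static, z_moving, z_ignoring)
```

In this example, neglecting the object's own motion roughly doubles the estimated depth, which is the kind of error raised in the question. The scale ambiguity between pieces remains, because each piece's relative translation is only known up to scale.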
If I understand your second point correctly, our analysis in Sec 4.5 makes a different point, which is not about dynamic objects. Even when the scene or piece is rigid, triangulating correspondences near the epipole leads to large depth error, because of the small "baseline".
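A small numerical sketch of this sensitivity, under simplifying assumptions that are not taken from the paper: a pinhole camera, pure forward translation t_z, and a fixed pixel error on the second-frame correspondence. Here r and r_next are radial distances from the epipole in the two frames, so that r_next / r = Z / (Z - t_z) and Z = t_z * r_next / (r_next - r). All names and numbers are hypothetical.

```python
def depth_from_forward_motion(r, r_next, t_z):
    """Triangulated depth (before motion) from radial distances to the epipole."""
    return t_z * r_next / (r_next - r)

def relative_depth_error(r, z_true=10.0, t_z=1.0, noise_px=0.5):
    """Relative depth error when the second-frame position is off by noise_px."""
    r_next = r * z_true / (z_true - t_z)       # noise-free position in frame 2
    z_est = depth_from_forward_motion(r, r_next + noise_px, t_z)
    return abs(z_est - z_true) / z_true

# Same pixel noise, two correspondences at different distances from the epipole.
print(relative_depth_error(5.0))     # close to the epipole
print(relative_depth_error(200.0))   # far from the epipole
```

The displacement r_next - r shrinks in proportion to r, so the same pixel error produces a much larger relative depth error close to the epipole.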
Please feel free to follow up if you have additional questions.