MiDaS
Converting GT labels into Disparity Space
Thank you for your contributions and for your amazing work.
Your paper mentions that you perform prediction in disparity space in order to handle representation, scale, and shift ambiguities across multiple datasets; however, I could not figure out how you convert ground-truth depths into disparity maps before applying your loss functions.
I know the depth-disparity relation in the following form: `depth = (baseline * focal_length) / disparity`, so for calculating disparity: `disparity = (baseline * focal_length) / depth`. But how do I decide on the baseline and focal length parameters? (For instance, DIML-Indoor ground truths are provided as 16-bit PNG depth maps; how do I convert them to disparity space before feeding them to the loss?)
Moreover, the paper mentions "We shift and scale the ground-truth disparity to the range [0, 1] for all datasets." Is this based on statistics across all datasets, or on a fixed range?
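For concreteness, here is a minimal sketch of what I would currently try, assuming DIML-Indoor depths are 16-bit millimetre values and that the normalization is per-image min-max (both are assumptions on my part). If that reading is right, any constant factor such as `baseline * focal_length` cancels under a per-image shift/scale, so it would never need to be known:

```python
import numpy as np
import cv2

# Hypothetical file path; DIML-Indoor depth assumed to be 16-bit PNG in millimetres.
depth = cv2.imread("diml_indoor/depth/000001.png", cv2.IMREAD_UNCHANGED).astype(np.float32)

valid = depth > 0                      # zero typically marks missing measurements
disparity = np.zeros_like(depth)
disparity[valid] = 1.0 / depth[valid]  # disparity ∝ 1/depth; baseline*focal dropped

# Per-image shift and scale to [0, 1], as I read the sentence in the paper.
d_min, d_max = disparity[valid].min(), disparity[valid].max()
disparity[valid] = (disparity[valid] - d_min) / (d_max - d_min)
```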
Thank you for your guidance and precious time.
I've been thinking about the same, so I won't create a new issue. I've been reading a lot of issues on this repository and on DPT, as well as the paper, since I'm planning on training DPT. As I understand it, training happens in disparity space, and, as mentioned in one of the issues, disparity is proportional to inverse depth.

- So, for example, I have a depth dataset from a game engine which provides depth encoded in the range `[0, 255]`. To convert it to inverse depth I would simply compute `1.0 / D`, which makes it proportional to disparity and in the `[0, 1]` range. After that, is this dataset good to go for training? (See the sketch after this list.)
- If we have a disparity dataset, do we do the same conversion to scale the dataset to the range `[0, 1]`?
- If we have an SfM dataset, do we do the same step, with the motivation from point 1?
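Here is a rough sketch of the conversion I have in mind for all three cases, under the assumption that, because the loss is scale- and shift-invariant, disparity only needs to be correct up to an affine transform. The helper name is mine, not from this repo:

```python
import numpy as np

def to_normalized_disparity(gt, gt_is_disparity=False, eps=1e-6):
    """Map ground truth (depth or disparity) to disparity normalized to [0, 1].

    Hypothetical helper, not part of MiDaS/DPT. Should cover metric depth,
    up-to-scale SfM depth, and raw disparity alike, since the
    scale-and-shift-invariant loss absorbs any remaining affine factor.
    """
    gt = gt.astype(np.float32)
    valid = gt > 0                        # treat non-positive values as invalid
    out = np.zeros_like(gt)
    if gt_is_disparity:
        out[valid] = gt[valid]            # already proportional to disparity
    else:
        out[valid] = 1.0 / gt[valid]      # invert depth (metric or up-to-scale)
    # Per-image shift/scale to [0, 1].
    d_min, d_max = out[valid].min(), out[valid].max()
    out[valid] = (out[valid] - d_min) / max(d_max - d_min, eps)
    return out
```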
Hi, did you find a workaround for how to "shift and scale the ground-truth disparity to the range [0, 1] for all datasets"? I want to reproduce DPT (MiDaS 3.0) too and don't know how to preprocess the datasets.
Same problem
Hi, do you know how to train a metric depth dataset (DIML) and a relative depth dataset (ReDWeb) together? I have the same question.