
Stereo

Open iariav opened this issue 11 months ago • 1 comments

Hi, thanks for the truly amazing work. I was wondering whether you plan to support stereo images in the future, to get even more precise depth estimation by leveraging the added information. I have a dataset of stereo images, and even though running MoGe on just one of the images already gives me pretty useful results, I want to try to improve them further by using the other image.
Thanks

iariav, Jan 19 '25 14:01

Hi, thank you for your interest! Your idea sounds solid and straightforward! We could consider extending the self-attention mechanism or adding cross-attention layers in the ViT to enable multi-image inputs and adopt similar end-to-end training. However, collecting stereo data and reformulating the model would require significant effort. I believe it would take us another paper to develop a well-grounded solution. We will consider it in future research. Thanks again for your suggestion!

EasternJournalist, Feb 22 '25 15:02
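
For readers wondering what the cross-attention idea mentioned above might look like, here is a minimal NumPy sketch. It is not MoGe code and not the authors' design: the function name `cross_attention`, the single-head layout, the token counts, and the random weights are all assumptions made for illustration. Patch tokens from the left image act as queries and attend to patch tokens from the right image.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(left_tokens, right_tokens, w_q, w_k, w_v, w_o):
    """Single-head cross-attention (hypothetical helper, not part of MoGe).

    left_tokens:  (n_left, dim)  patch tokens of the reference view (queries)
    right_tokens: (n_right, dim) patch tokens of the second view (keys/values)
    """
    q = left_tokens @ w_q
    k = right_tokens @ w_k
    v = right_tokens @ w_v
    # Scaled dot-product scores between every left and right token.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)  # each row sums to 1
    fused = attn @ v
    # Residual connection: left-view tokens are kept and enriched.
    return left_tokens + fused @ w_o, attn

# Toy sizes and random weights, chosen only for the demonstration.
rng = np.random.default_rng(0)
n_left, n_right, dim = 16, 16, 32
left = rng.standard_normal((n_left, dim))
right = rng.standard_normal((n_right, dim))
w_q, w_k, w_v, w_o = (
    rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(4)
)

out, attn = cross_attention(left, right, w_q, w_k, w_v, w_o)
print(out.shape, attn.shape)
```

The other option mentioned, extending self-attention, would roughly correspond to concatenating both views' tokens into one sequence and running ordinary self-attention over it. Either way, as noted above, the model would still need stereo training data to learn anything useful from the second view.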