FoundationPose Frequent Pose Estimation Failures on Custom Data Despite Following Repository Instructions

Hello,

First of all, thank you for your excellent work on FoundationPose. We have followed the repository instructions carefully and prepared all the required data, including:

Depth stream
RGB stream
3D mesh (OBJ format)
Mask of the first frame
Camera intrinsics

However, when running FoundationPose on our own data, we frequently encounter the issue shown in the attached video.
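As a possible first debugging step, the inputs listed above can be sanity-checked before they reach the pose estimator. The following is only a minimal sketch, not part of FoundationPose: the function name `check_inputs` and the threshold of 20 are hypothetical, and it assumes depth is expected in metres and that the mask, depth, and RGB frames are NumPy arrays that should share one resolution.

```python
import numpy as np


def check_inputs(rgb, depth, mask, K):
    """Return a list of warnings about the first-frame inputs (hypothetical helper)."""
    problems = []
    h, w = rgb.shape[:2]

    # A multi-channel mask is reduced to its first channel.
    if mask.ndim == 3:
        mask = mask[..., 0]

    # All three images should be pixel-aligned at the same resolution.
    if depth.shape[:2] != (h, w):
        problems.append("depth and RGB resolutions differ")
    if mask.shape[:2] != (h, w):
        problems.append("mask and RGB resolutions differ")

    # The intrinsics should be a 3x3 matrix with the principal point inside the image.
    K = np.asarray(K, dtype=float)
    if K.shape != (3, 3):
        problems.append("intrinsics are not a 3x3 matrix")
    elif not (0 < K[0, 2] < w and 0 < K[1, 2] < h):
        problems.append("principal point lies outside the image")

    # Assumption: depth is in metres, so a median above 20 suggests millimetres.
    valid = depth[np.isfinite(depth) & (depth > 0)]
    if valid.size == 0:
        problems.append("depth has no valid pixels")
    elif np.median(valid) > 20:
        problems.append("depth looks like millimetres, not metres")

    # The masked region should be non-empty and mostly covered by valid depth.
    if mask.shape[:2] == depth.shape[:2]:
        on_object = depth[mask.astype(bool)]
        if on_object.size == 0:
            problems.append("mask is empty")
        elif np.mean(on_object > 0) < 0.5:
            problems.append("most masked pixels have no depth")

    return problems
```

An empty list means none of these basic checks fired; it does not rule out other causes such as a mesh whose scale does not match the depth units.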

In particular, we suspect that FoundationPose's performance degrades significantly when the camera shakes or moves toward or away from the object. The failures in our video appear more severe under these conditions.

https://github.com/user-attachments/assets/b56730fe-ec2c-4eab-8a78-7dfdddf22a4e

Could you please advise on how we might optimize or debug this part of the pipeline? Where would you recommend we start our investigation?

Thank you for your time and support.

Aug 14 '25 09:08 Vincia-Jun

Hi,

May I ask how you trained it for your custom object? Or was this object already included in the "Google Scanned Objects" dataset?

Aug 14 '25 10:08 Manu752

Could you please provide some guidance on how you did this? Also, have you solved the problem?

Aug 26 '25 21:08 Dourjoyrup