
Support for multiple instances of same object class

Open mikkeljakobsen opened this issue 3 years ago • 5 comments

Is it possible to train the network on a dataset with multiple instances of the same object class? I've tried to adapt the code to my own dataset but I'm unsure how to handle multiple instances of the same object. Best regards, Mikkel

mikkeljakobsen avatar Apr 15 '21 09:04 mikkeljakobsen

It requires some modifications to handle multiple instances in one scene using our full models:

  • For data preprocessing, the semantic label stays the same, but the center and keypoint offset maps should be modified. Taking the center offset map as an example, it should be updated per instance rather than per class here.
  • The training scripts can be kept as they are.
  • For inference:
    • The PyTorch CUDA version of the MeanShift algorithm we implemented here only outputs one cluster center (one instance) for each class. It should be modified to output multiple instance clusters, or you can use the scikit-learn version on CPU.
    • Once each instance cluster is obtained, the subsequent keypoint and pose estimation procedure (after here) also needs to be modified accordingly to output the pose parameters of each instance.
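
For the CPU fallback mentioned above, scikit-learn's `MeanShift` returns every cluster it finds, so each cluster center can be treated as one instance hypothesis. A minimal sketch (the function name, data, and bandwidth value are illustrative, not from the FFB6D code; bandwidth would need tuning per dataset):

```python
import numpy as np
from sklearn.cluster import MeanShift

def cluster_instance_centers(voted_centers, bandwidth=0.05):
    """Cluster per-point center votes of ONE class into instance centers.

    voted_centers: (N, 3) array of predicted 3D centers (point + offset).
    Returns the 3D center of each detected instance and per-point labels.
    """
    ms = MeanShift(bandwidth=bandwidth)
    labels = ms.fit_predict(voted_centers)
    return ms.cluster_centers_, labels

# Synthetic votes for two instances of the same class:
np.random.seed(0)
votes = np.vstack([
    np.random.normal([0.0, 0.0, 0.5], 0.005, size=(100, 3)),
    np.random.normal([0.3, 0.1, 0.6], 0.005, size=(100, 3)),
])
centers, labels = cluster_instance_centers(votes, bandwidth=0.05)
print(centers.shape[0])  # number of instances found → 2
```

Unlike a single-mode implementation, every entry of `cluster_centers_` here becomes one instance to feed into the downstream keypoint/pose step.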

Another way is to utilize an off-the-shelf instance segmentation/detection architecture to obtain the RoI of each instance, crop the RoIs out, and train our pose estimation network on the crops. In this case, you can shrink the FFB6D network, e.g., replace ResNet34 with ResNet18 to speed it up.
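
For the per-instance offset maps described in the first bullet, the key change is computing each point's offset to the center of the instance it belongs to, rather than to a single per-class center. A minimal NumPy sketch, with names that are illustrative rather than taken from the FFB6D data loader:

```python
import numpy as np

def per_instance_center_offsets(points, instance_ids, instance_centers):
    """Compute each point's offset to its OWN instance's center.

    points:           (N, 3) scene points belonging to one object class.
    instance_ids:     (N,) integer instance index per point.
    instance_centers: (M, 3) one 3D center per instance of that class.
    Returns (N, 3) per-point offsets.
    """
    # Gather each point's own instance center, then subtract the point.
    return instance_centers[instance_ids] - points

pts = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
ids = np.array([0, 1])                     # point 0 -> instance 0, point 1 -> instance 1
ctrs = np.array([[0.5, 0.0, 0.0], [1.0, 2.0, 1.0]])
offs = per_instance_center_offsets(pts, ids, ctrs)
print(offs)  # [[0.5 0. 0.] [0. 1. 0.]]
```

The per-class version would instead subtract one shared center from all points of the class, which is exactly what breaks when a class appears more than once.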

ethnhe avatar Apr 16 '21 13:04 ethnhe

  • For data preprocessing, the semantic label stays the same, but the center and keypoint offset maps should be modified. Taking the center offset map as an example, it should be updated per instance rather than per class here.

Thanks for your answer. I tried to modify my data loader class according to your instructions. However, when I start training, I get this error:

RuntimeError: stack expects each tensor to be equal size, but got [6, 3, 4] at entry 0 and [5, 3, 4] at entry 2

I think the problem is that the number of instances differs from frame to frame, e.g. there may be 6 instances in one frame but only 5 instances in another. How do I deal with this issue?

mikkeljakobsen avatar Apr 20 '21 07:04 mikkeljakobsen

Has this problem been solved? Can anyone share code or more detailed suggestions?

hz-ants avatar Jul 11 '21 11:07 hz-ants

I'd also like to know if this has been solved :)

leoflcn avatar Jun 13 '22 22:06 leoflcn

Hi, can someone point me towards correcting the code to output poses for all objects in the scene?

nachi9211 avatar Aug 31 '22 12:08 nachi9211