Questions about training with my own object, point cloud–only input, and fine-tuning vs. training from scratch
Hello, and thank you for your excellent work!
I have a few questions regarding training the model with my own object:
I would like to detect my own custom object in a scene. How should I prepare and train the model for this use case? Is this supported? My current plan is to generate a synthetic dataset using BlenderProc, and then test the trained model in real-world scenes.
I am using a depth-only camera, and I do not have RGB images. Is it possible to train and run the model using only point clouds (depth) without RGB information?
For best results, do you recommend that I fine-tune the existing model, or train a new model from scratch for my custom object?
Thank you very much for your outstanding work, and I appreciate any guidance!
Hi @rooftop88 ,
Yes, it is possible to train on point clouds without RGB information. You can either modify the first convolutional layer from 3 input channels to 1, or duplicate your single channel 3 times.
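The second option (duplicating the single channel) needs no model changes. A minimal sketch, assuming your points are stored as an `(N, 4)` array of `(x, y, z, intensity)` and the model expects `(N, 6)` with `(x, y, z, r, g, b)` — the function name and layout are illustrative, not part of the UniDet3D API:

```python
import numpy as np

def depth_to_fake_rgb(points: np.ndarray) -> np.ndarray:
    """Tile the single intensity channel three times as fake RGB.

    points: (N, 4) array of (x, y, z, intensity).
    returns: (N, 6) array of (x, y, z, r, g, b) with r == g == b.
    """
    xyz = points[:, :3]
    intensity = points[:, 3:4]                   # keep 2-D shape (N, 1)
    fake_rgb = np.repeat(intensity, 3, axis=1)   # (N, 3)
    return np.hstack([xyz, fake_rgb])            # (N, 6)
```

This keeps pretrained 3-channel weights usable, since the first layer still sees 3 input channels.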
Before training UniDet3D you should extract superpoints for your data, which is not straightforward. We noticed that training from scratch is unstable, so we start from a OneFormer3D checkpoint.
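Since UniDet3D is built on mmdetection3d, starting from a checkpoint is typically done via the `load_from` field in the training config. A sketch, with a placeholder path (not a real released file name):

```python
# mmdetection3d-style config fragment: initialize training from a
# pretrained OneFormer3D checkpoint rather than random weights.
# The path below is a placeholder; point it at your downloaded checkpoint.
load_from = 'work_dirs/oneformer3d/oneformer3d_checkpoint.pth'
```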
Also, TR3D may be a better solution for your case. At least it doesn't require superpoints, and it trains much faster without the Hungarian algorithm in matching.
Thanks for your response. I tried TR3D and it works. Thanks for the suggestion!