3D-BoundingBox
how to train on custom datasets
I have my own dataset in a Pascal VOC-like format, with labeled images only. I do not have a calibration file. How can I train on my own dataset?
@njuxjx Hi, I have the same problem. Have you found a solution?
I have trained this model on a different dataset than the one used in this repo. For your custom dataset you'll need labelled regions and some more info on those labelled objects, more precisely you'll need :
- Bounding Box 2D
- W, H, L of the object
- 3D Yaw Angle
- Camera Intrinsic Matrix
When loading the dataset and preparing it for training, you'll have to compute the local orientation angle of the object. First find arctan(x, z) (the two-argument arctangent), where x and z are the 3D coordinates of the 3D bounding box center. Then alpha (the local orientation) is simply: yaw - arctan(x, z)
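The formula above can be sketched in Python as follows. This is a minimal sketch, not code from the repo: the function name `local_orientation` is hypothetical, it assumes camera coordinates with x to the right and z pointing forward, and the wrap into [-pi, pi) is an added convenience that the comment above does not mention.

```python
import math

def local_orientation(yaw, x, z):
    """Return alpha = yaw - atan2(x, z), wrapped into [-pi, pi).

    yaw  : global yaw angle of the object, in radians
    x, z : coordinates of the 3D bounding box center (assumed camera frame)
    """
    ray_angle = math.atan2(x, z)  # two-argument arctangent, i.e. arctan(x, z)
    alpha = yaw - ray_angle
    # Wrapping is an assumption added here so the result stays in one period.
    return (alpha + math.pi) % (2.0 * math.pi) - math.pi

# Example: object 1 m to the right and 1 m ahead, with a yaw of 90 degrees.
alpha = local_orientation(yaw=math.pi / 2, x=1.0, z=1.0)
```

With these example values the ray angle is 45 degrees, so alpha comes out as 45 degrees as well.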
@Joywalker thank you for clarifying. I have a few questions, I'd be grateful if you can answer:
"W, H, L of the object": Do you mean pixel values of W, H, L or the real-life object dimensions?
"3D Yaw Angle": How did you obtain this? Are alpha and the 3D yaw angle the same thing?
Also what is the label file format that you used to save these values for training?
> "3D Yaw Angle": How did you obtain this? Are alpha and the 3D yaw angle the same thing?
Theta (the angle drawn in red) represents the global orientation, which is the yaw angle in the global plane. Theta ray and Theta l can be computed as I've explained in the previous comment.
Check out this article, it explains the whole process.
> "W, H, L of the object": Do you mean pixel values of W, H, L or the real-life object dimensions?
That would be the real-size 3D dimensions of the object. When preparing the data for training, the target is the difference between the real W, H, L of the object and the average W, H, L of that object's class. This offset is what defines the 3D dimensions when computing the 3D box.
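A minimal sketch of that offset computation is shown here. The helper names, the example numbers, and the sign convention (real minus class average) are assumptions for illustration; check the repo's dataset loader for the convention it actually uses.

```python
def dimension_offset(real_dims, class_average_dims):
    """Offset of an object's real (W, H, L) from its class average, in meters.

    Sign convention (real - average) is an assumption of this sketch.
    """
    return tuple(r - a for r, a in zip(real_dims, class_average_dims))

def recover_dimensions(offset, class_average_dims):
    """Invert dimension_offset: add the class average back to the offset."""
    return tuple(o + a for o, a in zip(offset, class_average_dims))

# Hypothetical numbers, not taken from any dataset.
average_car = (1.6, 1.5, 3.9)   # average W, H, L for the class
real_car = (1.7, 1.4, 4.2)      # real W, H, L of one labelled object

offset = dimension_offset(real_car, average_car)
recovered = recover_dimensions(offset, average_car)
```

At inference time the same class average would be added back to the predicted offset, as `recover_dimensions` does, to obtain metric dimensions for the 3D box.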
Thanks!
Hi @Joywalker I am also using my own custom dataset and have all the information you describe above.
I am looking at these comments and am confused about which angle you call 'alpha'. Is this theta l?
Also, just looking at the geometry should the local orientation (theta ray) not be equal to arctan(z/x)? I am also confused why you seemed to have described the arctan function having two arguments (you have written arctan(x, z)).
EDIT -------------------------------------------- EDIT
Apologies, I have just been reading the attached article, and the arctangent function makes sense after reading this. However this article shows that the ray angle is calculated wrt. the principal point of the camera, whereas in the diagram you have attached above it is not taken wrt. the principal point, but wrt. the x axis which is causing me some confusion.
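The two conventions in this question can be compared numerically. The sketch below assumes an ideal pinhole camera with focal length `fx` and principal point `cx` (hypothetical values), and a box center in front of the camera (z > 0). Under those assumptions, the ray angle computed from the pixel column relative to the principal point equals atan2(x, z) of the 3D center, while an angle measured from the x axis, atan2(z, x), is its complement. Whether this matches the diagram and the article discussed above is not confirmed in this thread.

```python
import math

def ray_angle_from_center(x, z):
    """Ray angle measured from the optical (z) axis, using the 3D center."""
    return math.atan2(x, z)

def ray_angle_from_pixel(u, fx, cx):
    """Ray angle measured from the principal point, using the pixel column u."""
    return math.atan2(u - cx, fx)

# Hypothetical intrinsics and 3D box center (camera frame, z forward).
fx, cx = 720.0, 640.0
x, z = 2.0, 10.0

u = fx * x / z + cx  # pinhole projection of the center's x-coordinate

from_3d = ray_angle_from_center(x, z)
from_pixel = ray_angle_from_pixel(u, fx, cx)
from_x_axis = math.atan2(z, x)  # the same ray, measured from the x axis
```

Here `from_3d` and `from_pixel` agree, and `from_x_axis` equals pi/2 minus either of them, so the two ways of drawing the ray angle differ only by a fixed 90 degree offset.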