rotationnet
How many CNNs are used?
Hi,
I am confused about how you handle both pose estimation and category classification. Do you use different CNNs for the two tasks? When I checked the code, there seemed to be a single model, but from the paper I inferred that several CNNs are used (mainly from Figure 2; I suspect I missed the main idea). I would appreciate it if you could explain the architecture clearly. Thanks in advance.
Hi,
In Fig. 2, the weights of the CNNs are shared, so there is actually a single CNN model. That model outputs an M(N+1)-dimensional vector for each image, where M denotes the number of discrete poses and N denotes the number of categories. Therefore, a single CNN classifies both category and pose.
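To make the output shape concrete, here is a minimal sketch in plain Python (not the repository's code) of how a flat M(N+1)-dimensional vector could be viewed as M rows of N+1 scores, one row per discrete pose. The function names `split_output` and `softmax`, the pose-major memory layout, the per-row softmax, and the toy values of M and N are all assumptions made for illustration; the thread does not specify them, nor what the extra "+1" slot represents.

```python
import math


def split_output(flat, num_poses, num_categories):
    """View a flat M*(N+1) vector as M rows of N+1 scores.

    Assumption: scores are stored pose-major, i.e. the first N+1
    entries belong to pose 0, the next N+1 to pose 1, and so on.
    """
    width = num_categories + 1
    if len(flat) != num_poses * width:
        raise ValueError(
            f"expected {num_poses * width} values, got {len(flat)}"
        )
    return [flat[i * width:(i + 1) * width] for i in range(num_poses)]


def softmax(scores):
    """Numerically stable softmax over one row of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy sizes (hypothetical): M = 3 discrete poses, N = 4 categories.
M, N = 3, 4
flat = [0.1 * i for i in range(M * (N + 1))]  # stand-in for the CNN output

rows = split_output(flat, M, N)      # M rows, each with N+1 scores
probs = [softmax(r) for r in rows]   # assumed: one distribution per pose
```

Because every image passes through the same shared-weight network, each image yields one such M by (N+1) table, which is why a single model can carry both pose and category information.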