Question about pose prediction output in mono_model

Open ReekiLee opened this issue 2 years ago • 2 comments

Hello! Thanks for this great work! When training with mono_model, the posenet outputs tensors of size [B, 2, 1, 3] for both axisangle and translation. But when calculating the transformation matrix, only axisangle[:, 0] and translation[:, 0] are used (see https://github.com/kamiLight/CADepth-master/blob/8251f12f21393aae3261c3765218063cea1cae30/trainer.py#L289). Could you tell me why the posenet output is designed this way? Would it be OK to modify the last layer of the posenet to produce a [B, 1, 1, 6] output, consisting of a [B, 1, 1, 3] axisangle and a [B, 1, 1, 3] translation? Thank you in advance!
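For readers following along, here is a minimal NumPy sketch of how one axisangle and translation pair can be turned into a 4x4 transformation matrix using Rodrigues' rotation formula. The helper name pose_to_matrix and the [R | t] composition are assumptions made for illustration; the function in trainer.py may compose or invert the rotation and translation differently.

```python
import numpy as np

def pose_to_matrix(axisangle, translation):
    """Hypothetical helper: 3-vector axis-angle + 3-vector translation -> 4x4 matrix."""
    axisangle = np.asarray(axisangle, dtype=np.float64)
    translation = np.asarray(translation, dtype=np.float64)

    # The rotation angle is the norm of the axis-angle vector.
    theta = np.linalg.norm(axisangle)

    T = np.eye(4)
    if theta > 1e-8:
        k = axisangle / theta  # unit rotation axis
        # Skew-symmetric cross-product matrix of the axis.
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        # Rodrigues' formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
        T[:3, :3] = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

    # Assumed layout: rotation in the top-left block, translation in the last column.
    T[:3, 3] = translation
    return T

# Example: a 90 degree rotation about the z axis plus a translation.
T = pose_to_matrix([0.0, 0.0, np.pi / 2], [1.0, 2.0, 3.0])
point = T @ np.array([1.0, 0.0, 0.0, 1.0])
```

For a batch, the same function would be applied to each row of axisangle[:, 0] and translation[:, 0] after dropping the singleton dimension.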

ReekiLee, Feb 04 '23 11:02

axisangle[:, 0] is the axisangle between the (i-1)-th frame and the i-th frame, and axisangle[:, 1] is the one between the i-th frame and the (i+1)-th frame, so axisangle has 2 channels instead of 1. The same applies to translation.

Here we store the axisangle in a dict: https://github.com/kamiLight/CADepth-master/blob/8251f12f21393aae3261c3765218063cea1cae30/trainer.py#L284

And we use the axisangle at this line: https://github.com/kamiLight/CADepth-master/blob/8251f12f21393aae3261c3765218063cea1cae30/trainer.py#L362
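To make the shapes in this thread concrete, here is a minimal NumPy sketch of the indexing. It assumes the pose decoder emits six values per predicted frame and splits them into axisangle and translation; that layout, the batch size, and the variable names are illustrative assumptions, not code taken from the repository.

```python
import numpy as np

B = 4                      # hypothetical batch size
NUM_FRAMES_TO_PREDICT = 2  # the channel dimension discussed in this thread

# Stand-in for the raw pose decoder output: six values per predicted frame.
raw = np.arange(B * NUM_FRAMES_TO_PREDICT * 6, dtype=np.float32)
out = raw.reshape(B, NUM_FRAMES_TO_PREDICT, 1, 6)

# Assumed split: first three values are axisangle, last three are translation.
axisangle = out[..., :3]    # shape (B, 2, 1, 3)
translation = out[..., 3:]  # shape (B, 2, 1, 3)

# Selecting channel 0 keeps one pose per batch element.
axisangle_used = axisangle[:, 0]      # shape (B, 1, 3)
translation_used = translation[:, 0]  # shape (B, 1, 3)

print(axisangle.shape, axisangle_used.shape)
```

With a [B, 1, 1, 6] output instead, the same [:, 0] indexing would still yield (B, 1, 3) tensors, which is the case the original question asks about.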

kamiLight, Feb 04 '23 12:02

Hi @kamiLight, when the input is

pose_inputs = [pose_feats[-1], pose_feats[0]]

and the output is

axisangle, translation = self.models["pose"](pose_inputs)

there is only one pair of frames, (-1, 0), but the result is still a 2-channel axisangle of size [B, 2, 1, 3]. And when calculating the transformation matrix, only axisangle[:, 0] and translation[:, 0] are used.

Could you check again? Thank you for your time!

ReekiLee, Feb 04 '23 14:02