
How are the leg poses estimated from only head and hand poses offered, intuitively?

Open yd-yin opened this issue 3 years ago • 2 comments

Hi, thanks for your really great work!

Just a silly question: for this "full-body pose tracking from sparse motion sensing" task, how are the leg poses estimated, intuitively? I think there could be severe ambiguity in the leg poses given only one head pose and one pair of hand poses.

For example, in the demo of Figure 7

  • For the rightmost case, how could it know which leg is at the front?
  • For the middle case, how could it know the person is sitting? (Maybe the location of the head indicates the head is below a normal height?)

I think maybe the temporal information can help eliminate some of the ambiguity. For example, a moving head and swinging arms might indicate that the person is walking?
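To make the temporal-information idea concrete, here is a minimal sketch of how a sliding window of sparse tracker signals could be stacked into a sequence-model input. The feature layout (3 trackers × position + 6D rotation) and the function name are my assumptions for illustration, not AvatarPoser's actual interface:

```python
import numpy as np

def make_windows(signals: np.ndarray, window: int) -> np.ndarray:
    """signals: (T, F) per-frame head+hand features -> (T-window+1, window, F).

    Each output row is a short history of tracker signals; a sequence model
    (Transformer, LSTM, ...) sees motion over time, not a single frame, which
    is what could let it tell walking from standing.
    """
    T, F = signals.shape
    return np.stack([signals[t:t + window] for t in range(T - window + 1)])

# Hypothetical layout: 3 trackers (head, left hand, right hand)
# x (3 position + 6D rotation) = 27 features per frame.
T, F = 100, 27
signals = np.random.randn(T, F).astype(np.float32)
windows = make_windows(signals, window=40)
print(windows.shape)  # (61, 40, 27)
```

With a single frame the leg pose is underdetermined, but over a 40-frame window the head/hand trajectories carry gait information.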

Still, I am generally amazed by how well the leg poses align.

Thanks a lot!

yd-yin avatar Oct 31 '22 15:10 yd-yin

I have tried to reproduce this paper. In fact, with only head and hand poses, we cannot predict leg poses at all! In addition, the inverse kinematics solver is not used anywhere in the code! The Transformer does not predict human pose very well (compared with an LSTM or a CNN).
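On the "no inverse kinematics" point: if a network regresses joint rotations directly, joint positions follow from forward kinematics, so no IK solver is needed at inference. Below is a hedged sketch of a tiny FK chain (a made-up two-joint leg, not the paper's actual skeleton or code):

```python
import numpy as np

def rot_z(theta: float) -> np.ndarray:
    """Rotation matrix about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def forward_kinematics(root, rotations, offsets):
    """Accumulate local rotations down a chain; return each joint's world position."""
    pos, R = root.copy(), np.eye(3)
    positions = []
    for Rj, off in zip(rotations, offsets):
        R = R @ Rj          # compose parent rotation with local joint rotation
        pos = pos + R @ off  # move along the (rotated) bone offset
        positions.append(pos.copy())
    return np.array(positions)

root = np.zeros(3)
# Hypothetical bone offsets: hip->knee and knee->ankle, 0.4 m each, pointing down (-y).
offsets = [np.array([0.0, -0.4, 0.0]), np.array([0.0, -0.4, 0.0])]
# Hypothetical rotations, standing in for a network's predicted joint angles.
rotations = [rot_z(0.3), rot_z(-0.5)]
positions = forward_kinematics(root, rotations, offsets)
print(positions)  # world positions of knee and ankle
```

Given predicted rotations, positions come out of this FK pass for free; an IK solver would only be needed to go the other way, from target positions back to rotations.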

ghost avatar Apr 09 '23 00:04 ghost

> I have tried to reproduce this paper. In fact, with only head and hand poses, we cannot predict leg poses at all! In addition, the inverse kinematics solver is not used anywhere in the code! The Transformer does not predict human pose very well (compared with an LSTM or a CNN).

Hello, I'd like to ask if you have any papers related to CNN- or LSTM-based pose estimation (with source code). Thanks!

Recialhot avatar Jan 05 '24 04:01 Recialhot