gesticulator

Penetrating hands through spines

Open cl3386 opened this issue 2 years ago • 4 comments

Hi, Svito-zar.

First of all, many thanks for the awesome project and paper. I have been exploring your project and successfully generated BVH data with the demo.py script.

However, when I imported the BVH data into Blender to visualize it, I encountered an unexpected phenomenon. As shown in the attached images (captured in Blender), the hands penetrate some bones (mostly bones around the spine area). (Attached images: taras3, taras2, taras)

I have been investigating why this is happening, but so far I haven't found any clue. Could you please share your opinion on this issue? I would also be very grateful if you could guide me on how to prevent this, either at the code level or at the Blender level (preferably a code-level solution).

Again, I appreciate your awesome project! Many Thanks.

cl3386 • May 03 '22 11:05

Hi @cl3386 ,

Thank you for your interest in this work.

This is an interesting issue. As I am not an expert in Blender, I don't have a good answer for you right now. Maybe somebody else can answer this later.

Was there anything particular about the audio track that you used? Did you apply the model to one of the provided audio files?

Svito-zar • May 04 '22 11:05

Dear Taras,

I ran Genea Visualizer with the sample audio/transcript data from the GitHub repo and still found the same penetrating-hands phenomenon as in Blender.

I am attaching the result video from Genea Visualizer on the sample data (Jeremy Howard). Please take a look at the attached video for a better understanding of my question :) https://user-images.githubusercontent.com/40153433/174682075-7308a1a0-75c5-4163-befc-acf6f225aeaa.mp4

I think the 3D character visualization might need some adjustment, but I have no clue how to do that. Can you suggest any direction?

In addition to the question above, I am curious about the incredibly efficient training process of the Gesticulator model. As far as I understand, the model uses only 23 training samples (motion data) for training. I am wondering how this relatively (and absolutely) small dataset can train the model so efficiently. Could you explain what made this possible?

Thanks for reading my 2 questions, and I look forward to hearing from you soon :)

Many Thanks.

cl3386 • Jun 20 '22 21:06

Taras is on holiday this week and might not be able to give you an answer for a little while. In the meantime, I can possibly provide some pointers, but there's a lot that I don't know.

> I am attaching the result video from Genea Visualizer on the sample data (Jeremy Howard).

Is this result video generated from recorded motion-capture data, or from output generated by a trained Gesticulator model? If it's the former, which data sequence and time range (start time) are you visualising?

Also, what relationship does Jeremy Howard have with the video that you shared?

> As far as I understand, the model uses only 23 training samples (motion data) for training.

What do you define as a "sample" here? Do you mean something like "sequence"?

I think it's possible to train a great motion model on just one single sequence, if that sequence is long enough (i.e., many hours). What matters is the total amount of relevant data that you have to train on, not how many or how few pieces it consists of.

ghenter • Jun 20 '22 22:06

@cl3386, the model was trained on 4 hours of recordings from the Trinity-Gesture Dataset. It consists of 23 takes of around 10 minutes each. That is actually not a small amount of data :)

As for the hands penetrating bones: there is nothing in the model that explicitly prevents it, so it can happen. It did not happen with the audio from the dataset used. Have you looked at results for the same speaker the model was trained on? Under this link are the results I obtained during my experiments.
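Since nothing in the model rules out penetration, one possible code-level direction is to detect it after generation. The following is only a minimal sketch and is not part of Gesticulator: it assumes you have already converted the BVH output to world-space joint positions (for example, via forward kinematics), and the function names, the data layout, and the `radius` threshold are all hypothetical choices made for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from 3D point p to the line segment a-b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    ab_len_sq = sum(c * c for c in ab)
    if ab_len_sq == 0.0:
        # Degenerate bone of zero length: fall back to point distance.
        return math.dist(p, a)
    # Project p onto the segment and clamp to its end points.
    t = sum(ap[i] * ab[i] for i in range(3)) / ab_len_sq
    t = max(0.0, min(1.0, t))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def penetrating_frames(hand_positions, spine_segments, radius):
    """Indices of frames where the hand is within `radius` of a spine bone.

    hand_positions: one (x, y, z) tuple per frame (assumed world space).
    spine_segments: per frame, a list of ((x, y, z), (x, y, z)) bone segments.
    radius: hypothetical body thickness around each bone, in the same units.
    """
    flagged = []
    for frame, (hand, segments) in enumerate(zip(hand_positions, spine_segments)):
        if any(point_segment_distance(hand, a, b) < radius for a, b in segments):
            flagged.append(frame)
    return flagged


# Toy data: a single vertical spine bone, identical in every frame.
spine = [((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
hands = [(0.5, 0.5, 0.0), (0.05, 0.5, 0.0), (0.0, 1.2, 0.0)]
print(penetrating_frames(hands, [spine] * len(hands), radius=0.1))  # [1]
```

Flagged frames could then be inspected in Blender or corrected in a post-processing step; how to correct them (for instance, by pushing the hand back outside the radius) is a separate design choice that this sketch does not address.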

Svito-zar • Jun 29 '22 08:06

Closing due to inactivity

Svito-zar • Nov 14 '22 08:11