When will the new, improved blendshape prediction model v3 be implemented into Mediapipe?

Open ferdiwerthi opened this issue 1 year ago • 10 comments

Hello dear Mediapipe team,

Our team found the Google Research paper “Blendshapes GHUM: Real-time Monocular Facial Blendshape Prediction” from September 11, 2023, which describes a new, improved blendshape prediction model that appears to address the shortcomings of the current model.

We would thus love to learn more from your team about the timeline for the new model's implementation into Mediapipe.

I believe that detailed and accurate facial animation is at THE CORE of realistic character animation, so precise blendshape prediction is hugely important. The new model promises a huge leap forward in facial animation quality for digital characters of all types, be it photorealistic humans, stylized anime, cartoon characters, or any other type that comes to mind.

The current model is only suitable for animating characters with little detail, such as cartoon and anime characters, where fewer and less accurate blendshapes are somewhat tolerable. Even those experiences are limited by the current model, since many basic emotions cannot be replicated due to missing or inaccurately predicted blendshape data. For animating realistic human characters (our team's use case), however, the blendshape data is unusable and lags far behind the prediction quality of Apple's ARKit and NVIDIA Maxine. This comparison video between Apple ARKit, NVIDIA Maxine, and Mediapipe shows the huge quality gap very clearly and closely matches our extensive testing results with all three prediction models.

It would be amazing if you, the Mediapipe team, could implement the new, improved model sometime soon. It would instantly improve many people's projects, make their users' experiences much better, and unlock many new use cases for the community.

If anyone out there is interested in this model as much as my team and I, please show the Google Mediapipe team your interest by upvoting this issue.

A huge THANK YOU to the Google Mediapipe team, the Google Research team, and to anyone in the community contributing to this great project!!!

Greetings, Ferdinand


Mediapipe blendshape prediction problems we ran into (current blendshape model v2)

  • Missing blendshape data ('noseSneer', 'mouthFrown', 'jawForward', 'cheekSquint', 'cheekPuff')
  • Inaccurate blendshape data ('eyeWide', 'mouthDimple', 'mouthPucker')
  • Unstable expression recognition during head rotations
  • Left and right blendshape values differ strongly (up to 40% difference)
  • Eye glitches
  • The same problems occur whether using the Face Landmarker or the Holistic model (as expected)
  • Calibrating min values (neutral face) and max values (to capture the user's maximum motion range) did not bring much improvement with Mediapipe due to the missing and imprecise initial blendshape data. Calibrating with Maxine and Apple ARKit produced much better results.
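For context, the min/max calibration mentioned in the last point can be sketched as a simple per-user remapping (a minimal sketch; the function name and the sample values are illustrative, not part of Mediapipe's API):

```python
def calibrate_blendshape(raw: float, neutral: float, peak: float) -> float:
    """Remap a raw blendshape score to 0..1 using per-user calibration.

    neutral: score observed on the user's resting face (the min value)
    peak:    score observed at the user's strongest expression (the max value)
    """
    if peak <= neutral:
        # No usable motion range was captured for this blendshape --
        # exactly the failure mode when a score stays near 0.
        return 0.0
    t = (raw - neutral) / (peak - neutral)
    return max(0.0, min(1.0, t))  # clamp to the valid blendshape range
```

If a blendshape's raw score never rises above its neutral value (as with 'cheekPuff' in the issues linked below), the captured range collapses and the remapped output is stuck at 0, which is why calibration cannot recover missing data.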

The result:

  • Basic emotions like fear, anger, disgust, and laughter cannot be expressed with Mediapipe's blendshape model (calibration does not help either), and the constant eye glitches make it unusable for many intended use cases.
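The left/right discrepancy we measured can be checked with a small helper that pairs up `...Left`/`...Right` scores (a minimal sketch; the blendshape names in the example and the 0.4 threshold, i.e. the 40% gap we observed, are illustrative):

```python
def asymmetric_pairs(scores: dict[str, float], threshold: float = 0.4) -> dict[str, float]:
    """Return base names of blendshape pairs whose left/right scores
    differ by more than `threshold`, mapped to the size of the gap."""
    bad = {}
    for name, left in scores.items():
        if not name.endswith("Left"):
            continue
        # Look up the matching right-side blendshape, e.g. eyeBlinkRight.
        right = scores.get(name[: -len("Left")] + "Right")
        if right is not None and abs(left - right) > threshold:
            bad[name[: -len("Left")]] = round(abs(left - right), 3)
    return bad
```

Running this over each frame's blendshape output makes the asymmetry easy to log and compare across trackers.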

Related issues about missing or imprecise blendshape data

  • If Face BlendShape is a production ready solution? - https://github.com/google/mediapipe/issues/4210
  • eyeWidenLeft/eyeWidenRight/noseSneerLeft/noseSneerRight blendshapes are always near 0 - https://github.com/google/mediapipe/issues/4450
  • cheekPuff blendshape is always near 0 - https://github.com/google/mediapipe/issues/4436
  • tongueOut is missing in blendshapes output - https://github.com/google/mediapipe/issues/4403

ferdiwerthi avatar Apr 18 '24 15:04 ferdiwerthi

Hi @yichunk,

Could you please look into this issue?

Thank you!!

kuaashish avatar Apr 23 '24 09:04 kuaashish

Hello @yichunk @kuaashish Our team and I would be very thankful if you could share any updates, since it's a crucial topic for our current project.

We're already looking forward to the day we can reintegrate Mediapipe, with a newer, more capable blendshape prediction model, into our web app :)

Thank you very much, Ferdinand

ferdiwerthi avatar May 17 '24 15:05 ferdiwerthi

Hello dear Mediapipe team. I hope you're all well :)

I was wondering if you can provide any updates on this topic yet?

This is the most upvoted (14) issue as far as I can see and the related issue https://github.com/google-ai-edge/mediapipe/issues/4210 is the 3rd most commented topic (34 comments) amongst all issues so far. I would really appreciate if you could look into this and give a short update. I'm sure that it would help a lot of people.

Thank you very much and many greetings, Ferdinand

ferdiwerthi avatar Sep 05 '24 10:09 ferdiwerthi

Hi, I have the same question. Besides, I have a simpler question: why is blendshape coefficient one not always one, given that it represents the neutral face? I would appreciate any advice on this.

2hiTee avatar Sep 10 '24 07:09 2hiTee