mediapipe
Support for Depth-Aware Cameras (e.g. Intel RealSense, Windows Hello)
MediaPipe Solution (you are using)
No response
Programming language
No response
Are you willing to contribute it
None
Describe the feature and the current behaviour/state
Several cameras on the market in the $100-$200 price range provide an IR channel alongside the RGB channels. Because MediaPipe models are trained on RGB only, they cannot use the extra channel to improve accuracy or performance. This feature would make it possible to use models that accept RGB + IR camera feeds.
Will this change the current API? How?
The API would probably need to support 4-channel input in order to include infrared alongside RGB.
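To make the 4-channel idea concrete, here is a minimal sketch of how a caller might pack an RGB frame and an IR frame into one interleaved buffer before handing it to a model. The helper name `pack_rgb_ir` and the "RGBI" layout (IR in the fourth slot, where alpha would normally sit) are assumptions for illustration only. MediaPipe does not currently define such a format, and the sketch assumes both frames are 8-bit, the same resolution, and already aligned.

```python
def pack_rgb_ir(rgb: bytes, ir: bytes, width: int, height: int) -> bytes:
    """Interleave a packed RGB frame and a single-channel IR frame.

    Hypothetical helper: the 4-channel "RGBI" layout is an assumption made
    for illustration, not a format MediaPipe is known to accept.
    """
    pixels = width * height
    if len(rgb) != pixels * 3:
        raise ValueError("rgb must hold width * height * 3 bytes")
    if len(ir) != pixels:
        raise ValueError("ir must hold width * height bytes")

    out = bytearray(pixels * 4)
    out[0::4] = rgb[0::3]  # red channel
    out[1::4] = rgb[1::3]  # green channel
    out[2::4] = rgb[2::3]  # blue channel
    out[3::4] = ir         # infrared in the fourth slot
    return bytes(out)


# Usage: a 2x1 frame with two RGB pixels and two IR samples.
frame = pack_rgb_ir(bytes([10, 20, 30, 40, 50, 60]), bytes([1, 2]), 2, 1)
```

Reusing a 4-byte-per-pixel layout keeps the buffer the same size and stride as an RGBA image, which might make it easier to fit into existing image containers, though the model itself would still need to be trained on the extra channel.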
Who will benefit with this feature?
Users in low-light conditions, as well as users with high performance requirements, such as VTubers.
Please specify the use cases for this feature
One example is a subset of users known as "VTubers". These users currently rely on the iPhone X and later models as their primary face-tracking camera, because Apple ARKit's internal model tracks significantly better than MediaPipe. Many users forgo a camera entirely when the cost of an iPhone is prohibitive, rather than accept the limitations of MediaPipe. ARKit can offer this improvement because its model operates on 4-channel bitmaps (RGB + infrared), while MediaPipe uses only RGB. Since there are depth-sensing cameras on the market that cost significantly less than an iPhone X (e.g. Logitech Brio 4K with Windows Hello), this feature would provide a low-cost alternative with performance comparable to ARKit.
Other examples: see #2645 and https://github.com/google-ai-edge/mediapipe/issues/2920#issuecomment-1673189436 (low-light conditions in hospitals)
Any Other info
No response