Lack of documentation for compiling MediaPipe as a TensorFlow.js WASM package
Description of issue (what needs changing)
The current MediaPipe documentation does not provide instructions or examples on how to compile the Face Mesh model (for instance) as a WASM package for TensorFlow.js.
Clear description
The current MediaPipe documentation does not provide instructions or examples on how to compile the Face Mesh model (for instance) as a WASM package for TensorFlow.js. I’m interested in this because I want to replace the built-in short-range FaceDetector with a full-range model (in the legacy Solutions). Frameworks such as Bazel and Emscripten are involved, and build options such as SIMD and XNNPACK are crucial for performance, yet they are poorly documented. MediaPipe's TFJS-facing solutions perform very well compared to native TFJS models, so I believe developers would benefit from having these compilation instructions documented.
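To make the request concrete, something along these lines is what I would expect the docs to cover. This is only a sketch: the --config=wasm entry and the target label are my assumptions (an Emscripten toolchain such as emsdk's Bazel rules would also need to be registered in the WORKSPACE first); only the individual flags are real.

```shell
# Hypothetical sketch of a Bazel + Emscripten WASM build of a MediaPipe graph.
# Assumptions (not real MediaPipe targets/configs):
#   --config=wasm : an assumed .bazelrc config selecting the emsdk Emscripten toolchain
#   //mediapipe/graphs/face_mesh:face_mesh_wasm : illustrative target label only
# Real flags: -msimd128 enables WASM SIMD in clang/Emscripten, and
# MEDIAPIPE_DISABLE_GPU=1 is MediaPipe's CPU-only build define.
bazel build -c opt \
  --config=wasm \
  --copt=-msimd128 \
  --define MEDIAPIPE_DISABLE_GPU=1 \
  //mediapipe/graphs/face_mesh:face_mesh_wasm
```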
Relevant links:
MediaPipe's JS solutions, with no compiling instructions: https://github.com/google-ai-edge/mediapipe/blob/master/docs/getting_started/javascript.md
MediaPipe's Face Mesh model, with no TFJS compiling option: https://github.com/google-ai-edge/mediapipe/blob/master/docs/solutions/face_mesh.md
TFJS MediaPipe Face Mesh: https://github.com/tensorflow/tfjs-models/tree/master/face-landmarks-detection/src/mediapipe
Desired libraries to compile: https://www.npmjs.com/package/@mediapipe/face_mesh
The most similar solution I have found: https://github.com/prantoran/mediapipe/tree/latest_wasm
Correct links
Yes, but incomplete.
Parameters defined
The documentation should explain the build parameters needed to compile MediaPipe models specifically as WASM output targeting TensorFlow.js.
Returns defined
No response
Raises listed and defined
No response
Usage example
No response
Request visuals, if applicable
No response
Submit a pull request?
No
I am trying to set up WASM, and I got the error below when building XNNPACK; I have no idea how to proceed:
ERROR: /root/.cache/bazel/_bazel_root/f39021be9f712a854cfac67637005ae8/external/XNNPACK/BUILD.bazel:804:36: Compiling external/XNNPACK/avxvnniint8_prod_microkernels.c failed: (Exit 1): gcc failed: error executing command (from target @XNNPACK//:avxvnniint8_prod_microkernels) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 102 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
gcc: error: unrecognized command-line option '-mavxvnniint8'; did you mean '-mavxvnni'?
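From what I can tell, this error means the host gcc predates the -mavxvnniint8 option (it was added around GCC 13). Two things that might help, with the caveat that the Bazel setting name below is my assumption; check external/XNNPACK/BUILD.bazel in your checkout for the exact flag name.

```shell
# Option 1: check whether the local gcc understands the flag at all.
# GCC >= 13 (and recent Clang) accept it; older compilers fail as above.
if gcc -mavxvnniint8 -E -x c /dev/null >/dev/null 2>&1; then
  echo "gcc supports -mavxvnniint8"
else
  echo "gcc too old: upgrade, or disable these microkernels"
fi

# Option 2: disable the AVX-VNNI-INT8 microkernels in the Bazel build.
# xnn_enable_avxvnniint8 is an assumption about the pinned XNNPACK version.
bazel build ... --define=xnn_enable_avxvnniint8=false
```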
Hi @prantoran,
Thanks for your response and for all your work on this topic. I have been closely following your threads and repositories.
I successfully compiled MediaPipe's face_mesh as WASM using your fork (link). However, this solution is packaged with face_detector and selfie_segmentation. Would it be possible to isolate the face_mesh module? Additionally, when I tested it, it ran at approximately 1 FPS. Could you provide some more information regarding the configuration of the model during your build?
Ideally, I was aiming for a solution similar to this one, which can be integrated directly into TensorFlow.js. Do you know of any workaround to achieve this?
Additionally, regarding the XNNPACK error you mentioned, could you provide details on the compilation process you're following?
Lastly, I came across this comment stating that no JavaScript documentation is expected for legacy MediaPipe Solutions, but there is for the new Tasks Vision API. I looked into it, and it seems possible to use this BUILD file. The model is available here, but I’m not sure how to build this packaged solution while introducing some changes. Do you have any insights?
Thanks again for your help!
Any hints @kuaashish ? Thanks!
@ginesmartinezros If you can build and run the branch, then you can comment out in the cpp (probably this), build using make build from the root dir, and comment out the face mesh drawing in renderer.js.
But I can't build my own code at the moment 😅
For setting up the compilation, I was looking into how tfjs-backend-wasm was set up. I did clean the cache using bazel clean --expunge, but it did not work.
Thanks for the answer. So is this related to the 1 FPS? If so, the performance is not optimal; maybe enabling SIMD would help, or discarding face_detection and selfie_segmentation. In any case, main.cpp and renderer.js are definitely the right places to remove or add MediaPipe modules.
To compile your branch, I had to make some minor changes (on Linux): https://github.com/ginesmartinezros/mediapipe-wasm-faceLandmarkFullRange/tree/develop-linux
Yeah, I also found https://github.com/emscripten-core/emsdk/tree/master/bazel quoted in tfjs-backend-wasm; it's a nice guide. But I don't think either https://www.npmjs.com/package/@mediapipe/face_mesh or https://github.com/prantoran/mediapipe/tree/latest_wasm uses tfjs-backend-wasm directly. That package is more like a runtime for TF.js models, whereas https://www.npmjs.com/package/@mediapipe/face_mesh already ships its own WASM binaries (with SIMD support but without multithreading support).
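Since those prebuilt binaries are SIMD-only, it may be worth feature-detecting SIMD before loading them, the way tfjs-backend-wasm does via the wasm-feature-detect library. A dependency-free sketch of that check, using the small test module those detectors validate (the byte sequence is taken from wasm-feature-detect's SIMD probe):

```javascript
// Validate a tiny WASM module that uses v128/i8x16 SIMD instructions.
// If the engine rejects it, fall back to a non-SIMD binary (or a JS path).
const SIMD_TEST_MODULE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,                   // "\0asm" magic + version
  1, 5, 1, 96, 0, 1, 123,                        // type section: () -> v128
  3, 2, 1, 0,                                    // one function of that type
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11,  // i32.const 0; i8x16.splat; i8x16.popcnt
]);

const simdSupported = WebAssembly.validate(SIMD_TEST_MODULE);
console.log(simdSupported ? "load SIMD wasm binary" : "load non-SIMD fallback");
```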
Have you tried Tasks Vision? It apparently has better compilation support for JS and the web, even though I don't know how to modify their .task models.