
Lack of documentation for compiling MediaPipe as a TensorFlow.js WASM package

Open ginesmartinezros opened this issue 1 year ago • 7 comments

Description of issue (what needs changing)

The current MediaPipe documentation does not provide instructions or examples on how to compile the Face Mesh model (for instance) as a WASM package for TensorFlow.js.

Clear description

The current MediaPipe documentation does not provide instructions or examples on how to compile the Face Mesh model (for instance) as a WASM package for TensorFlow.js. I’m interested in this because I want to replace the built-in short-range FaceDetector with a full-range model (in the legacy solutions). Build tools such as Bazel and Emscripten are involved, and configuration options such as SIMD and XNNPACK are crucial for performance but poorly documented. MediaPipe solutions exposed through the TFJS API perform much better than native TFJS models, so I think developers need these build instructions.

Relevant links:

MediaPipe’s JS solutions, with no compiling instructions: https://github.com/google-ai-edge/mediapipe/blob/master/docs/getting_started/javascript.md

MediaPipe’s Face Mesh model, with no TFJS compiling option: https://github.com/google-ai-edge/mediapipe/blob/master/docs/solutions/face_mesh.md

TFJS Mediapipe Face Mesh: https://github.com/tensorflow/tfjs-models/tree/master/face-landmarks-detection/src/mediapipe

Desired libraries to compile: https://www.npmjs.com/package/@mediapipe/face_mesh

The most similar solution I have found: https://github.com/prantoran/mediapipe/tree/latest_wasm

Correct links

Yes but incomplete.

Parameters defined

The documentation should explain the build parameters needed to compile the MediaPipe models to WASM output targeting TensorFlow.js.

Returns defined

No response

Raises listed and defined

No response

Usage example

No response

Request visuals, if applicable

No response

Submit a pull request?

No

ginesmartinezros avatar Sep 06 '24 11:09 ginesmartinezros

I am trying to set up WASM and got the error below when building XNNPACK. I have no idea how to proceed:

ERROR: /root/.cache/bazel/_bazel_root/f39021be9f712a854cfac67637005ae8/external/XNNPACK/BUILD.bazel:804:36: Compiling external/XNNPACK/avxvnniint8_prod_microkernels.c failed: (Exit 1): gcc failed: error executing command (from target @XNNPACK//:avxvnniint8_prod_microkernels) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 102 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
gcc: error: unrecognized command-line option '-mavxvnniint8'; did you mean '-mavxvnni'?
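
One possible reading of the error above: the `-mavxvnniint8` option is only recognized by newer GCC releases (GCC 13 and later, as far as I know), so an older host `gcc` rejects it. The following is a minimal Python sketch, not a confirmed fix. It probes whether the local compiler accepts the flag and, if not, suggests a Bazel define that may disable those XNNPACK microkernels. The define name `xnn_enable_avxvnniint8` is an assumption based on XNNPACK's Bazel config settings and should be checked against the `BUILD.bazel` of the XNNPACK version being fetched; the helper names are hypothetical.

```python
import shutil
import subprocess


def compiler_accepts_flag(flag: str, compiler: str = "gcc") -> bool:
    """Return True if `compiler` can syntax-check an empty C file with `flag`."""
    if shutil.which(compiler) is None:
        return False  # compiler not installed or not on PATH
    result = subprocess.run(
        # "-x c -" reads an (empty) C translation unit from stdin.
        [compiler, flag, "-fsyntax-only", "-x", "c", "-"],
        input="",
        text=True,
        capture_output=True,
    )
    return result.returncode == 0


def extra_bazel_args(flag_supported: bool) -> list[str]:
    """Suggest extra Bazel arguments when the compiler rejects the flag."""
    if flag_supported:
        return []
    # ASSUMPTION: this define name must be verified against the XNNPACK
    # BUILD.bazel in use; it is not taken from the thread above.
    return ["--define=xnn_enable_avxvnniint8=false"]


if __name__ == "__main__":
    supported = compiler_accepts_flag("-mavxvnniint8")
    print("gcc accepts -mavxvnniint8:", supported)
    print("suggested extra bazel args:", extra_bazel_args(supported))
```

If the define is honored by the XNNPACK version in use, the suggested argument would be appended to the existing `bazel build` command line. Upgrading the host GCC is the other option.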

prantoran avatar Sep 25 '24 03:09 prantoran

Hi @prantoran,

Thanks for your response and for all your work on this topic. I have been closely following your threads and repositories.

I successfully compiled MediaPipe's face_mesh as WASM using your fork (link). However, this solution is packaged with face_detector and selfie_segmentation. Would it be possible to isolate the face_mesh module? Additionally, when I tested it, it ran at approximately 1 FPS. Could you provide some more information regarding the configuration of the model during your build?

Ideally, I was aiming for a solution similar to this one, which can be integrated directly into TensorFlow.js. Do you know of any workaround to achieve this?

Additionally, regarding the XNNPACK error you mentioned, could you provide details on the compilation process you're following?

Lastly, I came across this comment stating that no JavaScript documentation is expected for legacy MediaPipe Solutions, but there is for the new Tasks Vision API. I looked into it, and it seems possible to use this BUILD file. The model is available here, but I’m not sure how to build this packaged solution while introducing some changes. Do you have any insights?

Thanks again for your help!

ginesmartinezros avatar Sep 25 '24 15:09 ginesmartinezros

Any hints @kuaashish ? Thanks!

ginesmartinezros avatar Sep 25 '24 15:09 ginesmartinezros

@ginesmartinezros If you can build and run the branch then you can comment out in the cpp (probably this), build using make build from root dir and comment out the face mesh drawing in renderer.js.

But I can't build my own code at the moment 😅

prantoran avatar Sep 26 '24 05:09 prantoran

To set up the compilation, I was looking into how tfjs-backend-wasm was set up. I cleaned the cache using bazel clean --expunge, but that did not work.

prantoran avatar Sep 26 '24 05:09 prantoran

@ginesmartinezros If you can build and run the branch then you can comment out in the cpp (probably this), build using make build from root dir and comment out the face mesh drawing in renderer.js.

But I can't build my own code at the moment 😅

Thanks for the answer. So is this related to the 1 FPS? If so, the performance is not optimal; maybe enabling SIMD would help, or discarding face_detection and selfie_segmentation? main.cpp and renderer.js are definitely the right places to remove or add MediaPipe modules.

To compile your branch, I had to make some minor changes (on Linux): https://github.com/ginesmartinezros/mediapipe-wasm-faceLandmarkFullRange/tree/develop-linux

ginesmartinezros avatar Sep 26 '24 08:09 ginesmartinezros

To set up the compilation, I was looking into how tfjs-backend-wasm was set up. I cleaned the cache using bazel clean --expunge, but that did not work.

Yeah, I also found https://github.com/emscripten-core/emsdk/tree/master/bazel referenced in tfjs-backend-wasm; it's a nice guide. But I don't think either https://www.npmjs.com/package/@mediapipe/face_mesh or https://github.com/prantoran/mediapipe/tree/latest_wasm uses tfjs-backend-wasm directly. tfjs-backend-wasm is more like a runtime for TF.js models, whereas https://www.npmjs.com/package/@mediapipe/face_mesh already ships its own WASM binaries (with SIMD support but without multithreading support). Have you tried Tasks Vision? It apparently has better build support for JS and the web, even though I don't know how to modify its .task models.

ginesmartinezros avatar Sep 26 '24 09:09 ginesmartinezros