
Run Holistic in a Web Worker

AmitMY opened this issue 4 years ago · 67 comments

System information (Please provide as much relevant information as possible)

  • Have I written custom code (as opposed to using a stock example script provided in Mediapipe): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04, Android 11, iOS 14.4): Chrome
  • MediaPipe version: ^0.4.1628005088
  • Solution (e.g. FaceMesh, Pose, Holistic): Holistic
  • Programming Language and version ( e.g. C++, Python, Java): JavaScript

Describe the expected behavior: In order to run models on the web without the browser hanging (i.e. without clogging the main thread), it is recommended to use a web worker.

As with other image models that can be driven from a web worker, I am trying to get this to work with Holistic.


Problems:

1. Loading path is not respected

Despite setting up the model this way:

model = new Holistic({locateFile: (file) => `/assets/models/holistic/${file}`});

While the js files are coming from the correct path, the data/wasm files are not (see screenshot).

(if performed on the main thread, it performs as expected)

2. Passing an image as an OffscreenCanvas fails

It is not possible to pass an HTMLVideoElement or HTMLCanvasElement to a web worker, so instead I first draw the video frame onto a canvas and send its ImageData.

Then, in the web worker, I construct an OffscreenCanvas and send that to the model.
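The step above can be sketched roughly as follows. The canvas factory and model are injected here so the flow is visible outside a browser; in the actual worker, `createCanvas` would be `(w, h) => new OffscreenCanvas(w, h)` and `model` the Holistic instance (both names are illustrative, not part of the MediaPipe API):

```javascript
// Sketch of the worker-side step: rebuild the posted ImageData on an
// OffscreenCanvas and hand that canvas to the model.
function makeFrameHandler(createCanvas, model) {
  return (imageData) => {
    const canvas = createCanvas(imageData.width, imageData.height);
    canvas.getContext('2d').putImageData(imageData, 0, 0);
    return model.send({ image: canvas });
  };
}
```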

The resulting error is shown in the screenshot.

Referring to v.canvas in the holistic dist code: https://unpkg.com/@mediapipe/[email protected]/holistic.js

AmitMY avatar Sep 06 '21 10:09 AmitMY

Hey Amit, I don't see anything wrong right off the bat. But the initiator for the requests to the data and wasm files is different than the js files. locateFile actually takes two parameters. Can you do me a favor and check to see if it's just giving you weird signatures (weirder than what you've posted already, that is)?

    this.holisticSolution = new holistic.Holistic({
      locateFile: (path, base) => {
        console.log('path', path, 'base', base);
        return `${base}/${path}`;
      }
    });

And good job with the OffscreenCanvas! We should be supporting that properly in the near future, since a lot of us want to use worker threads.

mhays-google avatar Sep 07 '21 01:09 mhays-google

Just to chime in, reading video on the main thread and sending it over to the worker is definitely the right overall approach for now, although I had a few additional pointers:

  • Since you're using Chrome, you can get much better performance by avoiding ImageData, as ImageData forces our data to be transferred on the CPU. ImageBitmap should be used instead, so all resources stay on the GPU.

  • You shouldn't need to pass the image as an OffscreenCanvas-- ImageBitmap should suffice, since it's just being used to create a WebGLTexture anyways-- if you want to use ImageData instead, you can just cast it to another type, like HTMLVideoElement (although you might need to give it 'videoWidth' and 'videoHeight' properties, set equal to 'width' and 'height', respectively).

  • However, you're right that OffscreenCanvas is what's missing here-- the JS Solutions framework is currently hard-coded to create an HTMLCanvasElement for internal use, so a patch in MediaPipe code would be necessary to avoid the "canvas undefined" error you're seeing.
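The ImageData cast from the second bullet might look like this minimal sketch. The `videoWidth`/`videoHeight` property names come straight from the comment above; whether this satisfies every MediaPipe version is an assumption, so treat it as a workaround rather than a supported API:

```javascript
// Give an ImageData the videoWidth/videoHeight properties the Solutions code
// reads from an HTMLVideoElement, so it can be passed where a video is expected.
function asVideoLike(imageData) {
  imageData.videoWidth = imageData.width;
  imageData.videoHeight = imageData.height;
  return imageData;
}
```

Usage would then be something like `holistic.send({ image: asVideoLike(imageData) })`.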

tyrmullen avatar Sep 07 '21 18:09 tyrmullen

@mhays-google Here is the console output for path and base:

(screenshot)

@tyrmullen Thanks, I am now doing const bitmap = await createImageBitmap(video); and passing that to the worker. I could either draw it on an OffscreenCanvas or just pass it directly to Holistic, however, both are not currently supported because, like you said, it creates a canvas which is not available in the worker context.

AmitMY avatar Sep 09 '21 15:09 AmitMY

@AmitMY for the loading path, maybe try the URL constructor with origin as second argument?

model = new Holistic({
  locateFile: (file) =>
    new URL(`/assets/models/holistic/${file}`, globalThis.location.origin).toString()
});

tvanier avatar Sep 21 '21 10:09 tvanier

And +1 for a lower-level API (ImageBitmap) with support for web workers (I use selfie-segmentation). Where should I upvote? ;-)

tvanier avatar Sep 21 '21 11:09 tvanier

I found a way for web worker:

  1. Beautify holistic.js
  2. Add this code after holistic.js line 993: d.C = d.h.GL.currentContext.GLctx (after d.h.GL.makeContextCurrent(k);)
  3. Put your worker file in the same folder as holistic.js
  4. For every message you post to the worker, do not post a new message before you receive the result of the previous one.

This assumes a vanilla JS project.

face_mesh is fixed in this way too (d.D = d.h.GL.currentContext.GLctx; and line 1632)

Working vanilla js example: https://alexshafir.github.io/Sensoria/ Source: https://github.com/AlexShafir/Sensoria You need to rotate camera on the right in (+X & +Y) direction to see face mesh in 3D.
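The "do not post before you receive the result" rule from step 4 can be enforced with a small guard like this sketch (the `worker` object and frame names are illustrative):

```javascript
// Allow at most one message in flight to the Holistic worker: frames that
// arrive while a result is pending are dropped instead of queued.
function createFrameSender(worker) {
  let busy = false;
  worker.onmessage = () => { busy = false; }; // worker replied with results
  return (frame, transfer) => {
    if (busy) return false;      // drop this frame
    busy = true;
    worker.postMessage(frame, transfer);
    return true;
  };
}
```

Dropping frames (rather than queueing them) keeps the worker from falling further and further behind the live video.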

AlexShafir avatar Sep 26 '21 14:09 AlexShafir

Thanks for the workaround, @AlexShafir. However, @sgowroji, I still argue for a native solution in the package, with support for OffscreenCanvas or, better yet, ImageBitmap.

AmitMY avatar Sep 26 '21 19:09 AmitMY

@AmitMY In my demo I pass ImageBitmap directly to holistic.js index.js:

createImageBitmap(videoElement).then((bitmap) => {
        holisticWorker.postMessage(bitmap, [bitmap]) // transferable
    })

holistic_worker.js:

onmessage = (event) => {
    holistic.send({image: event.data})
}

AlexShafir avatar Sep 27 '21 04:09 AlexShafir

For me the bummer is that even after the workaround, holistic.js is ultra slow on a high-end laptop, so I switched to face_mesh.js, which performs a lot better.

AlexShafir avatar Sep 27 '21 04:09 AlexShafir

new URL(`/assets/models/holistic/${file}`, globalThis.location.origin).toString()

Thanks, @tvanier , this does not change the result, unfortunately. The path is still not correctly resolved

AmitMY avatar Sep 27 '21 08:09 AmitMY

Workers are on our roadmap, but I unfortunately don't have a window to provide this. ImageBitmap is already an output format where supported (i.e., Chrome), but if I hear you right, you would like an input format of ImageBitmap?


mhays-google avatar Sep 27 '21 16:09 mhays-google

ImageBitmap is already an input format as well, as demonstrated by @AlexShafir .

tyrmullen avatar Sep 27 '21 22:09 tyrmullen

@AlexShafir Your changes seem to be minimal. If it's not too much trouble, could you please contribute a fix to this repository?

AmitMY avatar Sep 29 '21 12:09 AmitMY

@AmitMY Source code for the JS Solutions is not released, see #1408 (basically, the wasm does not fit nicely into Bazel), so one cannot make a PR.

My solution is just a workaround. To make it complete, proper path handling needs to be added, at a minimum.

AlexShafir avatar Sep 29 '21 17:09 AlexShafir

(Can confirm, 0.4.1633559476 did not fix problems 1 and 2)

AmitMY avatar Oct 11 '21 06:10 AmitMY

As mhays-google@ mentioned, this is not an issue we have an ETA for at this time.

As an aside-- if the purpose of switching to a worker thread is to improve performance, that is probably not going to happen here. Most of the processing taking place is for the ML inference, and the majority of the remainder is for rendering. Both of these are going to be occurring GPU-side, and (at least on Chrome) there's a single shared GPU process for all tabs and threads, so you're unlikely to get any parallelization benefits off the bat. There are other reasons why workers can be useful, of course, but for pure performance, they're unlikely to help here.

Note that this also will make it tricky to use workers to unblock the main thread currently, since any CPU/GPU sync (any time we "wait for GPU operations to finish", like a gl.readPixels call for example) will result in us waiting for the worker thread's GPU work to finish as well.

TL;DR: For pure speed GPU > CPU, but for parallelization CPU > GPU. Workers can occasionally help, but won't be easy/straightforward given how GPU-reliant we are by default.

tyrmullen avatar Oct 11 '21 19:10 tyrmullen

Thanks @tyrmullen. I'll just note that the largest, most apparent, and most annoying main-thread block in my case is loading the Holistic model, not inference. While it is loading, nothing else in my app works, which is unfortunate because it happens exactly when people want to interact with the app. That's my main use case here; the inference block is secondary.

AmitMY avatar Oct 12 '21 05:10 AmitMY

No problem! And yes, that makes sense, and workers could probably help for that-- I'd be curious to see just how helpful they can be on this front (they should allow us to make those "wait for GPU" ops be non-blocking, but would still impact the GPU queue).

One thing to note is that loading the model data is likely taking very little time itself (or it'd show as CPU processing time), but rather the initialization of the ML inference engine is quite expensive. I suspect the concentration of loading time you're seeing will be spread between two spikes: (a) one occurring during the MediaPipe graph initialization and (b) one occurring during the first frame processed (the first 'send' call), probably right afterwards. That's because shader program instantiation is likely the main cost here, and WebGL lazily enforces shader initialization, so while a lot of work is done up front, the process doesn't complete until the first frame is rendered. If your performance profiling matches this pattern, then this is almost certainly the culprit.

tyrmullen avatar Oct 12 '21 19:10 tyrmullen

Can confirm this is still an issue in version 0.5.1635989137

AmitMY avatar Dec 07 '21 08:12 AmitMY

My personal feeling is that pose initialize() does not complete the real initialization. One trick is to send a non-empty video element in advance, before playing the video. However, a moment of stutter cannot be avoided in any way (including with a worker).

I hope there is a way to reduce the impact of the real initialization on the window. I mean the blocking of interface rendering rather than the time cost. Even if it took more time, it would be worth it, because there are ways to cover up extra latency, but no way to work around blocked rendering.

A web worker is not the key.
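The warm-up trick mentioned above can be sketched like this. The canvas factory is injected so the same code works on the main thread (`document.createElement('canvas')`) or in a worker (`OffscreenCanvas`); `holistic` stands for an already-constructed instance, and all names are illustrative:

```javascript
// Push one tiny dummy frame through the model while a loading screen is up,
// so the expensive lazy initialization happens before real interaction.
async function warmUp(holistic, makeCanvas) {
  const canvas = makeCanvas(2, 2); // e.g. (w, h) => new OffscreenCanvas(w, h)
  await holistic.send({ image: canvas }); // first send() pays the one-time cost
}
```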

fgsreally avatar Jan 11 '22 15:01 fgsreally

Any update on this from the maintainers? If the mediapipe JS code will be open source, I'm sure someone (perhaps even I) could solve it.

AmitMY avatar Apr 20 '22 13:04 AmitMY

I second @AmitMY. It would be great if the mediapipe JS could run in a worker, with send() resolving with the actual results. It would fit nicely with the insertable streams API!

// worker.js
const holistic = new Holistic(/* ... */)
const canvas = new OffscreenCanvas(/* ... */)

const transformer = new TransformStream({
    async transform(videoFrame, controller) {
        const results = await holistic.send({ image: videoFrame })
        videoFrame.close() // release the camera frame
        // draw `results` onto the canvas, then emit it as a new frame
        const newFrame = new VideoFrame(canvas)
        controller.enqueue(newFrame)
    }
})

self.onmessage = (message) => {
    // main thread transfers a MediaStreamTrackProcessor.readable and MediaStreamTrackGenerator.writable
    const { readable, writable } = message.data
    readable.pipeThrough(transformer).pipeTo(writable)
}

tvanier avatar Apr 20 '22 17:04 tvanier

I created a webworker demo using the mediapipe models (face, hand, pose). https://d3iwgbxa9wipu8.cloudfront.net/P01_wokers/t18_mediapipe-mix/index.html

(Note: some parts of the process are guesswork.) The repo is linked via the icon at the upper right.

w-okada avatar May 15 '22 12:05 w-okada

@w-okada Seems like you are using @tensorflow-models and not mediapipe directly. Please correct me if I am mistaken. While that can work, this issue is specifically about running mediapipe models.

AmitMY avatar May 16 '22 09:05 AmitMY

@AmitMY I use the model listed at the page below. https://google.github.io/mediapipe/solutions/models.html

w-okada avatar May 16 '22 12:05 w-okada

Has anyone found a workaround for the path-resolution on web workers?

KTRosenberg avatar Jun 16 '22 20:06 KTRosenberg

@tyrmullen Running mediapipe in a web worker may not do much on high-end devices, but it does make a difference on lower-end ones. Say your device can only run mediapipe below 20fps: if you run everything on the main thread, your graphics rendering will also be stuck at that frame rate (the cause can be CPU overhead from the mediapipe models, or the way GPU resources are shared on the main thread). However, if you put mediapipe in a web worker, you can keep the main thread rendering at 30fps or higher with very little impact on the inference speed of the mediapipe models.

I have a web app that loads mediapipe in web workers. I ran a stress test on a mediocre smartphone, and although the inference speed is only around 5fps, the 3D graphics on the main thread stay at 30fps.

https://twitter.com/butz_yung/status/1519365509128417280/

ButzYung avatar Jun 24 '22 09:06 ButzYung

@ButzYung how did you manage to run it in a worker? Any trick we could all use?

AmitMY avatar Jun 24 '22 12:06 AmitMY

@AmitMY I took a similar approach as @AlexShafir did before, by editing the beautified version of holistic.js (BTW I tried his method on 0.5.1635989137 some hours ago but it didn't seem to work anymore). Our edits are in basically the same location.

            case 6:
                e.h = p.h;
// hack for worker
/*
                a.l = new OffscreenCanvas(1,1);
                a.h.canvas = a.l;
                g = a.h.GL.createContext(a.l, {
                    antialias: !1,
                    alpha: !1,
                    aa: "undefined" !== typeof WebGL2RenderingContext ? 2 : 1
                });
                a.h.GL.makeContextCurrent(g);
                p.g = 4;
                break;
*/
            case 7:
// hack for worker
                a.l = ((typeof Document !== 'undefined') && document.createElement("canvas")) || new OffscreenCanvas(1,1);

My approach is to basically comment out the whole case 6 and handle both the worker and non-worker paths in case 7. And on its first line, I edit it to use OffscreenCanvas whenever a web worker is used.

As for the path issue, put the worker file in the same folder as holistic.js, just like AlexShafir's version does.

ButzYung avatar Jun 24 '22 14:06 ButzYung

@ButzYung That's what I meant by saying workers wouldn't help pure performance of the ML inference-- that the 5fps framerate would not increase (and might potentially decrease, depending). Workers can definitely help unblock the main thread, for fast and slow devices alike, but when it's GPU it's usually less about overhead and more about synchronization and waiting; Holistic has to wait on CPU for GPU to finish for every frame, while most pure rendering doesn't (and shouldn't). So that's why I said running on workers could definitely help overall app behavior, but not the inference framerate (or initialization/load time) [all this is assuming inference is running on GPU-- on CPU you get actual gains, often 2-3x factor, since the threads are relatively independent].

However, all that being said, I'm a bit surprised/impressed that you see so little impact to your main rendering thread given that I'd imagine the GPU should be quite fully utilized by the ML inference (if that part can only run at 5fps-- which is also quite low for GPU ML, even on moderate devices); that makes me suspect the inference is running on CPU and not GPU. Were you testing on an iPhone by any chance? (we force CPU inference there). I guess one way to check too would be to look at the Chrome Developer console performance traces and see if the worker looks like it's using a lot of WebGL calls at the end of the call stacks?

tyrmullen avatar Jun 24 '22 19:06 tyrmullen