Asynchronous stdin

Open qgustavor opened this issue 7 years ago • 32 comments

With Web Streams, videos could be downloaded and then converted to another format without using too much memory. I don't know if Emscripten supports that, since it seems to block its thread, but if it's possible it would be interesting.

qgustavor avatar Aug 02 '16 20:08 qgustavor

That's an interesting question. There were discussions regarding live streams in #11 and #12, but using stdin for that hasn't been considered.

Theoretically it might work. The post-worker.js code needs to be changed so that it can accept stdin events from the main process. Then you start a new ffmpeg command, which should look like ffmpeg -f <input container> -i - -c <codec> -f <output container> -, pass the input data and read the processed data back with stdout events. There may be issues with binary data handling or some other unforeseen pitfalls, but at first glance it looks doable.
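
As a rough illustration of the command shape described above, the argument list for a stdin-to-stdout run could be assembled like this. buildPipeArgs is a hypothetical helper, not part of ffmpeg.js, and the container and codec names in the usage line are just example values.

```javascript
// Hypothetical helper: builds the argument list for
//   ffmpeg -f <input container> -i - -c <codec> -f <output container> -
// where "-" tells ffmpeg to read from stdin / write to stdout.
function buildPipeArgs(inputContainer, codec, outputContainer) {
  return [
    "-f", inputContainer,  // force the input format (stdin has no file extension)
    "-i", "-",             // read input from stdin
    "-c", codec,           // codec to use for the output
    "-f", outputContainer, // force the output format (stdout has no extension either)
    "-",                   // write output to stdout
  ];
}

const args = buildPipeArgs("webm", "copy", "matroska");
console.log(args.join(" ")); // -f webm -i - -c copy -f matroska -
```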

Do you want to try to implement this?

Kagami avatar Aug 02 '16 21:08 Kagami

Related: https://github.com/kripken/emscripten/issues/4124 https://github.com/kripken/emscripten/issues/23#issuecomment-1399857

Seems like Module.stdin is synchronous and we can't wait for an onmessage call. Maybe if it's possible to periodically interrupt Emscripten and give the worker's code a chance to process events, then we can collect stdin data in some accumulator and synchronously send it to ffmpeg in the Module.stdin handler later. Better to ask in Emscripten's bug tracker.
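
The accumulator part of that idea could look roughly like the sketch below. It assumes the stdin callback is asked for one byte per call and returns null when nothing is available; StdinAccumulator and its method names are made up for illustration. It does not solve the blocking problem by itself: something still has to call push while ffmpeg is running.

```javascript
// Hypothetical accumulator: chunks are appended as they arrive, and a
// synchronous stdin handler drains them one byte at a time.
class StdinAccumulator {
  constructor() {
    this.chunks = []; // queued Uint8Array chunks
    this.offset = 0;  // read position inside chunks[0]
  }

  // Called whenever new input arrives (e.g. from an onmessage handler).
  push(chunk) {
    if (chunk.length > 0) this.chunks.push(chunk);
  }

  // Synchronous read: returns the next byte, or null when nothing is queued.
  readByte() {
    if (this.chunks.length === 0) return null;
    const head = this.chunks[0];
    const byte = head[this.offset++];
    if (this.offset === head.length) {
      this.chunks.shift(); // head fully consumed, move on to the next chunk
      this.offset = 0;
    }
    return byte;
  }
}

const acc = new StdinAccumulator();
acc.push(new Uint8Array([1, 2]));
acc.push(new Uint8Array([3]));
// Assumed wiring: this function would be passed as the synchronous stdin callback.
const stdin = () => acc.readByte();
console.log(stdin(), stdin(), stdin(), stdin()); // 1 2 3 null
```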

Kagami avatar Aug 02 '16 21:08 Kagami

Well... some months ago I tried it, but based on the videoconverter.js code: I managed to change the command. As it's synchronous, exactly this problem happened: it doesn't receive message calls until the ffmpeg process stops.

I tried to find some way to make ffmpeg async, like Asyncify and Emterpreter, without success. After some time I gave up. As you're working actively on this I thought you already knew a way to fix this problem. Seems it's not that simple. :confused:

qgustavor avatar Aug 02 '16 21:08 qgustavor

What exactly did you try with Asyncify and Emterpreter? Maybe if you add emscripten_sleep to ffmpeg's stdin handler (you'd need to patch it for that), then the worker process will have a chance to receive messages from the main process. I haven't tried that and don't know much about Emscripten internals, these are just some wild guesses.

Kagami avatar Aug 02 '16 21:08 Kagami

I wanted to do this, but in fact I didn't even manage to get Emscripten to compile videoconverter.js: I use Windows, so shell scripts don't work. I tried to set it up on an Ubuntu machine but that also failed. Then I tried to set it up using the ffmpeg source directly, so I could use it on Windows.

After that, IIRC, I abandoned the idea, mostly because I was trying to implement it in Direct MEGA and after some problems I stopped working on this project.

qgustavor avatar Aug 02 '16 21:08 qgustavor

I can only suggest that you try building ffmpeg.js in an Ubuntu VM. You need to have emsdk installed (see here) and some basic dev packages. Then clone this project, check out all submodules and run the make ffmpeg-worker-mp4.js command (this target should be the fastest to build).

Kagami avatar Aug 02 '16 22:08 Kagami

Understood. I will set up a VM and try it again.

qgustavor avatar Aug 03 '16 09:08 qgustavor

I tried one of those lightweight Ubuntu distros, and it failed. I tried again with Ubuntu Server 16.04.1 i386 and found that I was missing some steps and doing others wrong.

But it still failed. This was the last error, for which I couldn't find a solution: https://i.imgur.com/JZLWvWP.png I only found bug reports for this problem. Judging from some of those, the problem may be the distro, so ~~I will try to run the amd64 version~~ Edit: my computer doesn't support virtualization, so I can't run it. If you know the solution it will help a lot.

qgustavor avatar Sep 06 '16 19:09 qgustavor

I haven't encountered issues with the lame build. I'm using a 64-bit distro and compilers though, so yes, that might be the cause. You can also try without lame, it's not required for your tests: delete the --enable-libmp3lame and build/lame/dist/lib/libmp3lame.so \ lines and try again.

Kagami avatar Sep 06 '16 19:09 Kagami

I tried without lame and in a 64bit distro (Cloud9). What I'm doing:

  • cloning the repository
  • git submodule init && git submodule
  • sudo apt-get install emscripten
  • installing autoconf, automake and libtool.
  • make ffmpeg-worker-mp4.js

Is some step wrong?

qgustavor avatar Sep 07 '16 13:09 qgustavor

@qgustavor install pkg-config as well: sudo apt-get install pkg-config

bbf avatar Sep 30 '16 05:09 bbf

I retried in a fresh Cloud9 Ubuntu x64 instance and it still failed. Here are the log files for make ffmpeg-worker-mp4.js and make ffmpeg-worker-webm.js.

qgustavor avatar Oct 11 '16 15:10 qgustavor

You're still building lame, see my previous comment on how to disable it. Also, I realized what's wrong with lame. Its configure script checks for xmmintrin.h (SSE intrinsics); on your machine the check returns true while for me it's false, so on my side the build system doesn't try to compile the vectorized code, which apparently has some issues on Emscripten. I think I'll need to patch lame's configure to fix that (or maybe find some other hack with configure/Emscripten options).

Kagami avatar Oct 11 '16 16:10 Kagami

I made a custom build of ffmpeg.js with support for only rawvideo input, the pipe protocol, mp4 and null output, and the H.264 encoder.

It works, but it looks like there's no way to send more than 1 byte on stdin at a time, so to read an 800x600 RGB frame, there are 1440000 calls to my stdin function, even though emscripten knows that ffmpeg wants 1440000 bytes. I profiled it and rendering the frame takes 13ms but the read syscall takes 28ms because of the iterative function call used to copy the buffer one byte at a time.
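
To make the numbers concrete: an 800x600 RGB frame is 800 × 600 × 3 = 1,440,000 bytes. The sketch below is not Emscripten's actual code; it only contrasts a per-byte callback loop with a single bulk copy via TypedArray.prototype.set, using made-up function names.

```javascript
const WIDTH = 800, HEIGHT = 600, BYTES_PER_PIXEL = 3; // RGB
const frameSize = WIDTH * HEIGHT * BYTES_PER_PIXEL;   // 1440000

// Per-byte style: one callback invocation for every byte requested.
function readPerByte(nextByte, dest) {
  let calls = 0;
  for (let i = 0; i < dest.length; i++) {
    dest[i] = nextByte();
    calls++;
  }
  return calls;
}

// Bulk style: a single call copies the whole frame.
function readBulk(source, dest) {
  dest.set(source.subarray(0, dest.length));
  return 1;
}

const frame = new Uint8Array(frameSize).fill(7);
let pos = 0;
const perByteCalls = readPerByte(() => frame[pos++], new Uint8Array(frameSize));
const bulkCalls = readBulk(frame, new Uint8Array(frameSize));
console.log(frameSize, perByteCalls, bulkCalls); // 1440000 1440000 1
```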

Try it at https://benlubar.github.io/cmv2mp4/ - worker.js is not minified, so it should be pretty easy to read.

Example input file: https://benlubar.github.io/cmvjs/ai_trade.cmv

BenLubar avatar Feb 16 '18 17:02 BenLubar

You could try throwing this in your ffmpeg command, e.g.:

    type: "run",
    mounts: [
      {
        type: "WORKERFS",
        opts: {
          blobs: [
            { name: "input.webm", data: blob },
            { name: `sound${AUDIO_EXP}`, data: sound },
          ],
        },
        mountpoint: "/data",
      },
    ],
    TOTAL_MEMORY: 536870912, // 512 MiB !!!!!

samelie avatar May 30 '18 01:05 samelie

I have successfully implemented feeding frames via stdin. The main problem is that ffmpeg blocks the execution, so a worker can't process incoming messages.

You have to use either ASYNCIFY (didn't compile in my case at all) or EMTERPRETIFY_ASYNC to interrupt the ffmpeg process. You need to emterpretify the whole stack from main() to the reading function to be able to interrupt the ffmpeg process.

You can take a look at the result in my fork. https://github.com/Kukunin/ffmpeg.js/commit/acca0c151807d88bddd73e885a183750bd793246#diff-b67911656ef5d18c4ae36cb6741b7965R347

Also, I implemented an emscripten_stdin_async(buf, size) function, which interrupts ffmpeg until there is input, so it becomes asynchronous.

I plan to prepare a PR to the upstream, but don't know when.

Kukunin avatar Sep 21 '18 08:09 Kukunin

@Kukunin Do you have any client-side (javascript) code examples? I am trying to figure out how to handle IO (sending arguments, retrieving output) with your fork.

jfizz avatar Sep 27 '18 21:09 jfizz

Take a look at https://github.com/Kukunin/ffmpeg.js/blob/master/build/library.js. As you can see, I use Module['stdinAsync'] and Module['stdoutBinary'] there. At https://github.com/Kukunin/ffmpeg.js/blob/012368c685ff30e8dc63278d40380bf3fe9e5aad/build/pre.js#L31 you can see that ffmpeg assigns almost every option to Module. So you just pass the stdinAsync and stdoutBinary functions to the ffmpeg object.

Kukunin avatar Oct 21 '18 19:10 Kukunin

As an example of how to use it, take a look at this code:

      const opts = {}; // other options here
      // Called whenever ffmpeg wants up to `size` bytes from stdin.
      opts['stdinAsync'] = function(size, callback) {
        getMyInputSomehow().then((data) => {
          callback(data.subarray(0, size)); // ensure you pass not more than size
        });
      };
      // Called with the bytes ffmpeg writes to stdout.
      opts["stdoutBinary"] = function(data) {
        const frame = Uint8Array.from(data); // copy, so the buffer can be transferred
        self.postMessage({"type": "frame", "data": frame}, [frame.buffer]);
      };
      ffmpeg(opts);
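
One thing the example glosses over is that data.subarray(0, size) drops any bytes beyond size. A possible way to fill in getMyInputSomehow is a small queue that keeps the remainder for the next call. ChunkQueue is a hypothetical helper, only the stdinAsync(size, callback) shape comes from the fork, and end-of-input handling is left out.

```javascript
// Hypothetical queue bridging push-style input (chunks arriving at any time)
// and pull-style reads (stdinAsync asking for up to `size` bytes).
class ChunkQueue {
  constructor() {
    this.chunks = [];    // Uint8Array chunks not yet consumed
    this.pending = null; // { size, callback } of a read waiting for data
  }

  // Called when new input arrives.
  push(chunk) {
    if (chunk.length === 0) return;
    this.chunks.push(chunk);
    this.flush();
  }

  // Matches the stdinAsync(size, callback) shape; one read at a time.
  read(size, callback) {
    this.pending = { size, callback };
    this.flush();
  }

  // Serves the waiting read, if any, and keeps the unread remainder queued.
  flush() {
    if (!this.pending || this.chunks.length === 0) return;
    const { size, callback } = this.pending;
    this.pending = null;
    const head = this.chunks[0];
    if (head.length > size) {
      this.chunks[0] = head.subarray(size); // remainder for the next read
    } else {
      this.chunks.shift();
    }
    callback(head.subarray(0, size)); // never more than `size` bytes
  }
}

const queue = new ChunkQueue();
const opts = {};
opts["stdinAsync"] = (size, callback) => queue.read(size, callback);

const received = [];
opts["stdinAsync"](2, (data) => received.push(Array.from(data))); // waits for data
queue.push(new Uint8Array([1, 2, 3]));                            // serves [1, 2]
opts["stdinAsync"](2, (data) => received.push(Array.from(data))); // serves [3]
console.log(JSON.stringify(received)); // [[1,2],[3]]
```

In a worker, push would typically be called from the onmessage handler each time the main thread posts a new chunk.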

Kukunin avatar Dec 19 '18 16:12 Kukunin

@Kukunin Thanks for the fork, I was able to build it. Is it possible to use your fork in an ffmpeg-worker.js scenario (would like to use workerfs for the input file and stdout for the converted file)?

Any quick snippet for that?

kishorenc avatar Jan 19 '19 00:01 kishorenc

My fork doesn't break the current functionality (as far as I know), so you use the same configuration as you would with upstream. Configure stdinAsync or stdoutBinary according to your needs.

Kukunin avatar Jan 19 '19 11:01 Kukunin

One note about my fork: it calls the stdoutBinary callback only for write syscalls on FD 1 (stdout). If C code calls printf or other functions, they will be processed as in the original. You can override the print and printErr callbacks to catch the standard output.
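
A sketch of what that could look like, assuming print and printErr are passed through the options object like the other callbacks and are invoked once per line of text. fakeRun merely stands in for the real ffmpeg(opts) call so the snippet runs on its own, and the strings it emits are placeholders.

```javascript
const stdoutLines = [];
const stderrLines = [];

const opts = {
  // Text output goes through print/printErr.
  print: (line) => stdoutLines.push(line),
  printErr: (line) => stderrLines.push(line),
  // Raw write() calls on FD 1 go through stdoutBinary instead.
  stdoutBinary: (data) => { /* handle binary output here */ },
};

// Stand-in for ffmpeg(opts): simulates the module emitting text output.
function fakeRun(o) {
  o.print("a line printed to stdout");
  o.printErr("a line printed to stderr");
}

fakeRun(opts);
console.log(stdoutLines, stderrLines);
```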

Not sure if this behavior is a bug or a feature =)

Kukunin avatar Jan 21 '19 03:01 Kukunin

Look at the diff in his branch, it should show you

mgosbee-zynga avatar Dec 17 '19 20:12 mgosbee-zynga

Thanks @Kukunin, cool stuff! I'm not sure if I want to integrate this into ffmpeg.js right now... I will experiment with it later.

Kagami avatar Apr 17 '20 23:04 Kagami

Asynchronous stdin support by @PaulKinlan using SharedArrayBuffer: https://github.com/PaulKinlan/ffmpeg.js/pull/1/files

Kagami avatar Apr 19 '20 18:04 Kagami

@Kukunin, is it possible to use both stdoutBinary and stdinAsync at the same time?

I'm finding that it only lets me use one or the other, and wonder if that might be due to Module['HEAPU8'] ?

nanook21 avatar Jul 26 '20 06:07 nanook21

They are independent, so both work at the same time. From your message, it's not clear to me how exactly it lets you use only one of them. Is there any error?

Kukunin avatar Jul 26 '20 10:07 Kukunin

@Kukunin, oh that's interesting. I wonder if I'm doing something else wrong then? I think the only difference between my setup and yours is that I compiled using ASYNCIFY=1 and am using Asyncify.handleSleep, because EMTERPRETIFY_ASYNC wouldn't compile for me.

Is there anything obvious that I'm doing wrong here?

Makefile:

EMCC_COMMON_ARGS = \
        -O3 \
        --closure 1 \
        --memory-init-file 0 \
        -s WASM=1 \
        -s ASYNCIFY=1 \
        -s 'ASYNCIFY_IMPORTS=["emscripten_binary_read", "emscripten_binary_write"]' \
        -s WASM_ASYNC_COMPILATION=0 \
        -s ASSERTIONS=0 \
        -s EXIT_RUNTIME=1 \
        -s NODEJS_CATCH_EXIT=0 \
        -s NODEJS_CATCH_REJECTION=0 \
        -s TOTAL_MEMORY=67108864 \
        -lnodefs.js -lworkerfs.js \
        --pre-js $(PRE_JS) \
        --js-library $(LIBRARY_JS) \
        -o $@

library.js:

mergeInto(LibraryManager.library, {
  emscripten_binary_read: function(buf, size) {
    console.log('step 1');
    return Asyncify.handleSleep(function(wakeUp) {
      console.log('step 2');
      Module['stdinAsync'](size, function(data) {
        console.log('step 3');
        var finalSize = Math.min(size, data.length);
        Module['HEAPU8'].set(data.subarray(0, finalSize), buf);
        wakeUp(finalSize);
      });
    });
  },

  emscripten_binary_write: function(buf, size) {
    console.log('stdout binary call');
    Module['stdoutBinary'](Module['HEAPU8'].subarray(buf, buf + size));
    return size;
  }
});

post-worker.js:

      opts['stdinAsync'] = function(size, callback) {
        console.log('worker 1');
        var xhr = new XMLHttpRequest();
        xhr.open("GET", 'https://server/file.mp4');
        xhr.responseType = "arraybuffer";

        xhr.onload = function() {
          console.log('worker 2');
          if (this.status === 200) {
            console.log('worker 3');
            var data = new Uint8Array(xhr.response);
            console.log('size:');
            console.log(size);
            console.log('worker 4');
            console.log(data.subarray(0, size));
            callback(data.subarray(0, size)) // ensure you pass not more than size
          }
        };
        xhr.send();
      };
      opts["stdoutBinary"] = function(forStdout) {
        var frame = Uint8Array.from(forStdout);
        self.postMessage({"type": "stdoutBinary", "data": frame}, [frame.buffer]);
      };

I'm finding that the console.log in emscripten_binary_write only fires 3 times in quick succession at the beginning and then stops. But the emscripten_binary_read logs continue to be written.
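
One thing that stands out in the stdinAsync above: every call downloads the whole file again and hands back bytes starting at offset 0, so ffmpeg keeps receiving the beginning of the file. Whether that explains the stalled output is only a guess, but a variant that loads the data once and advances a read offset might look like this. makeStdinAsync and loadOnce are hypothetical names, and loadOnce stands in for the XHR request.

```javascript
// Hypothetical factory: loads the input once, then serves successive slices.
// `loadOnce(cb)` must call cb with a Uint8Array holding the whole input.
function makeStdinAsync(loadOnce) {
  let data = null; // cached input, filled by the first call
  let offset = 0;  // how many bytes have been handed out so far

  function serve(size, callback) {
    const end = Math.min(offset + size, data.length);
    const slice = data.subarray(offset, end); // empty once the input is exhausted
    offset = end;
    callback(slice);
  }

  return function stdinAsync(size, callback) {
    if (data) return serve(size, callback);
    loadOnce(function (loaded) {
      data = loaded;
      serve(size, callback);
    });
  };
}

// Stand-in loader so the sketch runs without a server.
let loads = 0;
const stdinAsync = makeStdinAsync(function (cb) {
  loads++;
  cb(new Uint8Array([10, 20, 30, 40, 50]));
});

const seen = [];
for (let i = 0; i < 3; i++) {
  stdinAsync(2, (slice) => seen.push(Array.from(slice)));
}
console.log(JSON.stringify(seen), loads); // [[10,20],[30,40],[50]] 1
```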

nanook21 avatar Jul 26 '20 16:07 nanook21

@Kukunin would it be possible to update the readme/makefiles of your fork so that it builds now? I'm having problem after problem trying to build it, while the current ffmpeg.js and @PaulKinlan's fork build fine.

Banou26 avatar Aug 29 '20 23:08 Banou26

@Kukunin @nanook21 @Kagami Hello people, I understand nothing about this issue. I need to know how to feed ffmpeg with webm chunks from MediaRecorder blobs. I do not know which asm output I should use. Can you explain it with some basic code if possible? I have MediaRecorder chunks; please show me how to feed ffmpeg with them. I need stdout chunks, like with the Node.js spawn module.

    mediaRecorder.ondataavailable = function(e) {
      if (e.data && e.data.size > 0) {
        e.data.arrayBuffer().then(buffer => {
          const chunk = new Uint8Array(buffer);
          ffmpeg("show me how to feed with chunk and get stdout data");
        });
      }
    };

    mediaRecorder.start(1000); // I get chunks every 1 sec

civilianatpoint avatar Jan 06 '22 12:01 civilianatpoint