the_tuul
Run ffmpeg in browser
Use ffmpeg.wasm to create the video in the frontend rather than on the server. The endgame is to have the whole tool run in the browser so that server costs are 0. Music separation has to happen server-side currently, but we can at least move video creation out of the backend.
The ffmpeg.wasm project is amazing. The DX is pretty good, but the documentation doesn't say exactly what types of data it expects as input and returns as output, and with JS it's easy to get lost among Files, Blobs and Uint8Arrays. My first step is to get it to make a video using the original sound file. That mostly works, except that it doesn't display the lyrics because it can't find the Arial Narrow font they are supposed to be displayed in.
A 3-minute video takes about 3 minutes to generate on a 2017 Intel MacBook Pro. Not bad!
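To keep the Files, Blobs and Uint8Arrays straight, here is a minimal sketch of the conversions in both directions. The helper names `toUint8Array` and `toVideoBlob` are hypothetical, not part of ffmpeg.wasm. As far as I can tell, `fetchFile` from the 0.11 package covers similar ground for input, with the difference that it treats a string as a URL to fetch, whereas this sketch encodes a string as UTF-8 text.

```javascript
// Hypothetical helper (not part of ffmpeg.wasm): normalise common browser
// binary types to a Uint8Array that can be written with ffmpeg.FS("writeFile", ...).
async function toUint8Array(input) {
  if (input instanceof Uint8Array) return input;
  if (input instanceof ArrayBuffer) return new Uint8Array(input);
  // File extends Blob, so this branch covers both.
  if (typeof Blob !== "undefined" && input instanceof Blob) {
    return new Uint8Array(await input.arrayBuffer());
  }
  // Assumption: strings are literal text (e.g. subtitle contents), not URLs.
  if (typeof input === "string") return new TextEncoder().encode(input);
  throw new TypeError("Unsupported input type");
}

// Going the other way: wrap the Uint8Array read back from ffmpeg's
// filesystem in a Blob with an explicit MIME type.
function toVideoBlob(bytes, type = "video/mp4") {
  return new Blob([bytes], { type });
}
```

With these, the input side becomes `await toUint8Array(this.songFile)` and the output side `URL.createObjectURL(toVideoBlob(video))`.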
Here's the main code:
async createMpeg() {
  const songFileName = "stuff.mp3";
  this.isSubmitting = true;
  // createFFmpeg and fetchFile are imported from @ffmpeg/ffmpeg (0.11 API)
  const ffmpeg = createFFmpeg({ log: true });
  await ffmpeg.load();
  // Write audio to ffmpeg.wasm's filesystem
  await ffmpeg.FS(
    "writeFile",
    songFileName,
    new Uint8Array(await readFileAsync(this.songFile))
  );
  // Write the subtitle font to the filesystem
  await ffmpeg.FS(
    "writeFile",
    "Arial Narrow.ttf",
    await fetchFile("/static/ArialNarrow.ttf")
  );
  // Write the .ass subtitles to the filesystem
  await ffmpeg.FS("writeFile", "subtitles.ass", this.subtitles);
  await ffmpeg.run(
    // Input 0: a black 1280x720 background at 20 fps
    "-f",
    "lavfi",
    "-i",
    "color=c=black:s=1280x720:r=20",
    // Input 1: the song
    "-i",
    songFileName,
    // Add subtitles
    "-vf",
    "ass=subtitles.ass:fontsdir=./",
    // The background never ends, so stop when the audio does
    "-shortest",
    "-y",
    "karaoke.mp4"
  );
  // video is a Uint8Array
  const video = await ffmpeg.FS("readFile", "karaoke.mp4");
  // Offer the result as a download via a temporary, hidden link
  const anchor = document.createElement("a");
  const filename = this.zipFileName;
  anchor.style.display = "none";
  anchor.href = URL.createObjectURL(new Blob([video], { type: "video/mp4" }));
  anchor.download = filename;
  anchor.click();
},
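The long positional argument list is easy to get wrong, so one option is to build it in a small pure function that can be tested without loading ffmpeg at all. This is only a sketch: `buildKaraokeArgs` and its option names are hypothetical, and the defaults simply mirror the values used above.

```javascript
// Hypothetical helper: build the argument list passed to ffmpeg.run(...).
// Option names are illustrative; defaults mirror the values in this post.
function buildKaraokeArgs({
  audio,
  subtitles = "subtitles.ass",
  fontsDir = "./",
  size = "1280x720",
  fps = 20,
  output = "karaoke.mp4",
}) {
  return [
    // Input 0: a synthetic black background generated by lavfi
    "-f", "lavfi",
    "-i", `color=c=black:s=${size}:r=${fps}`,
    // Input 1: the song
    "-i", audio,
    // Burn in the .ass subtitles, looking for fonts in fontsDir
    "-vf", `ass=${subtitles}:fontsdir=${fontsDir}`,
    // The colour source is endless, so stop when the audio ends
    "-shortest",
    // Overwrite the output file if it already exists
    "-y",
    output,
  ];
}
```

Inside `createMpeg()` the call would then be `await ffmpeg.run(...buildKaraokeArgs({ audio: songFileName }))`.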
I tried updating to ffmpeg.wasm 0.12, but that was a disaster: it's a total rewrite that seems to be Vite-based and doesn't work with webpack. I'm open to switching to Vite, but it doesn't seem to work with CommonJS modules, which jsmediatags uses. After much faffing, I switched back to 0.11 because it does work.
But it breaks JavascriptSubtitlesOctopus. Basically, ffmpeg.wasm uses SharedArrayBuffer, which is not enabled unless response["Cross-Origin-Embedder-Policy"] = "require-corp" and response["Cross-Origin-Resource-Policy"] = "same-site". For some reason, setting those headers in index.html breaks the loading of the Web Worker that JavascriptSubtitlesOctopus requires. In the Chrome debug console it says (blocked:CoepFrameResourceNeedsCoepHeader) when loading subtitles-octopus-worker.js. As far as I can tell, this kind of blocking should only happen when requesting scripts from different origins, and since the worker.js script is being loaded from the same origin (localhost:8000) as the index.html file, I'm not sure what the problem is. But these headers are complicated so maybe my mental model is wrong. I've experimented with ways to bundle the web worker into the main js bundle, but Webpack 5 changed the way it bundles web workers and I'm not sure if it's possible to do what I need.
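One way to take some guesswork out of the header question is to ask the browser directly whether the page ended up cross-origin isolated. The sketch below assumes the standard `crossOriginIsolated` global and the fact that browsers hide `SharedArrayBuffer` from non-isolated pages; `sharedMemoryStatus` itself is a hypothetical helper. MDN describes the isolation requirement in terms of Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy, so it may be worth comparing that pair against the headers listed above.

```javascript
// Hypothetical helper: report whether the current context can use
// SharedArrayBuffer. `scope` defaults to globalThis (a window or a worker),
// and can be replaced with a plain object for testing.
function sharedMemoryStatus(scope = globalThis) {
  // Browsers hide SharedArrayBuffer when the page is not isolated.
  const hasSAB = typeof scope.SharedArrayBuffer === "function";
  // crossOriginIsolated is true only if the browser accepted the page's headers.
  const isolated = scope.crossOriginIsolated === true;
  return { hasSAB, isolated, ok: hasSAB && isolated };
}
```

Calling `sharedMemoryStatus()` before `createFFmpeg()` gives a clear failure message instead of an obscure error while ffmpeg is loading.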
I think I fixed that. First off, when DEBUG = True, Django serves static files, bypassing WhiteNoise completely, and bypassing the custom callback that WhiteNoise uses to add the right headers to the assets. That can be fixed. Second, when WhiteNoise is working, it reads from STATIC_ROOT, which is only populated when ./manage.py collectstatic is run, which I don't do in dev mode. At this point I will make a Makefile or something to ensure that collectstatic always runs before runserver. Is there a better way?
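This doesn't answer the collectstatic question, but since the root cause was static assets silently being served without the headers, a quick check from the browser console can confirm whether a given asset is getting them. It is a sketch only: `missingHeaders` and `checkAsset` are hypothetical helpers, and the expected set is just the two headers named earlier, which may not be the complete list.

```javascript
// Headers named earlier in this post; treat the exact set as an assumption.
const EXPECTED_HEADERS = {
  "cross-origin-embedder-policy": "require-corp",
  "cross-origin-resource-policy": "same-site",
};

// Return the names of expected headers that are missing or have a different
// value. `headers` is a fetch Headers object, whose get() is case-insensitive.
function missingHeaders(headers, expected = EXPECTED_HEADERS) {
  return Object.entries(expected)
    .filter(([name, value]) => headers.get(name) !== value)
    .map(([name]) => name);
}

// Hypothetical dev check: fetch an asset and log anything that is missing.
async function checkAsset(url) {
  const response = await fetch(url);
  const missing = missingHeaders(response.headers);
  console.log(url, missing.length ? `missing: ${missing.join(", ")}` : "ok");
  return missing;
}
```

For example, `checkAsset("/static/subtitles-octopus-worker.js")`, where the `/static/` path is a guess based on where the font is served from.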
Shipped in 0.9.0. Decent speed on my M1 MacBook Pro, but it's using the old version of ffmpeg.wasm. Updating to a newer version will allow us to experiment with threading. Still: pretty cool!