docs: change post processing example

Open marcusstenbeck opened this issue 3 years ago • 5 comments

It now displays both how to match getByteFrequencyData() and getFloatFrequencyData()

marcusstenbeck avatar Nov 07 '22 09:11 marcusstenbeck

Seems like a good change. Can we briefly explain when one should use one over the other? Is this for visualizing voice vs. music?

JonnyBurger avatar Nov 07 '22 09:11 JonnyBurger

Both are for making frequency visualizations (usually "bars").

This will just make Remotion compatible with all the WebAudio tutorials on the internet (most use getByteFrequencyData). I'm guessing the person who asked about getFloatFrequencyData was following a tutorial that used that function.
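For reference, the two conversions can be sketched as below. This is a minimal sketch, not the code from the PR: it assumes the input is an array of linear magnitudes between 0 and 1, and the function names `toFloatFrequencyData` and `toByteFrequencyData` are hypothetical. The decibel formula and the default `minDecibels` / `maxDecibels` of -100 and -30 are taken from the Web Audio `AnalyserNode` definition.

```typescript
// AnalyserNode defaults for minDecibels and maxDecibels.
const MIN_DECIBELS = -100;
const MAX_DECIBELS = -30;

// Hypothetical helper: linear magnitudes -> decibels,
// mirroring what getFloatFrequencyData() reports.
// A magnitude of 0 becomes -Infinity.
function toFloatFrequencyData(magnitudes: number[]): Float32Array {
  return Float32Array.from(magnitudes, (m) => 20 * Math.log10(m));
}

// Hypothetical helper: linear magnitudes -> integers in 0..255,
// mirroring what getByteFrequencyData() reports.
function toByteFrequencyData(
  magnitudes: number[],
  minDb: number = MIN_DECIBELS,
  maxDb: number = MAX_DECIBELS
): Uint8Array {
  const out = new Uint8Array(magnitudes.length);
  magnitudes.forEach((m, i) => {
    const db = 20 * Math.log10(m);
    // Map [minDb, maxDb] linearly onto [0, 255], then clip.
    const scaled = (255 / (maxDb - minDb)) * (db - minDb);
    out[i] = Math.max(0, Math.min(255, Math.floor(scaled)));
  });
  return out;
}
```

Anything quieter than `minDb` clips to 0 and anything louder than `maxDb` clips to 255, which is why the byte variant tends to produce ready-to-draw bar heights while the float variant leaves the scaling to the caller.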

The missing APIs are getFloatTimeDomainData and getByteTimeDomainData. Both should be possible to pull out of the existing FFT code, since the time-domain data is a required prerequisite for calculating the FFT.

[Image: excerpt from the W3C Web Audio spec]

marcusstenbeck avatar Nov 07 '22 10:11 marcusstenbeck

Before, we returned one array that the example claims makes the visualization nicer than the default. Now we are declaring two arrays, not returning anything, and not adding any context. I think this is confusing. Which one do we recommend?

Would it make sense to make this an additional snippet instead?

JonnyBurger avatar Nov 08 '22 16:11 JonnyBurger

Hmm, if the purpose of the page is to just provide a copy-paste snippet then we should remove the W3C-related info and move it to another page.

The PR was more about making visualizeAudio() compatible with the large number of WebAudio AnalyserNode tutorials that are out there. But that information could definitely live somewhere else, as long as it's searchable.

If that sounds good, I'll edit the PR with a suggested change.


A note on "good looking" frequency visualizations.

A pleasing frequency visualization does two main things:

  1. Present audio level on a logarithmic scale, often decibels.
  2. Present frequencies on an exponentially increasing scale or on a musical scale (which really is kind of the same).
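The second point can be sketched as grouping the linear FFT bins into exponentially spaced bands. This is only an illustration under stated assumptions: `toLogBands` is a hypothetical helper, the input is assumed to hold `fftSize / 2` linearly spaced bins from 0 Hz up to half the sample rate, and taking the peak of each band is one of several reasonable choices (averaging is another).

```typescript
// Hypothetical helper: collapse linearly spaced FFT bins into
// `bandCount` bands whose edges grow exponentially from minHz to maxHz.
function toLogBands(
  bins: number[],
  sampleRate: number,
  bandCount: number,
  minHz: number = 20,
  maxHz: number = sampleRate / 2
): number[] {
  // Assumes bins.length === fftSize / 2.
  const binWidth = sampleRate / (bins.length * 2);
  const ratio = maxHz / minHz;
  const bands: number[] = [];

  for (let i = 0; i < bandCount; i++) {
    // Each band spans the same frequency ratio, not the same number of Hz.
    const lowHz = minHz * Math.pow(ratio, i / bandCount);
    const highHz = minHz * Math.pow(ratio, (i + 1) / bandCount);

    const start = Math.min(bins.length - 1, Math.floor(lowHz / binWidth));
    // Every band covers at least one bin, even at the low end.
    const end = Math.min(
      bins.length,
      Math.max(start + 1, Math.ceil(highHz / binWidth))
    );

    // Use the loudest bin in the band as the band's level.
    let peak = 0;
    for (let b = start; b < end; b++) {
      peak = Math.max(peak, bins[b]);
    }
    bands.push(peak);
  }
  return bands;
}
```

With 10 bands between 20 Hz and 22050 Hz, each band covers roughly one octave, so the low bands draw from only one or two bins while the top band draws from hundreds.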

Both of these seem unintuitive on paper but feel natural to our hearing and vision: we experience loudness and frequency logarithmically, yet we usually want a visual representation that looks linear. It's one of those things that just "feels a little off" unless presented correctly.

Presenting it well requires a high enough frequency resolution so we can sample the right things. An FFT window of 2048 (the default in AnalyserNode) gives a frequency bin width ("resolution") of ~21.5 Hz for a 44.1 kHz signal.
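The arithmetic behind that figure is the sample rate divided by the FFT size. A small sketch, with hypothetical helper names `binWidthHz` and `binIndexForFrequency`:

```typescript
// Width of one FFT bin in Hz: sample rate divided by FFT size.
function binWidthHz(sampleRate: number, fftSize: number): number {
  return sampleRate / fftSize;
}

// Index of the bin whose center is closest to a given frequency.
function binIndexForFrequency(
  hz: number,
  sampleRate: number,
  fftSize: number
): number {
  return Math.round(hz / binWidthHz(sampleRate, fftSize));
}

const width = binWidthHz(44100, 2048); // 21.533203125
const a4Bin = binIndexForFrequency(440, 44100, 2048); // 20
```

At this resolution the whole octave from 20 Hz to 40 Hz is covered by about one bin, which is why a larger FFT window helps when the low end of the spectrum needs to look detailed.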

But... if displaying a frequency spectrum that feels like it follows the music isn't important then all of this doesn't really matter. 😄

marcusstenbeck avatar Nov 08 '22 17:11 marcusstenbeck