docs: change post processing example
It now displays both how to match getByteFrequencyData() and getFloatFrequencyData()
The latest updates on your projects. Learn more about Vercel for Git ↗︎
| Name | Status | Preview | Updated |
|---|---|---|---|
| remotion | ✅ Ready (Inspect) | Visit Preview | Nov 7, 2022 at 9:41AM (UTC) |
Seems like a good change, can we briefly explain when one should use the one over another? Is this for visualizing voice vs. music?
Both are for making frequency visualizations (usually "bars").
This will just make Remotion compatible with all the WebAudio tutorials on the internet (most use getByteFrequencyData). I'm guessing the person who asked about getFloatFrequencyData was following a tutorial that used that function.
The missing API are getFloatTimeDomainData and getByteTimeDomainData. Both should be possible to pull out of the existing FFT code since calculating them is a required prerequisite for later calculating the FFT.
(from W3C WebAudio spec)
Before we returned one array that the example claims makes the visualization nicer than the default. Now we are declaring two arrays, not returning anything, and not adding any context. I think this is confusing, which one do we recommend?
Would it make sense to make this an additional snippet instead?
Hmm, if the purpose of the page is to just provide a copy-paste snippet then we should remove the W3C-related info and move it to another page.
The PR was more for making visualizeAudio() compatible with the quite large amount of WebAudio AnalyserNode tutorials that are out there. But that information could definitely be somewhere else, as long as it's searchable.
If that sounds good I'll edit the PR for a suggested change.
A note on "good looking" frequency visualizations.
A pleasing looking frequency visualization will do two main things:
- Present audio level on a logarithmic scale, often decibels.
- Present frequencies on an exponentially increasing scale or on a musical scale (which really is kind of the same).
Both of these are logically unintuitive, but incredibly intuitive to our hearing and vision since we experience loudness and frequencies logarithmically, but more often like to see a visual representation that looks linear. It's one of those things that just "feels a little off" unless presented correctly.
Presenting it well requires a high enough frequency resolution so we can sample the right things. A FFT window of 2048 (the default in AnalyserNode) will give a frequency bin width ("resolution") of ~21.5hz for a 44.1khz signal.
But... if displaying a frequency spectrum that feels like it follows the music isn't important then all of this doesn't really matter. 😄