Release the XD
Cavern XD is an experimental technology to recover and/or substantially increase dynamic range in any content. Dynamic range compression in streaming movies has become such a severe issue that releasing this research now has to be treated as a priority.
This is very interesting and innovative, but it also seems quite complex, given that mastering the cinema version of a film for Blu-ray (or even worse, for Internet or TV) involves highly variable treatments in terms of dynamics, frequency, and spatialization.
For example, in a cinema mix (DCP output), the LFE channel is rarely used. In some blockbusters, there are only a handful of LFE interventions throughout the entire film. During Blu-ray mastering, however, engineers tend to filter the bass from the LCR channels and partially inject it into the LFE, based on the assumption that most home studios have a woofer instead of a proper sub, and speakers that are closer to tweeters than to true full-range speakers. The impact on dynamics is that post-filtering compression by channel group reduces the bass dynamics independently of the dynamics of the rest of each channel group (from low-mids to highs). It seems complex to me to increase the dynamics of a Blu-ray master to get closer to the original cinema mix without being able to recognize which channel each bass component originally belonged to once they have been mixed into the mono LFE channel.
With Atmos, and especially spatial coding, things become even more complex. Some software, like Fiedler Audio’s Mastering Console in standalone mode or HoRNet SAMP as a plug-in, now allows for Atmos mastering, analyzing objects along with beds for dynamic processing and then applying the treatments itself, with side-chain possibilities. As a result, the variability and complexity of these treatments make reverse-processing even more challenging, due to the variable handling of beds and objects, the merging of spatial coding after mastering, the different side-chain possibilities, and multi-band compressors. Such an algorithm would require advanced AI capable of analyzing the entire film to propose an adaptive multi-expander solution.
However, a simple expander as compensation could be risky, as it would be unable to distinguish between compression applied during Blu-ray mastering and the compression already present in the original cinema mix.
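
To make the concern about a simple expander concrete, here is a minimal sketch of a static upward expander curve in the dB domain. The function name, threshold, and ratio are hypothetical values chosen for illustration, not taken from any real tool. The point is that the output depends only on the input level, so the curve cannot tell compression added during Blu-ray mastering apart from compression that was already in the cinema mix.

```python
def expand_level_db(level_db, threshold_db=-20.0, ratio=1.5):
    """Static upward expander (illustrative, hypothetical parameters).

    Levels at or below the threshold pass through unchanged; levels above
    it are pushed further up by `ratio`.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) * ratio


# The same input level always gets the same boost, whatever stage of
# production compressed it to that level in the first place.
for level in (-30.0, -20.0, -10.0):
    print(level, "->", expand_level_db(level))
```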
You found out how it works: it is an AI based on Spleeter technology, specialized to separate intense effects from the rest. Cavern XD basically separates the movie into a base track and a spike track, then mixes the base back at a lower gain.
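
As a rough illustration of the remix step only, the sketch below assumes the separation has already produced two equal-length stems. The names `remix`, `db_to_gain`, and `base_gain_db`, as well as the -6 dB default, are hypothetical; this is not Cavern's actual code, and the separation model itself is not shown.

```python
def db_to_gain(db):
    """Convert a decibel value to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def remix(base, spike, base_gain_db=-6.0):
    """Recombine two stems, attenuating only the base stem.

    base, spike: equal-length sequences of float samples, assumed to be
    the outputs of a source-separation model (not shown here).
    """
    if len(base) != len(spike):
        raise ValueError("stems must have the same length")
    gain = db_to_gain(base_gain_db)
    return [gain * b + s for b, s in zip(base, spike)]


# Toy signal: a constant quiet bed with a single loud effect in the middle.
base = [0.2, 0.2, 0.2]
spike = [0.0, 0.7, 0.0]
print(remix(base, spike))
```

Because only the base is attenuated, the gap between the quiet bed and the loud effect widens, which is the increase in dynamic range described above. In practice the result would likely be brought back up to a target loudness afterwards, but that step is outside this sketch.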
I am very interested in this; dynamic range is my biggest problem with modern music.
Have you thought of using any of the newer architectures, such as bs-roformer?
Its API is not production-ready.
Moved to CavernPro repo.
What's CavernPro? Some sort of paid product / private repo?
It will be an online service.