Release the XD

Open VoidXH opened this issue 10 months ago • 2 comments

Cavern XD is an experimental technology to recover and/or substantially increase dynamic range in any content. Dynamic range compression in streaming movies has become such a severe issue that releasing this research now has to be handled as a priority.

VoidXH avatar Feb 13 '25 10:02 VoidXH

This is very interesting and innovative, but it also seems quite complex, given that mastering the cinema version of a film into a Blu-ray version (or even worse, an Internet or TV version) involves highly variable treatments in terms of dynamics, frequency, and spatialization. For example, in a cinema mix (DCP output), the LFE channel is rarely used. In some blockbusters, there are only a handful of LFE interventions throughout the entire film. However, during Blu-ray mastering, engineers tend to filter the bass from the LCR channels and partially inject it into the LFE, based on the assumption that most home studios have a woofer instead of a proper sub, and speakers that are closer to tweeters than to true full-range speakers. The impact on dynamics is that post-filtering compression by channel group reduces the bass dynamics independently of the dynamics of the rest of each channel group (from low-mids to highs). It seems complex to me to increase the dynamics of a Blu-ray master to get closer to the original cinema mix without being able to recognize which channel each bass component originally belonged to once they have been mixed into the mono LFE channel.

With Atmos, and especially spatial coding, things become even more complex. Some software, like Fiedler Audio’s Mastering Console in standalone mode or HoRNet SAMP as a plug-in, now allows for Atmos mastering, handling objects along with beds for dynamic processing analysis and then applying treatments itself, with side-chain possibilities. As a result, the variability and complexity of these treatments make reverse-processing even more challenging, due to the variable handling of beds and objects, the merging of spatial coding after mastering, the different side-chain possibilities, and multi-band compressors.

Such an algorithm would require advanced AI capable of analyzing the entire film to propose an adaptive multi-expander solution.
However, a simple expander as compensation could be risky, as it would be unable to distinguish between compression applied during Blu-ray mastering and the compression already present in the original cinema mix.
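
To make the concern about a simple expander concrete, here is a minimal sketch of a static downward expander curve. It is not taken from Cavern; the function name `expand_sample`, the threshold, and the ratio are all hypothetical, and a real expander would also smooth the level over time with attack and release. The point is that the only input is the sample level, so the curve has no way of knowing which mastering stage produced that level.

```python
import math

def expand_sample(x, threshold_db=-20.0, ratio=1.5):
    """Apply a static downward-expander curve to one sample in [-1, 1].

    Hypothetical helper for illustration: levels below the threshold are
    pushed further down, levels above it are left unchanged.
    """
    if x == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(abs(x))
    if level_db >= threshold_db:
        return x  # at or above the threshold: untouched
    # Below the threshold: the distance to the threshold grows by `ratio`.
    out_db = threshold_db + (level_db - threshold_db) * ratio
    return math.copysign(10.0 ** (out_db / 20.0), x)

# Two quiet samples with the same level get the same gain change, whether
# their level came from the cinema mix or from later home mastering.
quiet_from_cinema_mix = 0.01
quiet_from_home_master = 0.01
same_treatment = (
    expand_sample(quiet_from_cinema_mix) == expand_sample(quiet_from_home_master)
)
```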

AntPradZT avatar Mar 08 '25 12:03 AntPradZT

You found out how it works: it is an AI, based on Spleeter technology, specialized in separating intense effects from the rest. Cavern XD basically separates the movie into a base track and a spike track, then mixes the base back at a lower gain.
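
The remix step described above can be sketched roughly as follows. This is not Cavern XD's code: the AI separation is assumed to have already produced the two tracks, and the names `remix`, `db_to_gain`, `crest_factor_db`, and the -6 dB default are hypothetical. Crest factor (peak over RMS) is used here only as a simple stand-in for dynamic range.

```python
import math

def db_to_gain(db):
    """Convert a decibel value to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def remix(base, spike, base_gain_db=-6.0):
    """Mix the attenuated base track back under the untouched spike track."""
    if len(base) != len(spike):
        raise ValueError("base and spike must have the same length")
    gain = db_to_gain(base_gain_db)
    return [gain * b + s for b, s in zip(base, spike)]

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; assumes the signal is not pure silence."""
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

# Toy data standing in for the separated tracks (made up for illustration).
base = [0.2, -0.2, 0.2, -0.2]
spike = [0.0, 0.0, 0.6, 0.0]

original = [b + s for b, s in zip(base, spike)]
expanded = remix(base, spike)

# With the base lowered, the spike stands out more against the rest.
more_dynamic = crest_factor_db(expanded) > crest_factor_db(original)
```

In this toy case lowering the base raises the crest factor, since the spike keeps its level while everything around it gets quieter.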

VoidXH avatar Mar 08 '25 12:03 VoidXH

I am very interested in this; dynamic range is my biggest problem with modern music.

jonahnm avatar May 17 '25 18:05 jonahnm

Have you thought of using any of the newer architectures, such as bs-roformer?

Jarfeh avatar Jun 09 '25 04:06 Jarfeh

> Have you thought of using any of the newer architectures, such as bs-roformer?

Its API is not production-ready.

VoidXH avatar Jun 09 '25 08:06 VoidXH

Moved to CavernPro repo.

VoidXH avatar Aug 19 '25 19:08 VoidXH

What's CavernPro? Some sort of paid product / private repo?

nift4 avatar Aug 22 '25 18:08 nift4

It will be an online service.

VoidXH avatar Aug 22 '25 18:08 VoidXH