Lime-3DS-Emulator
Add option to allow for audio speed + pitch to match playback speed
Is there an existing issue for this?
- [X] I have searched the existing issues
Affected Build(s)
2110 - 2116
Description of Issue
When audio stretching isn't enabled, even at 95% speed, the audio becomes choppy with ugly-sounding artifacts, almost as if it plays at 100% audio speed but with interspersed moments of no audio.
So at 50% speed, it's like it's playing 48kHz audio with gaps of silence in-between every single sample...or something like that.
This also occurs even when set to LLE (accurate) audio or LLE multi-core audio (though admittedly I instead tested those a couple of weeks ago via the appimage of a slightly older build on a Xeon E3-1246 v3).
Expected Behavior
Much like Dolphin, DuckStation, and mGBA, the audio tempo should instead be adjusted to match the game speed. At the likes of 98%, the reduction in audio pitch is much more pleasing to the ear vs the artifacts you would get otherwise.
(I find that it's when I start hitting 96% or lower that the pitch change becomes more noticeable as long as you're not playing the different pitched audio back-to-back, and even then I don't find it to be "offensive" like I do the pops & crackles; it's not until I get to much lower percentages that the change in pitch really starts becoming an issue to me, and even then some people won't even notice pitch differences as dramatically as I myself do).
Using the 50% speed example again, this would be the same as simply playing 48kHz audio at a rate of 24kHz. That is, it's not downsampled but is instead the same thing as described in this HydrogenAudio forum thread:
- https://hydrogenaud.io/index.php?topic=119735
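To make the arithmetic concrete, here is a minimal sketch of the requested behavior. It is not Lime3DS code: the function name is hypothetical, and 48000 is simply the example rate used in this issue. The samples themselves are left untouched; only the rate they are played at changes.

```python
def playback_rate_hz(source_rate_hz: int, speed_percent: float) -> int:
    """Rate at which the unmodified samples are played so audio tracks emulation speed."""
    return round(source_rate_hz * speed_percent / 100)


print(playback_rate_hz(48000, 50))  # 24000
print(playback_rate_hz(48000, 98))  # 47040
```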
Reproduction Steps
- Launch the Lime3DS flatpak while you're in a quiet environment
- Set emulation speed to a value less than 100% (to use values that aren't a multiple of 5, manually edit the config.ini file accordingly)
- Uncheck "enable audio stretching"
- Turn up the volume on your speakers
- Observe how the audio has ugly pops and/or crackling
Log File
I started with manually having the qt-config.ini file set to 98%, launched Donkey Kong Country Returns 3D, navigated to the in-game audio options, and then set sound effects to a volume of zero. After a few minutes, while the game was still running, I then set the emulation speed to 90%. After some more minutes, I closed Lime3DS.
System Configuration
- CPU: Pentium G3258 @ 4.5GHz (essentially an overclockable Haswell i3 with a gimped iGPU and no SMT nor AVX)
- GPU/Driver: "Intel HD Graphics" (the gimped version found on lower Haswell models); mesa v23.3.6 kisak1~f; Vulkan
- RAM: 4x8GB Corsair Vengeance @ DDR3-1600
- OS: Linux Mint Xfce 20.3
Audio Recordings
I've attached some example audio recordings of the main menu from Donkey Kong Country Returns/3D comparing how Dolphin and Lime3DS handle less-than-100% speed when audio stretching is disabled and with volume matched perceptually via r128gain (note that all of the following "WebM" files are actually FLAC files manually renamed to a .webm extension since github is dumb and doesn't accept FLAC files, but as long as your web browser supports FLAC playback via HTML5 then your browser shouldn't care about the mismatched file extension):
Remember to unmute the audio for each media file!
So as to prevent github-actions from marking this as "stale", I just want to double-confirm that this still occurs in the current 2116 release.
Sorry, this issue went completely under my radar until your recent comment.
Hm, I'm not sure I would really consider this a bug? I'm also unsure what the benefit of this behaviour would be outside of simply using stretched audio if you desire for the audio to be smooth. Ignoring the technical details and looking at this purely from an end-user perspective, quote-unquote "fixing" this would seemingly change the settings from 'stretched/unstretched' to 'stretched/stretched but it sounds like a lofi remix'.
I don't really see the benefit of this.
> I'm also unsure what the benefit of this behaviour would be outside of simply using stretched audio if you desire for the audio to be smooth.

In my experience, when you get to slower speeds like 40fps or 30fps, audio stretching still doesn't work as well as just naturally letting the audio down-pitch.
On a more obscure technical point, in some games you can actually alter the music data to have a higher playback rate that is then canceled out by the emulation running at a slower speed, resulting in (nearly?) perfect-sounding audio at a slower emulated speed. This applies especially to games using ADPCM audio, which can be hex-edited in the exact same manner I describe at the end of this comment (though the location of the corresponding hex value can differ of course).

> "fixing" this would seemingly change the settings from 'stretched/unstretched' to 'stretched/stretched
I am confused by this? In the Dolphin examples, there is no stretching occurring at all and it's natively directly playing the audio as-is, albeit at a different sample rate which automatically results in an altered pitch because pitch is purely determined by the length of a given wave, so slower = longer = lower pitch.
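A small sketch of that relationship, assuming the same samples are simply played back slower (the function names are hypothetical and only illustrate the arithmetic): every frequency in the audio scales by the speed factor, and the shift in semitones follows from the base-2 logarithm of that factor.

```python
import math


def perceived_frequency_hz(tone_hz: float, speed_percent: float) -> float:
    """Frequency heard when a tone is played back at the given speed without stretching."""
    return tone_hz * speed_percent / 100


def pitch_shift_semitones(speed_percent: float) -> float:
    """Pitch change in equal-tempered semitones (negative means lower)."""
    return 12 * math.log2(speed_percent / 100)


print(perceived_frequency_hz(440, 50))  # 220.0, one octave down
print(pitch_shift_semitones(50))        # -12.0
print(round(pitch_shift_semitones(98), 2))
```

At 98% the shift works out to roughly a third of a semitone.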
> it sounds like a lofi remix
I am also confused by this. Lofi would be the opposite of hifi, meaning low fidelity and high fidelity respectively. But there is no change in audio fidelity with the Dolphin examples as that's one of the key points of this method of playback—it preserves the fidelity, unlike what happens in Lime3DS currently either with "audio stretching" disabled (pops & crackles) or enabled (any sort of pitch-correction becomes particularly more detrimental to fidelity the more you slow down the speed).
—————————————————
I'm kind of getting the idea that maybe you're not that familiar with how audio waveforms work and that I may be sort of "talking over your head"? Maybe hex-editing would be a more understandable example for you? Here's the 90% Dolphin FLAC converted to LPCM WAV but downmixed to mono in order to fit within github's 10MB limit:
Again, note that this is not an MP4 but rather a genuine WAV file that was simply renamed to have a .mp4 file extension (this is also not a case of PCM audio in an MP4 container); therefore you will need to unmute if you wish to play it in-line in your browser.
In a hex editor, find the 24th hex offset (48th character?), where it says 80 BB which is 48000 in little-endian hex referring to the audio sample rate; change it to 55 D0 which is 53333 in little-endian hex. Save your changes and play your resulting edited wave file—note how, simply by having changed 4 tiny characters, you've "magically" made the entire song now play at normal pitch and speed (albeit with a weird non-standard sample rate of 53333Hz, but real-time resampling is essentially a non-issue nowadays).
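For anyone who would rather script the same edit than open a hex editor, here is a rough sketch. It assumes a canonical PCM WAV whose header starts with RIFF, WAVE and the "fmt " chunk, so that the sample rate sits at byte offset 24; the helper names are hypothetical. Unlike the manual edit, it also rewrites the byte-rate field at offset 28 to keep the header self-consistent, which is my addition rather than something the steps above require. A tiny silent WAV is generated in memory so the sketch can be tried without the attachment.

```python
import io
import struct
import wave

SAMPLE_RATE_OFFSET = 24  # sample-rate field in a canonical 44-byte PCM WAV header
BYTE_RATE_OFFSET = 28    # bytes per second = sample rate * block align
BLOCK_ALIGN_OFFSET = 32  # bytes per frame (all channels)


def patch_wav_sample_rate(data: bytes, new_rate_hz: int) -> bytes:
    """Return a copy of a canonical PCM WAV with its declared sample rate changed."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE" or data[12:16] != b"fmt ":
        raise ValueError("expected a canonical RIFF/WAVE header with 'fmt ' first")
    patched = bytearray(data)
    (block_align,) = struct.unpack_from("<H", patched, BLOCK_ALIGN_OFFSET)
    # Little-endian 32-bit writes: 53333 becomes the bytes 55 D0 00 00.
    struct.pack_into("<I", patched, SAMPLE_RATE_OFFSET, new_rate_hz)
    struct.pack_into("<I", patched, BYTE_RATE_OFFSET, new_rate_hz * block_align)
    return bytes(patched)


def make_silent_wav(rate_hz: int = 48000, frames: int = 10) -> bytes:
    """Build a tiny mono 16-bit WAV in memory (stand-in for the attached recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate_hz)
        w.writeframes(b"\x00\x00" * frames)
    return buf.getvalue()


original = make_silent_wav()
patched = patch_wav_sample_rate(original, 53333)
print(original[24:26].hex(" "), "->", patched[24:26].hex(" "))  # 80 bb -> 55 d0
```

The audio data after the header is byte-for-byte identical in both versions; only the declared rate differs.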
The point being that, at least in the audio world, it should actually take much less effort to do this sort of "pitch matches the speed" behavior rather than applying any sort of pitch correction to maintain pitch regardless of speed aka what Lime3DS calls "audio stretching" (not to mention the aforementioned detrimental impact that audio stretching can have on fidelity at substantially slower speeds).
Sorry for the delay getting back to this.
We aren't interested in having what you described replace the existing behaviour when audio stretching is disabled. It's possible that it could be added as an alternative option, however this would be low priority.
I will convert this issue to reflect this updated status.
To help me understand better, do you think you could explain to me the use-case and/or benefit of the current resulting behavior you get when audio stretching is unchecked?
Because, while it can be desirable in terms of being "completionist" in terms of features and just making everything and the kitchen sink available, this still kind of feels to me like the equivalent of making available the option to format your hard drive whenever you press the emulated A button whereby it is something the user could presumably do* but the other alternatives are preferred to such a 99.999999...% majority of the time that I can't imagine a scenario where it's actually preferred.
*might require recompiling for this specific extreme example though :P
Perhaps a more comparable example was how, many years ago before "integer scaling" actually got implemented into GPU drivers, I was vouching for the idea that the "centered" scaling mode should basically just be replaced with an "integer" scaling option since, in my mind, "integer" scaling would simply be a better upgrade of the existing "centered" scaling option with zero downgrades:
- https://old.reddit.com/r/Amd/comments/55hb0u/lets_get_integer_nearest_neighbor_gpu_scaling
- https://old.reddit.com/r/Amd/comments/55hb0u/lets_get_integer_nearest_neighbor_gpu_scaling/d8b6xnj
In other words, I would think that having the options for the behavior I'm suggesting (as well as retaining the current behavior when "audio stretching" is checked) would basically fully supersede the current behavior, making the current "disabled audio stretching" almost a sort of "orphaned" function.
It's simple really, the current behaviour is just more faithful to the original experience of playing the game.
This pitch reduction behaviour isn't something that would be present on the original hardware in any capacity, and if this replaced the existing feature, users who don't like how the lowered pitch effect sounds will be forced to use stretched audio, even if they previously didn't want to.
You seem to come from an audio background and are approaching this from a very objective angle, but it's important to acknowledge that this is an area where the best option is entirely up to personal taste, and replacing existing options with ones which give a very different experience simply due to being quote-unquote "better" isn't the approach that should be taken.
While it's not 100% accurate to say I come from an audio background, I will say that audio along with computer hardware (not software) are my two strong specialties. Software, especially anything regarding software development, is an extreme weak-point for me (yet I've a keen eye for bugs and other usability issues?).
> replacing existing options with ones which give a very different experience simply due to being quote-unquote "better" isn't the approach that should be taken.
I do suppose this is true, but this is almost more of a UI thing in terms of changing what people previously expected (though I could make the argument that other bigger-name emulators already operate this way).
But regardless, the biggest thing I'm confused with on the current implementation is the benefit of ever having audio stretching unchecked if it isn't going to alter the pitch since, in my mind, the current implementation is akin to:
- unchecked = do poor quality audio stretching
- checked = do good quality audio stretching
This is what I mean when I say I'm confused about the utility of the current unchecked "audio stretching" behavior: musically, is the end result not the same, just at worse quality, if "audio stretching" is unchecked and you aren't running at a solid locked 100% speed?
...unless having audio stretching checked incurs additional computational demands? Kind of like the difference between anti-aliasing algorithms of FXAA (not great) and SMAA (better but not quite as fast). But if this were the case, then I would think the UI for this specific option would be better to instead be something like radio buttons under an "audio stretching quality" heading that lets you select between poor (faster) and good (slower), with poor (faster) matching the current unchecked "audio stretching" behavior and good (slower) matching the current checked "audio stretching" behavior.
2025 EDIT: Or more easily understood with image scaling, where the current unchecked option seems more like nearest neighbor while checked is like cubic (in fact, that's basically the same terminology used in audio interpolation, like with SNES emulation).
...that's probably a better way to describe the current option - it's not audio stretching, it's audio interpolation as the audio is stretched regardless, but unchecked it's without interpolation ("ugly" stretching) and checked it's with interpolation ("nicer" stretching).