
Sound architecture enhancements

Open johnnovak opened this issue 3 years ago • 22 comments

Motivation

This proposal is about improving the sound emulation accuracy of Sound Blaster cards, and making the overall DOSBox Staging audio experience more pleasant by adding a few important enhancements. In fact, with these enhancements in place Staging will probably be the best DOSBox variant for audio enthusiasts.

Full disclosure: I've been an audio nerd, music producer, and hobbyist audio engineer for the past 30 years. I'm also relatively new to the codebase, so I'd much rather plan things carefully than mess around and redo things 3-4 times. I'm also a big fan of this saying: "Give me six hours to chop down a tree and I will spend the first four sharpening the ax." So please bear with me 😉 I promise I'll try to be more succinct next time if people find this too long to read, but I like to explain my ideas thoroughly.

So, grab some drinks and snacks and make yourself comfortable! 😎 🍹 🍎

Tandy / PC Speaker / other "small speaker" emulation

I use headphones 99% of the time when using the computer and playing games these days. While emulated MT-32 and SC-55/General MIDI music sounds stellar on my high-quality studio headphones, the various speaker options, such as the Tandy emulation, leave a lot to be desired.

One part of the problem is that on real hardware you wouldn't stick your ears close to the little Tandy or PC speaker when playing a game. Firstly, those speakers are of really low quality, so they naturally filter out a lot of harmonics (and bass too). Secondly, normally they're at least a meter away from you (if not more), and you're listening to them off-axis, so there's a lot of air dampening and natural room reverberation going on in a typical domestic room. The end result is that the high-frequency response is significantly dampened, and the room acoustics add a pleasant smearing/reverberation effect to the raw sound, softening it a lot. This video demonstrates well how these little speakers sound in a real acoustic environment.

Compare that experience to listening to any Sierra adventure with Tandy, Game Blaster or PC speaker emulation on good quality headphones. The raw oscillator sounds give me ear fatigue after a few short minutes; it just gets on my nerves even when using super low volumes and makes me want to switch off the sound altogether...

Sound Blaster

Analog output filters

Currently, the analog output filters of Sound Blaster cards are not emulated. This is far from ideal for multiple reasons:

  • It's not accurate. NukeOPL emulation is virtually 100% identical to hardware OPL-3, Munt gets you very close to a real Roland MT-32, and the reSID library used for the Innovation emulation is equally stellar in terms of accuracy. I don't think we should settle for less for the Sound Blaster family of cards, arguably the most iconic pieces of audio gear in PC gaming history.

  • Objectively inferior results. Some people think that by not emulating the filters we suddenly get "hi-fi sound". There's no arguing with taste or personal preference, but I'd say not emulating the filters is objectively incorrect. Game developers had to take the filter into account when creating the sounds for their games. E.g. if there's a fixed 3.2kHz low-pass filter sitting on the output as in the case of the SB Pro, you would need to boost the highs a lot if you want a bright sound. Without a filter emulation these sounds are overly harsh and the noise inherent to those low-quality recordings is quite apparent. Another major issue is digital foldback-aliasing. Many of the earlier games used 11kHz samples and the low-pass filter was designed to get rid of a lot of that digital aliasing unpleasantness. The filter also makes the sound effectively bassier -- those shotgun shots and explosions in games like DOOM only sound ballsy, heavy, and dangerous with the 3.2kHz LPF of the SB Pro!

  • Preserving gaming history. How much longer until all those old Sound Blaster boards eventually die, and so do the people who still remember what they actually sounded like? We should strive to preserve our gaming history accurately while we still can; I feel quite strongly about that.

  • Nostalgic reasons. If you're used to the heavily filtered sound of an SB Pro, raw unfiltered sound emulation definitely won't put you in the same good mood. Enough said! 😁

A most prestigious Vogons user has created comparison recordings of the PCM and OPL outputs of various games using his hardware SB1.5, SB2.0, SB16 and SBPro boards. He has even measured and documented the exact specifications of the filters of the different models.

A lesser issue is that some SB models also filtered the OPL path somewhat as pure FM synthesis tends to be quite treble and overtone heavy. This is made worse on high-quality headphones; in some games you really need a low-pass filter to prevent ear-ringing.

For these reasons, the consensus on the "ultimate Sound Blaster setup" on the Vogons forum is to actually put not one, but no less than two Sound Blaster cards into your retro-PC:

  • A Sound Blaster Pro for its characteristic heavily low-pass filtered sound at 3.2kHz; this should be used in most games for PCM output

  • A much cleaner sounding Sound Blaster 16 board for CD audio and wavetable MIDI stuff. Even better yet, an AWE32 or AWE64 that allows you to route the OPL output through the onboard EMU-8000 DSP (on certain revisions), which lets you add reverb and chorus to the OPL music! (more on that later)

Crossfeed

Many games feature MIDI or OPL music, and only use the Sound Blaster for mono speech and sound effects. But with later games that support stereo Sound Blasters, the panning can sound very unnatural in headphones. The biggest problem is games featuring MOD music that hard-pan instruments 100% left and 100% right. This can be remedied by a simple crossfeed algorithm.

Dual-OPL games with stereo music have the exact same problem and would benefit greatly from the crossfeed too. The Game Blaster also uses hard-panned channels, so it's the same story.

Proposed solution

To improve both emulation accuracy of sound hardware and to provide a much more pleasant headphone listening experience, the following DSP effects would be incorporated into the signal path:

  • Low-pass filters -- to make small-speaker and Sound Blaster emulation more pleasant and accurate
  • Crossfeed processor -- to remedy the hard-panned stereo problem
  • Reverb -- to provide room-emulation for small-speakers, and to optionally enhance the audio for devices that don't have a built-in reverb like the MT-32
  • Chorus -- while we're at it, we'll throw in a chorus effect as well; that way we can add similar spatial DSP processing to OPL music like all those cool guys with the AWE32/64 boards! 🤘🏻😎🤘🏻

Filter

The low-pass filters should be turned on by default for small-speaker emulations and for the Sound Blaster. The exact SB filter to use is automatically selected based on sbtype. It's possible to override these settings with a filter from another SB model, or advanced users can provide their own filter parameters (we need to calculate the filter coefficients based on the host sample rate at runtime anyway, so it's trivial to add support for this).

Based on my research, a simple 6dB/oct analog RC-filter emulation and a 2N-pole Butterworth biquad low-pass filter implementation are all we need. These are computationally lightweight and are practically zero-latency (a few samples, to be exact). It seems this is what we need for the different models:

  • sb1, sb2: no filter

  • sbpro1, sbpro2: fixed 12dB/oct LPF @ 3.2kHz

  • sb16: 48dB/oct LPF dynamically set at half the SB mixer rate (a brickwall filter is used on the real hardware, but emulating that would be too heavy computationally and it would incur some latency, and this sounds very close)

  • tandy: 12dB/oct LPF @ 6kHz

  • speaker: 48dB/oct LPF @ 10kHz

  • other small-speaker systems: some variation of the above; in case of these systems it's not about accuracy but about making the sound less harsh to listen to by softening the highs

  • OPL: optional 6 or 12dB/oct LPF @ 8-15kHz (this would be entirely optional as some original SB boards or clones have a filter in the OPL path, some don't, some people want to mod their boards to remove the OPL filter, some people want the opposite and add one, etc.)
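The per-model defaults listed above could be captured in a small lookup along these lines. This is only a sketch: `FilterSpec` and `DefaultFilterFor` are hypothetical names, not existing Staging code, and the dB/oct figures are converted to pole counts on the assumption of 6dB/oct per pole. A pole count of zero stands for "no filter".

```cpp
#include <cassert>
#include <string>

// Hypothetical description of a low-pass filter (not an existing Staging type)
struct FilterSpec {
    int poles;        // each pole contributes 6 dB/oct of roll-off; 0 = no filter
    double cutoff_hz; // -3 dB point in Hz
};

// Returns the default filter for a device name as used in the list above.
// 'sb_rate_hz' is the current Sound Blaster playback rate; only the sb16
// entry depends on it.
FilterSpec DefaultFilterFor(const std::string& device, double sb_rate_hz)
{
    assert(sb_rate_hz > 0.0);

    if (device == "sbpro1" || device == "sbpro2")
        return FilterSpec{2, 3200.0};           // fixed 12 dB/oct @ 3.2 kHz
    if (device == "sb16")
        return FilterSpec{8, sb_rate_hz / 2.0}; // 48 dB/oct at half the SB rate
    if (device == "tandy")
        return FilterSpec{2, 6000.0};           // 12 dB/oct @ 6 kHz
    if (device == "speaker")
        return FilterSpec{8, 10000.0};          // 48 dB/oct @ 10 kHz

    return FilterSpec{0, 0.0};                  // sb1, sb2, anything unknown
}
```

The optional OPL filter is left out on purpose, since the proposal treats it as a user choice rather than a per-model default.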

Here are a few good filter implementations:

6dB/oct LPF (RC-filter):

  • https://github.com/VCVRack/Rack/blob/v2/include/dsp/filter.hpp#L9-L50=
  • https://github.com/tonioni/WinUAE/blob/master/audio.cpp#L456=
  • https://github.com/vishwam-aggarwal/firstorderFilter_Arduino_Butterworth_Bilinear/blob/master/firstorderFilter.cpp
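For orientation, a one-pole RC-style low-pass boils down to very little code. The sketch below is not taken from any of the linked projects; `OnePoleLowPass` is a made-up name, and the coefficient formula is one common discretisation of an analog RC stage. The coefficient is derived from the host sample rate at construction time, so it never assumes 44.1k or 48k.

```cpp
#include <cassert>
#include <cmath>

// One-pole (6 dB/oct) low-pass modelled on an analog RC stage.
// Hypothetical class for illustration only.
class OnePoleLowPass {
public:
    OnePoleLowPass(double cutoff_hz, double sample_rate_hz)
    {
        assert(cutoff_hz > 0.0 && sample_rate_hz > 0.0);
        const double pi = 3.14159265358979323846;
        // Smoothing coefficient derived from the RC time constant
        a_ = 1.0 - std::exp(-2.0 * pi * cutoff_hz / sample_rate_hz);
    }

    // y[n] = y[n-1] + a * (x[n] - y[n-1])
    double Process(double x)
    {
        y_ += a_ * (x - y_);
        return y_;
    }

private:
    double a_ = 0.0;
    double y_ = 0.0;
};
```

A constant input passes through unchanged once the filter has settled, while a signal alternating at the Nyquist rate comes out strongly attenuated.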

2N-pole Butterworth LPF (for 12/24/48 dB/oct biquad filters):

  • https://github.com/adis300/filter-c/blob/master/filter.c
  • https://github.com/VCVRack/Rack/blob/v2/include/dsp/filter.hpp#L292-L434=
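To give an idea of the amount of code involved, here is a rough sketch of a 2N-pole Butterworth low-pass built from cascaded biquads. It uses the well-known bilinear-transform low-pass formulas from the RBJ Audio EQ Cookbook rather than code from the libraries linked above, and `ButterworthLowPass` is a hypothetical name. One section gives 12dB/oct, two give 24dB/oct, four give 48dB/oct.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// 2N-pole Butterworth low-pass as a cascade of N second-order sections.
// Hypothetical class for illustration only.
class ButterworthLowPass {
    struct Stage {
        double b0, b1, b2, a1, a2; // normalised coefficients
        double x1, x2, y1, y2;     // previous inputs and outputs
    };

public:
    ButterworthLowPass(int sections, double cutoff_hz, double sample_rate_hz)
    {
        assert(sections >= 1);
        assert(cutoff_hz > 0.0 && cutoff_hz < sample_rate_hz / 2.0);

        const double pi     = 3.14159265358979323846;
        const double w0     = 2.0 * pi * cutoff_hz / sample_rate_hz;
        const double cos_w0 = std::cos(w0);

        for (int k = 0; k < sections; ++k) {
            // The Butterworth pole angles determine the Q of each section
            const double theta = pi * (2.0 * k + 1.0) / (4.0 * sections);
            const double q     = 1.0 / (2.0 * std::cos(theta));
            const double alpha = std::sin(w0) / (2.0 * q);
            const double a0    = 1.0 + alpha;

            Stage s;
            s.b0 = (1.0 - cos_w0) / 2.0 / a0;
            s.b1 = (1.0 - cos_w0) / a0;
            s.b2 = s.b0;
            s.a1 = -2.0 * cos_w0 / a0;
            s.a2 = (1.0 - alpha) / a0;
            s.x1 = s.x2 = s.y1 = s.y2 = 0.0;
            stages_.push_back(s);
        }
    }

    // Runs one sample through every section in turn (Direct Form I)
    double Process(double x)
    {
        for (Stage& s : stages_) {
            const double y = s.b0 * x + s.b1 * s.x1 + s.b2 * s.x2
                           - s.a1 * s.y1 - s.a2 * s.y2;
            s.x2 = s.x1;
            s.x1 = x;
            s.y2 = s.y1;
            s.y1 = y;
            x = y;
        }
        return x;
    }

private:
    std::vector<Stage> stages_;
};
```

Note that the cutoff has to stay below half the host sample rate, so a filter that tracks the SB rate would need clamping when the SB rate approaches the host rate.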

Reverb

The reverb is an interesting one; first I only wanted to include a simple reverb for room acoustics simulation for small-speaker audio devices (Tandy, PC Speaker, etc.). But as I played around with it a bit, it turned out to be quite versatile and it can add tons of ambience to OPL soundtracks, or make PCM speech and sound effects blend together a lot better with MT-32/SC-55 soundtracks. Definitely not "authentic" and "period-correct", but if the reverb is already there, why not use it for something cool and useful?

Incidentally, lucky owners of AWE32 and AWE64 boards have been doing this for ages: you can route the OPL output through the mixer on certain versions of those cards, which allows you to add chorus and reverb with the onboard EMU-8000 DSP. Some users even go as far as buying a rack-mount studio spatial effects processor unit and running the output of their Sound Blasters through it! Luckily, we can emulate all that fancy hardware for no cost in the digital domain!

I've chosen the open-source mverb algorithm for inclusion, which is a very nice low-CPU consumption reverb that sounds a lot better than Freeverb in my opinion (Fluidsynth uses a variation of Freeverb). It was originally written in 2014 and it was quite CPU-efficient even back then. Compared to Freeverb, it particularly excels at simulating early reflections, which are important for simulating small acoustic spaces (important for the small-speaker emulations), and overall it sounds nicer and more natural for larger hall type spaces as well.

There's a downloadable VST plugin version of it for Windows, Mac and Linux, so I've spent the last week testing it in REAPER with a variety of games. I've come up with 4 general presets that work well in all sorts of games: tiny, small, medium and large. That's easy enough to understand, and I'm not a fan of overloading users with too many options.

Chorus

I've tried many open-source chorus algorithms, and I liked TAL Chorus best, which is an emulation of the onboard chorus effect found on the iconic Roland Juno-60 synthesizer. Because of that, it's very well suited to adding a warm chorus sound to mono synthetic sources, such as OPL music. Best of all, it's very light on the CPU too.

Crossfeed

The crossfeed processor is a headphone-only feature, so it should be off by default. I've experimented with a few open-source plugins that implement various fancy crossfeed algorithms, but frankly, the best sounding solution that has the least amount of artifacts is just simply mixing some amount of the left signal into the right, and vice versa. Super simple to implement, and makes a lot of difference!
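A minimal sketch of that "mix a bit of left into right and vice versa" idea follows. The `StereoFrame` and `ApplyCrossfeed` names are invented for illustration, and the linear mapping is an assumption chosen so that a strength of 0 leaves the signal untouched and 100 collapses it to mono.

```cpp
#include <cassert>
#include <vector>

// Hypothetical interleaved stereo frame, for illustration only
struct StereoFrame {
    float left;
    float right;
};

// Blends each channel towards the other one.
// strength_percent = 0 leaves the signal untouched; 100 yields mono.
void ApplyCrossfeed(std::vector<StereoFrame>& frames, int strength_percent)
{
    assert(strength_percent >= 0 && strength_percent <= 100);

    // At full strength each output is the average of both inputs
    const float c = 0.5f * static_cast<float>(strength_percent) / 100.0f;

    for (StereoFrame& f : frames) {
        const float l = f.left;
        const float r = f.right;
        f.left  = l + c * (r - l);
        f.right = r + c * (l - r);
    }
}
```

A hard-left signal (1, 0) at 50% strength ends up as (0.75, 0.25), which is still clearly panned left but no longer completely silent in the other ear.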

(On the Amiga, I used to connect the left and right RCA outputs with a resistor to introduce crossfeed -- cheapest hardware effect ever! 😛)

Audio examples

General examples

This excerpt from the Alone in the Dark OPL soundtrack should give you a fairly good idea of what the individual effects do to the sound. The examples showcase the following:

  • Raw sound
  • With crossfeed only (50% strength)
  • With reverb only (large preset)
  • With chorus only (medium preset)
  • With crossfeed, reverb, and chorus (same settings as above)

You can get the audio files from here (example-*.mp3).

Game examples

This is a selection of game soundtracks I've been using to come up with my effect presets. I tried to choose games that best illustrate what benefits the DSP can bring to the audio experience. The raw output of the current Staging version is also provided for comparison. I've included some brief explanations and a few entirely subjective comments too.

You can get the audio files from here (game-*.mp3).

Leisure Suit Larry 1 (Tandy)

  • 12dB/oct LPF at 6kHz, tiny reverb preset
  • This reverb preset is specifically designed to give an acoustic environment to Tandy and PC speaker audio. Combined with a low-pass filter, it softens the sound a lot, making it much more pleasant to listen to.

Alone in the Dark (SB Pro 1, PCM + Dual OPL-2)

  • SB Pro 1 filter (12dB/oct @ 3.2kHz), 30% crossfeed, large reverb, medium chorus
  • The addition of reverb and chorus really brings the soundtrack to the next level, and the sound effects sound much grittier, darker, and foreboding with the strong low-pass filter of the SB Pro. Some light reverb on the PCM sounds helps the sound effects blend better together with the music. The raw output is really sterile in comparison, especially on headphones.

Dune (SB Pro, PCM + Dual OPL-2)

  • SB Pro 1 filter (12dB/oct @ 3.2kHz), 50% crossfeed, medium reverb, high chorus
  • Same as above, but here the extreme stereo panning also needs to be tamed with the crossfeed.

Pinball Dreams (SB 16, PCM)

  • SB 16 filter (12dB/oct @ 11kHz), 50% crossfeed, large reverb
  • The filter makes the sound somewhat warmer, but the crossfeed makes the biggest difference. Without it, it's very unpleasant to listen to the hard-panned instruments in headphones. The large reverb makes the bombastic soundtrack even more... large and bombastic? 🤔🦖

Ultima Underworld (SB Pro 1, PCM + Dual OPL-2)

  • 50% crossfeed, large reverb, low chorus
  • While the MT-32 soundtrack is very good, I prefer the OPL-2 music for its murkier and weirder atmosphere. But the raw and bone dry OPL-2 output just cannot do it justice -- which is remedied with the addition of the reverb and crossfeed, which really bring the OPL music to life!

Prince of Persia (SB Pro 1, OPL-2)

  • medium reverb, medium chorus
  • Same deal as for Ultima Underworld: the MT-32 music is very competent but I like the OPL soundtrack better. Reverb makes the music a lot more atmospheric.

Dark Sun (GUS)

  • medium reverb
  • I prefer the GUS MIDI wavetables for this particular game. With the added reverb I think it outshines the MT-32 soundtrack.

Space Quest 3 (SB Pro 1, OPL-2)

  • small reverb, low chorus
  • In Space Quest 3, I prefer the OPL music to the MT-32 soundtrack (are you starting to see a pattern here?!). The small reverb and a touch of chorus add just a little sense of space to the music in headphones without being too overbearing.

Configuration

Crossfeed

Crossfeed is disabled by default; it can be enabled globally in the mixer section. It is applied to all sound sources that can be stereo, except for the CD output.

[mixer]
crossfeed = on|off|0-100

0 results in no crossfeed (same as off), 100 results in stereo signals being converted to mono. on sets some good default value (e.g. 70).

Filter

Filters can be enabled per sound card section: sblaster, speaker, innovation and gus.

My preference would be to turn the filter on for Sound Blaster and small-speakers by default, but off for everything else. (Luckily, because SB16 is the default, that would result in a far lighter filtering than what you'd get on the SB Pro models, so most users who don't care much about such detail wouldn't even notice the difference.)

If the filter is on for the Sound Blaster, the exact filter to use is determined by the sbtype value:

[sblaster]
sbtype = sbpro1     # turns on the fixed 3.2k low-pass filter by default

You could always use a filter from a different model if you wish:

[sblaster]
sbtype = sb16
filter_pcm = sbpro1

Note that the Sound Blaster has two filters: one for the PCM output (filter_pcm), and another for the OPL sound (filter_opl). This reflects how the actual hardware works on both original SB cards and some clones.

Advanced users have the option to set any filter they want:

filter_opl = 1 10k      # 1-pole (6dB/oct) LPF @ 10kHz
filter_pcm = 4 8200     # 4-pole (24dB/oct) LPF @ 8200Hz
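Parsing the advanced "poles frequency" form shown above is straightforward. The sketch below uses hypothetical names (`CustomFilter`, `ParseCustomFilter`) and assumes that the config layer has already stripped the trailing comment and that model names such as sbpro1 are handled elsewhere; anything it does not recognise is reported as invalid.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical result type for the advanced filter setting
struct CustomFilter {
    bool valid;
    int poles;        // 1 pole = 6 dB/oct
    double cutoff_hz;
};

// Parses values such as "1 10k" or "4 8200".
CustomFilter ParseCustomFilter(const std::string& value)
{
    CustomFilter result = {false, 0, 0.0};

    std::istringstream in(value);
    int poles = 0;
    std::string freq;
    if (!(in >> poles >> freq) || poles < 1 || freq.empty())
        return result;

    // An optional 'k' suffix multiplies the frequency by 1000
    double scale = 1.0;
    const char last = freq[freq.size() - 1];
    if (last == 'k' || last == 'K') {
        scale = 1000.0;
        freq.erase(freq.size() - 1);
    }

    char* end = nullptr;
    const double number = std::strtod(freq.c_str(), &end);
    if (freq.empty() || *end != '\0' || number <= 0.0)
        return result;

    result.valid     = true;
    result.poles     = poles;
    result.cutoff_hz = number * scale;
    return result;
}
```

With this, "1 10k" yields one pole at 10000 Hz and "4 8200" yields four poles at 8200 Hz, matching the two example lines.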

For the small-speaker types (speaker section), some sensible defaults will be used that soften the high-frequency response. Naturally, advanced users can specify their own filter settings.

innovation and gus are not filtered by default and only support the advanced option (just because it's so easy to add, so why not).

Reverb & chorus

Reverb and chorus should be very simple to set up for the most common use cases -- we definitely don't want to swamp the user with a myriad of parameters! These effects are off by default; to enable them you just need to set a single parameter in the mixer section:

[mixer]
reverb = on|off|tiny|small|medium|large
chorus = on|off|low|medium|high

on maps to the medium setting for both. Each preset sets up the corresponding reverb or chorus parameters, along with per-preset aux send levels that are designed to work well with a large variety of games.

(Later on support could be added for specifying custom reverb and chorus presets, similarly to fluidsynth, but this is low priority.)

Advanced users could always override the aux send levels defined by the presets per sound card section:

[sblaster]
reverb_pcm = 0     # no reverb for the PCM output
reverb_opl = 80    # huge amount of reverb for the OPL output

[innovation]
reverb = 50        # SID music drenched in reverb, yikes!

Implementation

The proposed changes to the mixer architecture would look like this (new paths and signal processors are in green; the aux send lines for the chorus effect are omitted for clarity):

[Diagram: dosbox-audio-improvements]

Your first reaction might be that this is a lot of filters. Well, they're quite cheap computationally, and keep in mind that they will never be used all at the same time. Typical scenarios are as follows:

  • MT-32 music + Sound Blaster PCM sound: 1 filter
  • Sound Blaster OPL music + PCM sound: 2 filters
  • Tandy/PC Speaker/Game Blaster/PS1 music: 1 filter
  • Innovation music + PCM sound: 2 filters
  • GUS music + PCM sound: 1 filter

So we're really talking two cheap low-pass filters at most at any given time. Even if someone enables all possible sound options in the config, most sound card implementations enable processing only if there's actual output from the card (and if there are shortcomings in that area, it will be addressed as part of this change).

Crossfeed is practically free (one MUL and ADD per sample), and enabling reverb and chorus is entirely optional (off by default). But they're very CPU efficient: on my i7 4790k I can create dozens of instances of both in REAPER and the total CPU usage is barely 2-3%. In any case, their performance should be comparable to the Fluidsynth reverb and chorus (and they sound way better, IMO).

So how would we fit all this into the existing code?

Without changing the current mixer architecture fundamentally (for questionable benefits, I have to add), the most practical approach is to filter the samples when adding them into the temporary mix buffer in AddSamples. That's exactly how DOSBox-X implements the filtering, look here and here (or just search for lowpass_on_load in mixer.cpp). It's done this way because AddSamples supports all sorts of different formats, so you really want to do the filtering after the format normalisation. Plus this way every channel can implement their own completely independent low-pass filtering with different filter parameters -- that's what we want!

Reverb and chorus are a bit trickier. The standard approach in a mixer is to implement these as aux sends, which means a certain percentage of the post-insert effects signal of each channel (post low-pass and crossfeed in our case) is sent into an aux channel, where a single reverb or chorus instance processes the summed signal. The aux send channels' output is then simply mixed back into the master output at the end. This way different amounts of reverb/chorus can be applied per channel for any number of channels while using only a single reverb/chorus instance.

In our architecture we'll need two additional mix buffers to realise this goal: one for the reverb, and one for the chorus. MixerChannel will get two additional parameters too, ReverbSendPercentage and ChorusSendPercentage; then, similarly to filtering, AddSamples will accumulate samples scaled by the send percentages in the reverb and chorus mix buffers, respectively, in addition to the main mix buffer. As the last step, reverb and chorus processing will be applied to these two aux buffers, and their contents will be simply added to the main buffer.
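The aux send flow described above can be sketched roughly as follows. None of this is existing mixer code: `MixBuffers`, `AddChannel` and `MixDownAux` are hypothetical names, the buffers are mono for brevity, and the reverb and chorus are passed in as plain callables standing in for the real effect instances.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Main mix buffer plus one aux buffer per effect (hypothetical layout)
struct MixBuffers {
    std::vector<float> main;
    std::vector<float> reverb_aux;
    std::vector<float> chorus_aux;

    explicit MixBuffers(std::size_t num_samples)
            : main(num_samples, 0.0f),
              reverb_aux(num_samples, 0.0f),
              chorus_aux(num_samples, 0.0f)
    {}
};

// Accumulates one channel's (already filtered) samples into the main
// buffer, and scaled copies into the two aux buffers.
void AddChannel(MixBuffers& mix, const std::vector<float>& samples,
                float reverb_send_percent, float chorus_send_percent)
{
    assert(samples.size() <= mix.main.size());

    const float reverb_gain = reverb_send_percent / 100.0f;
    const float chorus_gain = chorus_send_percent / 100.0f;

    for (std::size_t i = 0; i < samples.size(); ++i) {
        mix.main[i]       += samples[i];
        mix.reverb_aux[i] += samples[i] * reverb_gain;
        mix.chorus_aux[i] += samples[i] * chorus_gain;
    }
}

// Last step: process each aux buffer with its single effect instance and
// add the result to the main buffer. Each effect is any callable taking
// and returning a float sample.
template <typename ReverbFn, typename ChorusFn>
void MixDownAux(MixBuffers& mix, ReverbFn reverb, ChorusFn chorus)
{
    for (std::size_t i = 0; i < mix.main.size(); ++i)
        mix.main[i] += reverb(mix.reverb_aux[i]) + chorus(mix.chorus_aux[i]);
}
```

The point of the layout is that the cost of the effects stays constant no matter how many channels feed them; only the cheap per-sample multiply-adds scale with the channel count.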


That would be all, thanks for listening, looking forward to your comments, and good night! 😀🌙⭐ (Oh, and if it's unclear, I'm intending to do all these changes myself.)

johnnovak avatar May 11 '22 13:05 johnnovak

just wow for your writing time and test all the details - im not against it :)

LowLevelMahn avatar May 11 '22 13:05 LowLevelMahn

Brilliant and thorough proposal, @johnnovak! I'm all for it :rocket:

Adding some thoughts/comments -

Filter library considerations

  • I don't have any suggestions, and defer to your knowledge and ear as to what sounds best.
    • If anyone has more options for consideration, please mention them!
  • We'll probably want to include them in src/libs/*, unless some happen to be available in Debian, Arch, VCPKG, etc.
  • They need to be compatible w/ GPL v2+ (handy diagram).
    • If Staging needs to go to GPL v3+ for a critical package, then we will.
  • Prefer to use them as-is, without the need to modify them:
    • If we do need to layer on changes/fixes, we can do it in separate commits or even make an upstream fork like we've done for Loguru (while PR'ing fixes upstream).
    • Ideally we can get upstream on board like we've done for many packages (for example, Staging has flagged 91 issues in the dr_audio codecs, all now fixed)
  • When adding external code to the repo, use one initial commit per source package with the commit author set to the upstream maintainer(s). Then subsequent commits should tie the code into Staging (Meson/VS solution), and so on, with yourself as the author. Here are a couple examples of these initial commits:
    • Add reSID sources: https://github.com/dosbox-staging/dosbox-staging/pull/978/commits/8bbe0dad6313b8fda5e37cb31b5850654d99a7a7
    • Add PDcurses sources: https://github.com/dosbox-staging/dosbox-staging/pull/1636/commits/f7502a1027d7431fa98b68876dd25b20c5f95689
  • It's OK to add the minimal sources that we need plus the license and maybe a small text file (if their repo is full of unnecessary bloat, we don't need to truck all of that in)

Conf file

A nuance of the conf file is that key names can't be duplicated, even when in different sections. The following example won't work:

[sblaster]
filter = true

[gus]
filter = true

This is a drawback, but allows convenient runtime modification of settings using just the keyname, like: C:\> sbtype gb (without having to specify the section).

If we want per-audio-card settings (which I think makes sense), let's go with some kind of "standardized" naming convention that tacks on either a prefix or postfix, so users can extrapolate and "guess right": gus_crossfeed, gus_filter_xyz, tandy_crossfeed, tandy_filter_xyz, etc.

Implementation

  • The diagram looks great, and I fully agree that adding the processing to the back-end (after sample normalization), like you suggested, makes the most sense.

  • If some of the filters are code-intensive to use (ie: lots of boilerplate setup calls, state variables, intermediate calls), it might make sense to wrap them in a class to isolate their code and keep the footprint in the mixer minimal. I've added a couple small audio tools as classes (DC bias silencer, soft limiter, and enveloper), which keeps their footprint inside the audio devices code to basically just a couple lines; feel free to use them as a template (if they help!)

Can't think of any more, but will add as we go. Thanks for this proposal @johnnovak ; my ears can't wait!

kcgen avatar May 11 '22 18:05 kcgen

Very exciting proposal! I'm all for it. Sound is a big part of the experience and I'm always in favor of emulating the actual hardware more accurately.

Thanks for the writeup alone, great work!

Burrito78 avatar May 11 '22 20:05 Burrito78

Damn. Lots of Latin letters.

Currently, the analog output filters of Sound Blaster cards are not emulated.

I recall seeing a post on Vogons that mentions at least one of the boards' filtering is sorta-kinda implemented.

What about resampling? And do you plan on anyhow touching the mixer, or that's completely out of scope?

Overall, thou hast my blessing. Can't have too good audio.

Likely related: #278. The saga continues...

GranMinigun avatar May 11 '22 23:05 GranMinigun

Thanks @kcgen for the encouraging words and the tips, they're super useful! Some of that advice could be incorporated into a general FAQ if that hasn't already happened (I'm thinking of the advice about introducing libraries and licensing in particular).

Damn. Lots of Latin letters.

That's the only alphabet I know! 😝 But there's an image and lots of audio files too! 😁

I recall seeing a post on Vogons that mentions at least one of the boards' filtering is sorta-kinda implemented. What about resampling? And do you plan on anyhow touching the mixer, or that's completely out of scope?

I think you're referring to James-F discovering that linear-interpolation resampling results in a kind of low-pass filtering, which is true, indeed. It is, however, inaccurate. The GUS had support for hardware-based linear interpolation of the samples via the GF1 chip (this was exploited by trackers, and this was the #1 reason why it was the favourite card of demosceners like myself), so the linear interpolation used in the GUS emulation is actually pretty accurate, but it's not for the SB. To be accurate, you'd need to emulate the raw output of the DAC and the effects of the antialiasing low-pass filter.

You're absolutely correct, it will never be perfect without doing the DAC emulation and resampling correctly. But I left that out of my proposal on purpose, as that requires further research and could be more complicated to implement. However, the addition of the filters on top of whatever resampling method we have now would get us in the ballpark and give about 80-90% accuracy (I've tried it by comparing my results with real recordings, and it's pretty close, but not 100%).

And do you plan on anyhow touching the mixer, or that's completely out of scope?

Only to the extent it is necessary, like I explained in the last implementation section (so not much, I think).

Overall, thou hast my blessing. Can't have too good audio.

🖖🏻 👍🏻 🎧 ❤️

Likely related: #278. The saga continues...

I did read that a while ago, but thanks for reminding, I'll peruse it again. There's lots of good info in there.

@kcgen are you happy if I go ahead and create a few tickets for myself, a breakdown of the main things that need to get done to make this happen? I think we all agree it shouldn't go in as a single massive PR. 😁

johnnovak avatar May 12 '22 02:05 johnnovak

However, the addition of the filters on top of whatever resampling method we have now would get us in the ballpark and give about 80-90% accuracy (I've tried it by comparing my results with real recordings, and it's pretty close, but not 100%).

Staging's philosophy is that we try to "do right" by the source material and the original authors, above all else.

Here's an example: some soundcards had low quality circuitry which introduced a constant hiss into the audio output. If someone authored a hiss-PR, Staging's approach would say that the hiss harms the author's material, and we would reasonably make the case that the original composers would agree too -- so we would avoid / eliminate that, even if it means upsetting hardware-accuracy purists.

But if a feature gets closer to what the author intended (or heck, even enhances it a bit.. given we're in the digital domain), then it's all welcome!

@kcgen are you happy if I go ahead and create a few tickets for myself, a breakdown of the main things that need to get done to make this happen? I think we all agree it shouldn't go in as a single massive PR. 😁

Yes to all 👍 !

kcgen avatar May 12 '22 02:05 kcgen

Here's an example: some soundcards had low quality circuitry which introduced a constant hiss into the audio output. If someone authored a hiss-PR, Staging's approach would say that the hiss harms the author's material, and we would reasonably make the case that the original composers would agree too -- so we would avoid / eliminate that, even if it means upsetting hardware-accuracy purists.

That makes sense, we don't want to emulate the HDD deteriorating over time and developing a few random bad sectors either for a more "authentic" experience 😛

I'd say with the filter implementation I proposed we'd definitely get very very close (almost indistinguishable) to the author's intentions. The DAC emulation is getting a bit into the "purist" category, and I'm not even sure it's that much worth it. But it's out of scope for what I had in mind, anyway.

I think we're on the same page 😄

johnnovak avatar May 12 '22 03:05 johnnovak

By the way, here's a video that demonstrates adding reverb and chorus to OPL music through the AWE64 DSP on real hardware.

https://www.youtube.com/watch?v=6oSnhqMsmWM

johnnovak avatar May 12 '22 03:05 johnnovak

First I've heard those added to DOS synth music; and wow! The combination pops the music from its simple "2D" form into a much more complex and realistic sounding form. Thanks for sharing that!

kcgen avatar May 12 '22 04:05 kcgen

First I've heard those added to DOS synth music; and wow! The combination pops the music from its simple "2D" form into a much more complex and realistic sounding form. Thanks for sharing that!

Then I recommend listening to my "mockups" of what the proposed features would sound like in action in actual games (the files ending with -fx), if you haven't done so already. I think you'll like them 😄

https://drive.google.com/drive/u/1/folders/1APrmVKSQscjbjP01eCcxllKAOwxLEDGz

Like I explained, these are using the exact same reverb and chorus I'm intending to introduce.

johnnovak avatar May 12 '22 04:05 johnnovak

Then I recommend listening to my "mockups" of what the proposed features would sound like in action in actual games (the files ending with -fx), if you haven't done so already. I think you'll like them

-fx sounds fantastic!

kcgen avatar May 12 '22 07:05 kcgen

Regarding lower quality DACs, DOSBox-X emulates that to some extent with both a lowpass filter and modified resampling that also has a "slew rate". If the slew rate is higher than the source rate, then it's like resampling but with the interpolated segment shorter than one sample. If the slew rate is lower than the source rate, then the effect is like a DAC that generally follows the waveform but can't quite follow sharp changes.

joncampbell123 avatar May 13 '22 03:05 joncampbell123

I implemented the lowpass filters in DOSBox-X in a way that should be easy to copy into staging and incorporate into the mixer. They're simple RC constant types, but there is also support for multiple passes which are used to emulate for example the programmable lowpass filter on ESS AudioDrive 688 chipsets.

joncampbell123 avatar May 13 '22 03:05 joncampbell123

The Sound Blaster (and Pro) emulation uses the slew rate mechanism to emulate what, according to my tests with real hardware, is effectively a DAC with no filtering whatsoever, which is why lower sample rates have that classic metallic sound to them. The slew rate in that case is always some fixed rate like 10-20 kHz regardless of source rate. The Sound Blaster Pro lowpass filter (which knowing programs bypass with a mixer bit) is also emulated using slew rate and lowpass.

joncampbell123 avatar May 13 '22 03:05 joncampbell123

Gravis Ultrasound seems to be another case of no DAC filtering, only a lowpass at 20 kHz (because audio amplifiers), but you don't really notice because the GF1 DAC runs at a fixed rate somewhere between 14 kHz and 44.1 kHz depending on the number of GUS voices active.

joncampbell123 avatar May 13 '22 03:05 joncampbell123

Hi @joncampbell123, thanks for stopping by. Yes I've had a look at that code, and filtering "on load" in AddSamples is a good idea, I'll copy that approach. Other things, however, left me scratching my head:

  • the SB16 definitely employs brickwall limiting at half the sample rate, that's well-documented; you're using a simple RC-filter (6dB/oct I assume), that won't sound the same at all (EDIT: 48dB/oct Butterworth actually sounds very close to real recordings)
  • the SB Pro 1/2 filter frequency is off (it is a 2nd-order 12dB/oct Butterworth filter @ 3.2k, people have measured it; you're doing two RC-filter passes at 3.8k instead)
  • I don't get what the point of the 12dB/oct filter at 23k is, that won't do anything useful at all
  • equally unsure about the slew rate stuff; I don't think that emulates any physical phenomenon that's actually happening, and I don't think you need it if you get the antialiasing/reconstruction filters right

I've checked the PCem sources too; they got the SB16 filter right, but other things are wrong 🤷🏻 Plus I think they don't even recalculate the coefficients for different host sample rates, they just assume 44.1k or 48k or whatever -- that's massively wrong...
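
To illustrate why the coefficients have to be recalculated per host sample rate, here is a sketch of the SB Pro figure mentioned above (2nd-order Butterworth at 3.2 kHz) computed for two host rates. It uses the widely known "Audio EQ Cookbook" lowpass formulas, which is an assumption on my part and not something taken from any of the emulators discussed; the function names are hypothetical.

```python
import cmath
import math


def butterworth_lowpass_biquad(cutoff_hz, sample_rate):
    """Return normalised (b0, b1, b2, a1, a2) for a 2nd-order Butterworth lowpass."""
    q = 1.0 / math.sqrt(2.0)                       # Q of a 2nd-order Butterworth section
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate   # cutoff relative to the host rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / 2.0 / a0
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    return b0, b1, b2, a1, a2


def magnitude_at(coeffs, freq_hz, sample_rate):
    """Linear gain of the biquad at `freq_hz` when run at `sample_rate`."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1 on the unit circle
    return abs((b0 + b1 * z1 + b2 * z1 * z1) / (1.0 + a1 * z1 + a2 * z1 * z1))


coeffs_44k = butterworth_lowpass_biquad(3200.0, 44100)
coeffs_48k = butterworth_lowpass_biquad(3200.0, 48000)

# Running the 44.1k coefficients on a 48k host silently moves the cutoff.
mismatched_gain = magnitude_at(coeffs_44k, 3200.0, 48000)
```

The two coefficient sets differ, and each one only puts the -3 dB point at 3.2 kHz when it runs at the rate it was designed for.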

I really don't want to do any DSP using fixed-point arithmetic either; I don't even think it's faster on current processors, and you'd get all sorts of precision problems (made worse by cascaded filtering). It's just not worth it at all these days. I'm going to introduce an optimised DSP library, most likely KFR. The reverb and chorus (and most audio algorithms) need floats too, so I'll need to convert the data to floats once anyway.

For the filtering, applying the correct IIR filters on the raw non-interpolated output (nearest-neighbour) does the trick for me; I've compared it against real recordings and it's pretty close. Linear interpolation resampling effectively introduces additional low-pass filtering, so applying the correct filters on top of it would sound overly filtered (I suspect that's why you're using different filter parameters in your code: your slew rate stuff introduced similar filtering and you tried to balance it out by ear).
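
A small time-domain sketch of the difference between the two approaches, for integer upsampling factors only. The helper names are made up for this illustration, and the test signal is a deliberately worst-case tone at the source Nyquist frequency.

```python
import math


def upsample_nearest(samples, factor):
    """Nearest-neighbour (sample-and-hold) upsampling: repeat each sample."""
    return [s for s in samples for _ in range(factor)]


def upsample_linear(samples, factor):
    """Linear-interpolation upsampling between consecutive samples."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s  # hold the last sample
        for k in range(factor):
            out.append(s + (nxt - s) * k / factor)
    return out


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


tone = [1.0, -1.0] * 8                      # tone at the source Nyquist frequency
held = upsample_nearest(tone, 2)            # keeps the full level of the tone
interpolated = upsample_linear(tone, 2)     # visibly loses level at this frequency
```

Comparing `rms(held)` with `rms(interpolated)` shows the level drop that linear interpolation causes before any card-specific filter has been applied.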

~~The output of the SB1.0 is unfiltered, that gives it its "metallic junk" sound at 5-11k sample rates, we are in agreement on that. However, the SB2.0 seems to employ some filtering, although it's not easy to find information about that. In any case, getting the SB16 and SBPro is the most important, that's what most people use.~~

EDIT: Just figured it out, SB2.0 is very likely 6dB/oct @ 3.2k. It still sounds pretty "metallic". Still a bit unsure about SB1.0, but that will do for it as well for now. They most definitely used some filter, though: the schematics of some SB 1.0 clones that are supposed to be 100% identical show RC filters right after the DAC; otherwise it would really sound like absolute crap...

johnnovak avatar May 13 '22 11:05 johnnovak

Actually, found the concrete measurements (source). SB16 is at 11kHz sample rate, the other two have fixed filters.

image

johnnovak avatar May 13 '22 12:05 johnnovak

@johnnovak Actually I'm happy to see hard numbers. Most of what was done so far is "by ear" compared to real hardware. I only know how to do RC lowpass filters, I'm not certain how to do other types.

joncampbell123 avatar May 14 '22 03:05 joncampbell123

@johnnovak Actually I'm happy to see hard numbers. Most of what was done so far is "by ear" compared to real hardware. I only know how to do RC lowpass filters, I'm not certain how to do other types.

No problem, I'm not exactly a DSP guru either. But I've studied signal processing theory at uni and I've messed around with DSP stuff just enough to be dangerous 😄 My preferred approach is to start from actual measurements and try to reproduce the results correctly as much as possible (taking performance into account), then tweak by ear, if at all necessary.

In case of the SB measurements linked above, I don't find extra tweaking necessary at all, if we do the resampling correctly. The current resampler uses linear interpolation & fixed-precision, which effectively acts as a really bad lowpass filter (assuming 22k -> 44.1/48k conversion typical for the SB). That's like losing the game from the start: we can't apply the measured filter response on top of a shaky foundation like that and expect correct results!

Nevertheless, I've tried it, and the results are way off (overly filtered, as expected). But if I do the resampling correctly, or if I simply just render the audio from DOSBox at 22k sample rate without resampling, my results match the real recordings very well! The "metallic crunchiness" you mentioned is fully there; you really need correct resampling and correct filters to bring that out in its "full glory" 😁

So now I'm looking into efficient real-time resamplers that give acceptable results without too much latency (preferably zero or just a few samples). Well, shouldn't be too hard, because pretty much every other method is better than linear interpolation...

More details about the frequency response of linear interpolation:

https://dsp.stackexchange.com/questions/42757/effects-of-linear-interpolation-of-a-time-series-on-its-frequency-spectrum

https://www.dsprelated.com/freebooks/pasp/Linear_Interpolation_Resampling.html
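
As a back-of-the-envelope check, linear interpolation can be modelled as convolution with a triangular kernel, whose magnitude response is sinc²(f / source_rate). The sketch below evaluates that idealised, continuous-time approximation; it does not measure the actual DOSBox resampler, and the function name is hypothetical.

```python
import math


def linear_interp_gain_db(freq_hz, source_rate):
    """Approximate gain (dB) of linear interpolation at `freq_hz`.

    Uses the sinc^2 response of a triangular interpolation kernel, which is an
    idealised model and not a measurement of any particular resampler.
    """
    x = freq_hz / source_rate
    if x == 0.0:
        return 0.0
    sinc = math.sin(math.pi * x) / (math.pi * x)
    return 20.0 * math.log10(sinc * sinc)


# Attenuation across the audible band of a 22.05 kHz source.
for freq in (1000, 5000, 10000, 11025):
    print(f"{freq:>6} Hz: {linear_interp_gain_db(freq, 22050):6.2f} dB")
```

Under this model a 22.05 kHz source is already about 7.8 dB down at its own Nyquist frequency before any card-specific filter is applied, which is consistent with the "overly filtered" result described above.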

johnnovak avatar May 14 '22 04:05 johnnovak

Even better yet, an AWE32 or AWE64 that allows you to route the OPL output through the onboard EMU-8000 DSP (on certain revisions), which lets you add reverb and chorus to the OPL music!

Having customizable reverb and chorus levels might let us approximate the post-processing provided by the AdLib Gold's add-on surround module that owners could buy:

Surround Module

Credit: Bratgoul at English Wikipedia, CC BY-SA 3.0 Wikimedia Commons

Dune without the surround module:

https://user-images.githubusercontent.com/1557255/168489593-3b994e10-0629-4e1b-954c-6474684deea5.mp4

Dune with the module:

https://user-images.githubusercontent.com/1557255/168489594-56c02c48-7304-44b1-9a67-f178605439c1.mp4

Recordings credit: JimWest 2020-05-21, 13:53, Creative Commons Attribution 4.0 International


Related discussion from the DOSBox crew: https://www.vogons.org/viewtopic.php?f=31&t=31494

ripsaw8080 notes:

I think you're better off not trying to simulate the effects (like chorus and reverb), or go easy with them, because they're adjusted on the fly with the AdLib Gold, not just cranked up over everything; and the "surround sound" daughterboard was optional.

One option would be to add parsing of these effect levels in the OPL3's IO data/control ports, and adjust the effect filters on the fly (either to the entire stream or to each of the OPL3's 18 channels separately), if that's the case.

kcgen avatar May 15 '22 19:05 kcgen

One option would be to add parsing of these effect levels in the OPL3's IO data/control ports, and adjust the effect filters on the fly (either to the entire stream or to each of the OPL3's 18 channels separately), if that's the case.

Great idea @kcgen!

It definitely has to be done that way, like you said; the music seems to adjust the effects setting at least per song (maybe even during a song or per channel, if that's technically possible).

I've been looking into this earlier, and it turns out that the AdLib Gold emulation in PCem also emulates the YM7128 surround processor, which was the DSP used on the surround add-on module.

There's another library that aims to emulate the YM7128, and just by looking at the code it seems to be a better effort than what's in PCem (at least it's much better documented, and it provides different engines: integer, float, "ideal", etc.)

https://github.com/TexZK/YM7128B_emu

If we spend effort doing this, I think my preference would be to use the YM7128 library for accuracy. Those old DSP effect processors used some funky integer-only maths which resulted in lots of weird noise and precision errors that gave those units their characteristic sound (e.g. old Lexicon studio reverbs). I'm assuming it's about the same level of effort anyway, or maybe even less, because we won't need to try to match and tune the reverb and chorus settings.

Here's the full AdLib Gold Dune soundtrack for reference.

https://www.youtube.com/watch?v=gUfGyfbzl9k

johnnovak avatar May 16 '22 00:05 johnnovak

Nice finds!

I was reading vgmpf's page about the Adlib Gold that mentions the card hasn't been emulated; so it's great to see that Sarah added support for it in PCem :+1:

That YM7128B_emu library is top notch.

If we spend effort doing this, I think my preference would be to use the YM7128 library for accuracy.

Fully agree; moving out of scope into separate issue: :arrow_down:

kcgen avatar May 16 '22 01:05 kcgen