# Support Multiple SoundListeners at Once
## Summary
With the planned support for Camera-local viewports, split-screen setups would become easy to do - except for audio, which is constrained to only one listener. This should be addressed.
## Analysis
- The OpenAL backend, and potentially other audio backends as well, is limited to a single listener.
- In order to make multiple listeners work, the Duality audio system could re-map all active audio sources to the single internal backend listener.
- A trivial implementation would assign all sources to the nearest / loudest listener and express them relative to it.
- Avoid hard changes in audio parameters when switching listeners. Fade from one to the next.
- Avoid extensive listener switches. Apply a threshold that needs to be crossed for a switch to occur.
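The "trivial implementation" with the fade and threshold behavior from the last three bullets could be sketched roughly as follows. This is a hedged illustration, not Duality API: the function names, the threshold value, and the distance-based heuristic are all assumptions.

```python
import math

# Illustrative threshold (world units): a new listener must be this much
# closer than the current one before a source switches over, so the
# assignment doesn't flip rapidly when a source sits between two listeners.
SWITCH_THRESHOLD = 2.0

def pick_listener(source_pos, listener_positions, current_index):
    """Return the index of the listener this source should be mapped to."""
    best = min(range(len(listener_positions)),
               key=lambda i: math.dist(source_pos, listener_positions[i]))
    if current_index is None:
        return best
    # Only switch when the best candidate is closer by more than the threshold.
    gain = (math.dist(source_pos, listener_positions[current_index])
            - math.dist(source_pos, listener_positions[best]))
    return best if gain > SWITCH_THRESHOLD else current_index

def relative_position(source_pos, listener_pos):
    """Express the source relative to the single internal backend listener."""
    return tuple(s - l for s, l in zip(source_pos, listener_pos))
```

A fade between the old and new relative parameters would then be applied over a short time window whenever `pick_listener` returns a different index than before.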
Moving this to the General milestone, as it doesn't require breaking changes and is in fact a feature that could be added at any point.
Note that this is somewhat related to the use cases outlined in issue #504: if multiple scenes were active, audio and listeners would leak between scenes, since all audio exists in one shared context. Properly addressing this might require more than just multiple listeners; it may call for multiple "audio worlds" as well.
Very low priority for now, but keeping it in the backlog for long-term concepts.
The main limitation here seems to be OpenAL, which does not support multiple listeners. The proposed workaround, if I understand the part "re-map all active audio sources" correctly, is to virtualize multiple audio listeners by spawning the same source with faked parameters for each additional (virtual) audio listener. While this is a smart idea, the approach has its downsides: since each source is instanced once per virtual audio listener, the processing load is multiplied accordingly. In some cases this can become extremely high, especially if the #636 audio effects get implemented and spawned per source instance.
Therefore, the so-called "trivial implementation" seems to be the appropriate way to go. Here are some thoughts around that concept:
- One sound source has only one sound source instance (including effects) no matter how many audio listeners there are.
- Each listener has something like a contribution factor. This can be used for a lot of things. E.g., set the factors of all 4 listeners to the max for a 4-player split-screen game, or "fade" the contribution factors of two listeners inversely to create a transition between two different perspectives.
- Any realtime-parameter-based modulation like distance attenuation, Doppler, or related API calls (e.g. GetDistanceToListener applied to arbitrary audio source parameters) would utilize a value averaged across the available audio listeners. In the case of the 4-player split-screen game (contribution factor at max for every listener), it'd make sense to just use the closest listener and drop the averaging altogether.
- Positioning can also be an average over the available listeners within the source's max distance. The averaging should consider each listener's contribution factor as well as how close the listener is to the source.
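The contribution-and-proximity weighting from the last two points could look something like this. The `Listener` class, the linear falloff, and all names here are assumptions for illustration, not Duality API:

```python
import math

class Listener:
    """Hypothetical listener with a position and a contribution factor."""
    def __init__(self, pos, contribution=1.0):
        self.pos = pos
        self.contribution = contribution

def blend_weights(source_pos, listeners, max_dist):
    """Per-listener weight = contribution factor * linear proximity falloff,
    zero beyond max_dist, normalized so the weights sum to one."""
    weights = []
    for l in listeners:
        proximity = max(0.0, 1.0 - math.dist(source_pos, l.pos) / max_dist)
        weights.append(l.contribution * proximity)
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights

def blended_listener_pos(source_pos, listeners, max_dist):
    """The averaged listener position as 'heard' by this source."""
    w = blend_weights(source_pos, listeners, max_dist)
    return tuple(sum(wi * l.pos[k] for wi, l in zip(w, listeners))
                 for k in range(3))
```

The same normalized weights could drive the averaged distance attenuation or Doppler values mentioned above.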
> Avoid hard changes in audio parameters when switching listeners. Fade from one to the next.
That can be artistically limiting. Think of jump cuts and so on. Maybe the API call could be overloaded with an optional fade bool and fade time, calling a method that automatically manipulates the contribution factors from point 2 above.
> Avoid extensive listener switches. Apply a threshold that needs to be crossed for a switch to occur.
This can also be unnecessarily limiting. Continuing the idea from above, the fade process could be stoppable at any time, starting a new transition if a new switch occurs.
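An interruptible contribution-factor fade like the one described could be sketched roughly as below. `ListenerFade` and its methods are hypothetical names, not Duality API; a fade time of zero models the jump-cut case:

```python
class ListenerFade:
    """Fades per-listener contribution factors toward target values.
    Restarting a fade mid-flight continues from the current values
    instead of snapping, so transitions stay interruptible."""
    def __init__(self):
        self.factors = {}    # listener id -> current contribution factor
        self.targets = {}    # listener id -> target contribution factor
        self.fade_time = 0.0

    def start(self, targets, fade_time):
        """Begin (or retarget) a fade; fade_time == 0 means a hard jump cut."""
        for lid in targets:
            self.factors.setdefault(lid, 0.0)
        self.targets = dict(targets)
        self.fade_time = fade_time
        if fade_time <= 0.0:
            self.factors.update(targets)

    def update(self, dt):
        """Advance all factors toward their targets at a constant rate."""
        if self.fade_time <= 0.0:
            return
        rate = dt / self.fade_time  # fraction of the full range per update
        for lid, target in self.targets.items():
            cur = self.factors[lid]
            if cur < target:
                self.factors[lid] = min(target, cur + rate)
            else:
                self.factors[lid] = max(target, cur - rate)
```

Calling `start` again while a fade is in progress simply retargets it from wherever the factors currently are, which covers the "stoppable any time" behavior without a separate cancel step.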
Great writeup 👍
> Avoid hard changes in audio parameters when switching listeners. Fade from one to the next.
>
> That can be artistically limiting. Think of jump cuts and so on. Maybe the API call could be overloaded with an optional fade bool and fade time, calling a method that automatically manipulates the contribution factors from point 2 above.
> Avoid extensive listener switches. Apply a threshold that needs to be crossed for a switch to occur.
>
> This can also be unnecessarily limiting. Continuing the idea from above, the fade process could be stoppable at any time, starting a new transition if a new switch occurs.
Ah, those were both meant as implementation details of how the virtualized sources approach we're sketching out would behave when switching which listener is used for each source. So "avoid hard changes" does not refer to the user disabling one listener and enabling another, but to the case where two listeners are active and the audio system internally decides to use one or the other for a source in the proximity of both.
> Each listener has something like a contribution factor. This can be used for a lot of things. E.g., set the factors of all 4 listeners to the max for a 4-player split-screen game, or "fade" the contribution factors of two listeners inversely to create a transition between two different perspectives.
Good point, we actually have more use cases here than the one originally described, and a "multiple listeners" feature would be a really good starting point to generalize into a "(multi-)listener contribution" scenario like you described.