obs-studio
libobs: Prevent audio doubling
Description
If a source has output & monitor enabled and there is an audio output source active that has the same audio output device as the monitor, the audio would be doubled in the final output.
Motivation and Context
Suggestion from @Warchamp7, as he says this should be implemented before #1253 is merged.
How Has This Been Tested?
Created an audio source and set its monitoring to output and monitor. Then added an audio output source that used the same device as the monitor. Then I recorded a video to make sure there was no audio doubling.
Types of changes
- Bug fix (non-breaking change which fixes an issue)
Checklist:
- [x] My code has been run through clang-format.
- [x] I have read the contributing document.
- [x] My code is not on the master branch.
- [x] The code has been tested.
- [x] All commit messages are properly formatted and commits squashed where appropriate.
- [x] I have included updates to all appropriate documentation.
As far as I understand this change, it will effectively ignore the entire source if it taps into the current monitoring output source?
I have some UX concerns around that: If I add an audio output capture source to my scene setup, I would expect it to be captured. OBS just internally deciding not to capture it (because I have the same device set as my monitoring device) is confusing because there is nothing in the program that would suggest to me that the source is ignored.
It should either prevent me from adding the source in the first place or should be added in a disabled state ("Cannot add the capture source because the device is used for monitoring").
And the other way around: selecting a monitoring device that is currently used as an audio capture device should be blocked or marked with a warning.
In the end "audio doubling" is the effect of explicit user choices and not something we should magically fix.
@Warchamp7 WDYT?
PS: This will also have limited effect on macOS because we don't capture audio from audio devices anymore, but instead capture the entire system output or a specific application's output (it's device-agnostic), so this check (and the same check in the CoreAudio-based monitoring implementation) will never match the monitored source.
It should only be ignoring a source if monitoring is enabled, and the source selected for monitoring output is also being captured by OBS. To an end user, there should not be a situation in which this makes a meaningful difference, or where you would actively want the audio to be doubled in this way. I am unable to review the code to see if that's exactly what it's doing, but that's what the end result should be.
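To make the condition above concrete, here is a minimal sketch in Python (not libobs code, and not necessarily what this PR does). The `Source` type, its `captured_device` field, and `doubled_sources` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Monitoring(Enum):
    OFF = "off"
    MONITOR_ONLY = "monitor only"
    MONITOR_AND_OUTPUT = "monitor and output"


@dataclass
class Source:
    name: str
    monitoring: Monitoring = Monitoring.OFF
    # Device this source captures; None for anything that is not an
    # audio output capture source (hypothetical field).
    captured_device: Optional[str] = None


def doubled_sources(sources: List[Source], monitoring_device: Optional[str]) -> List[str]:
    """Names of sources that would reach the final output twice."""
    if monitoring_device is None:
        return []
    # Doubling needs the monitoring device itself to be captured by some source.
    device_is_captured = any(s.captured_device == monitoring_device for s in sources)
    if not device_is_captured:
        return []
    # Only sources sent to both the output and the monitor are affected.
    return [s.name for s in sources if s.monitoring is Monitoring.MONITOR_AND_OUTPUT]


scene = [
    Source("Media", Monitoring.MONITOR_AND_OUTPUT),
    Source("Mic"),
    Source("Desktop Audio", captured_device="Speakers"),
]
```

With this scene, monitoring to "Speakers" flags only "Media"; monitoring to a device that is not captured flags nothing, which is the "no meaningful difference to the end user" case.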
That's a choice we can make (monitoring trumps scene setup), but nothing in the app signals that. We should set a "muted" or "disabled" state on the source to communicate its state. Otherwise we have an internal state of source ("ignored") that is hidden from the user.
The true UX problem is that OBS users abuse the Monitor functionality for something it was never intended for.
When monitoring in OBS, there are 3 main use-cases.
- Permanently monitoring a source: OBS is playing something that goes directly to the output, so I can't hear it unless I monitor it.
- Does X sound good? (True monitoring in broadcast terms, used by experienced broadcast users.)
- What does monitor mean?
Usecase 1:
People who choose to monitor to an output device that is being captured by OBS are likely doing so in error: they can't hear the OBS output, so they don't realize it's double broadcasting.
If they have chosen monitor and output, then the likely intent is that they want to be able to hear the source and output it, even though it would already be captured through the monitor device's source, so muting the capture output would seem appropriate.
Usecase 2:
People who choose to monitor to an output device that is being captured by OBS likely have that output device captured as a separate track as a backup in case of audio problems, and I don't like the idea of 'magic' happening that could interfere with people's intended captures. On top of that, monitoring could sound different than the final output, depending on what stage of the audio mixing is being monitored (with hopeful PRs down the track to add mixing buses and master mixes). In this situation, they would likely want the capture to win, as they wouldn't want to compromise their recording for the sake of monitoring.
People intentionally monitoring something that is already being captured are likely trying to either isolate the source to find issues or sound check, or hear things before later mixing takes place.
This user would be mad if their multi-track recordings suddenly didn't have sound on 1 track because they were monitoring something, so it MUST be the monitor that is rejected.
Usecase 3:
People unintentionally monitoring a source to the same output are probably trying to learn what the word 'monitor' means in OBS. It would make sense to educate the user in this case.
TL;DR: there are good reasons to support either argument, so I propose that OBS simply doesn't fix this bug, but instead subtly educates the user if it's being done from the UI.
I'd propose that the UX that should happen instead (and it's a big change) is that when you add a new audio source and go to its properties, instead of being sent to advanced audio properties, where you are presented with a large, intimidating mixing table, you can select multiple output devices for that source.
These output devices would be a combination of virtual output devices (tracks/master mixes) and physical output devices (and in the future, submixes).
By default,
- Microphones and recording devices would output to Track 1.
- Line-in devices would default to Track 1 and the default Audio Output Device (if not captured).
- Output devices would default to Track 1.
- Browser sources, media sources, etc, would default to Track 1 and the default Audio Output Device (if not captured).
None of this would be considered monitoring, and if selecting a device that is being otherwise captured, a conflict warning would appear showing which devices are being double broadcast, or either of the audio paths could be picked as long as the user is notified.
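The proposed defaults and conflict warning could be sketched roughly as follows. This is only an illustration of the proposal in this comment, not existing OBS behaviour; the source kind strings and both function names are made up for the example.

```python
from typing import List, Set

TRACK_1 = "Track 1"
# Kinds that would also play on the default audio output device by default.
ALSO_TO_DEFAULT_DEVICE = {"line_in", "browser", "media"}


def default_routes(kind: str, default_device: str, captured_devices: Set[str]) -> List[str]:
    """Default outputs for a newly added source of the given kind."""
    routes = [TRACK_1]
    # "(if not captured)": skip the physical device when OBS already captures it.
    if kind in ALSO_TO_DEFAULT_DEVICE and default_device not in captured_devices:
        routes.append(default_device)
    return routes


def conflicts(routes: List[str], captured_devices: Set[str]) -> List[str]:
    """Routes the user picked that would be double broadcast."""
    return [r for r in routes if r in captured_devices]
```

For example, `default_routes("media", "Speakers", {"Speakers"})` yields only Track 1, while a user manually adding "Speakers" afterwards would show up in `conflicts(...)` and could be warned rather than silently overridden.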
In broadcast terms, OBS currently conflates physical output devices with submixes.
Doing this would reveal the intent behind what the user wants when adding various audio sources.
Monitoring would then finally be free to be what it is supposed to be: a broadcast-level monitoring function that allows you to soundcheck and isolate audio sources (not necessarily solo). The 'useless' monitor-only option could then be removed, as you could output the audio source to the monitor device if that's what you wish.
The intended goal is that we assume if a user adds a source that produces audio, they only want that audio in the output once.
Example: A user has their Desktop Audio device in the output and they add a media source. By default that media source only goes to the output. As a result, the user cannot hear it.
Now the user has their monitoring device set to their Desktop Audio device. If the user sets the media source to monitor and output, it will now be in the final output mix twice. Once from directly going to the output, and secondly from playing out of the monitoring device, which is then itself going to the output.
Most often the goal of a user in this scenario is to be able to hear a source themself and also output it to their stream or recording.
If they are trying to do true audio monitoring, it will be to another device that is most likely not included in the output mix.
There should never be a scenario where this causes a source's audio to actually disappear. It should only be preventing it being in the output twice.
The weird case we CAN end up with is when a user mutes a source that is also monitored to a device that is still outputting. In this case, the source would still be audible in the output. There was a PR somewhere that added an indicator icon for communicating this case.
Ultimately, the intent is for users to be able to say
- "I want this audio to go to stream" (Mute/Unmute toggle)
- "I want to be able to hear this audio myself" (Monitor toggle)
- without being able to footgun themselves into the audio being in the output twice.
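As a rough illustration of the media source example and the muted-but-monitored case above, the sketch below counts how many copies of a source end up in the output. It is a simplified model, not OBS code: it assumes muting does not stop monitoring (as described above), and the level figure assumes the two copies are identical and perfectly in phase, which real monitoring latency would break.

```python
import math


def copies_in_output(muted: bool, monitored: bool, monitor_device_captured: bool) -> int:
    """How many times a source lands in the final mix."""
    direct = 0 if muted else 1
    # The monitor path only reaches the output if the monitoring device is captured.
    via_monitor = 1 if (monitored and monitor_device_captured) else 0
    return direct + via_monitor


def level_change_db(copies: int) -> float:
    """Level change relative to one copy, for identical in-phase copies."""
    if copies < 1:
        raise ValueError("source is not in the output at all")
    return 20 * math.log10(copies)
```

Under those assumptions, `copies_in_output(False, True, True)` is 2, which is about 6 dB louder than intended, and `copies_in_output(True, True, True)` is 1, which is the "muted but still audible" case.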
In theory it should not be possible to add a source for the device selected as monitoring, nor should one be able to select a device that is being captured as a monitoring device.
If this is a bridge too far for some reason, then I agree that if the monitoring device is also added as a source to a scene, then all monitoring for all sources should be disabled to avoid the audio loop effect.
I can respect this; however, it needs to work reliably in multi-track recordings, e.g. still output to track 1 even if track 2 was fixed because of doubling.
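One way to picture that multi-track requirement is to evaluate the doubling per track rather than per source. The sketch below is only a model under that assumption; `copies_per_track` is a hypothetical helper, and which of the two paths gets dropped on a doubled track is left open, as it is in this thread.

```python
from typing import Dict, Set


def copies_per_track(source_tracks: Set[int], capture_tracks: Set[int], dedupe: bool) -> Dict[int, int]:
    """Copies of a monitored source on each track.

    source_tracks: tracks the source is assigned to directly.
    capture_tracks: tracks of the source capturing the monitoring device.
    """
    result = {}
    for track in sorted(source_tracks | capture_tracks):
        direct = 1 if track in source_tracks else 0
        via_capture = 1 if track in capture_tracks else 0
        if dedupe and direct and via_capture:
            # Drop one of the two paths, on this track only.
            via_capture = 0
        result[track] = direct + via_capture
    return result
```

With the source on tracks 1 and 2 and the captured monitoring device on track 2 only, the un-deduplicated result is `{1: 1, 2: 2}` and the deduplicated one is `{1: 1, 2: 1}`: track 2 is fixed and track 1 is untouched.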