
what is the default channelCount

Open fippo opened this issue 4 years ago • 44 comments

https://w3c.github.io/mediacapture-main/#def-constraint-channelCount doesn't say which is the default.

Chrome recently changed from 1 to 2 as we noticed in https://github.com/w3c/webrtc-extensions/issues/63# fiddle: https://jsfiddle.net/fippo/c0ax4tv1/1/ (note: might require a stereo-capable mic; macbook mics are not)
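
A minimal sketch of how a page might check which channel count it was actually given, assuming a track object that exposes `getSettings()` the way `MediaStreamTrack` does. The `TrackLike` type and the `reportedChannelCount` helper are hypothetical names used for illustration; this is not the code in the fiddle.

```typescript
// Minimal structural type so the sketch runs outside a browser.
// A real MediaStreamTrack satisfies it through getSettings().
interface TrackLike {
  getSettings(): { channelCount?: number };
}

// Returns the channel count the track reports, or undefined when the
// user agent does not expose the setting.
function reportedChannelCount(track: TrackLike): number | undefined {
  return track.getSettings().channelCount;
}

// In a browser, with no audio constraints, one could write:
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const count = reportedChannelCount(stream.getAudioTracks()[0]);
// The value may differ between browsers, versions and devices.
```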

fippo avatar Feb 24 '21 12:02 fippo

The recent change was noted by @jan-ivar:

To test Chrome, I used an in-content device picker to pick my BRIO: I get channelCount: 1 with {audio: true} in M88, but channelCount: 2 in M90. Did the default change recently?

henbos avatar Feb 25 '21 09:02 henbos

Per summary over at https://github.com/w3c/webrtc-extensions/issues/63#issuecomment-786118846, would it make sense to mandate default channel count or is this something that should be up to the browser?

@guidou what do you think about default channel count = 1 in Chrome?

henbos avatar Feb 26 '21 08:02 henbos

@guidou says the default likely changed here: https://chromium-review.googlesource.com/c/chromium/src/+/2593122

henbos avatar Feb 26 '21 08:02 henbos

I am in favour of making browsers as consistent as possible in the way tracks are initialised, past the point devices are selected. That includes channel count, width, height... It is probably painful for web developers and error prone since testing might only happen in one browser in practice.

In general, it seems good to document the default behavior, especially if it is consistent amongst browsers. If the defaults have to change in the future, it might be best to document the change and coordinate within implementations.

youennf avatar Feb 26 '21 09:02 youennf

I think we should, in as many cases as possible, specify default values for all constrainable properties as a criterion to break ties when multiple configurations have the same fitness. This would make all implementations work in a more predictable way, especially in the common cases where few or no constraints are passed to gUM.

guidou avatar Feb 26 '21 09:02 guidou

Agreed. In Safari, we are artificially adding ideal width/height/frame rate constraints (640/480/30) if no corresponding constraint is given.
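
The behaviour described above can be sketched as filling in `ideal` values only for properties the page left unconstrained. The 640/480/30 values come from the comment; the `RangeSketch` and `VideoConstraintSketch` types and the `withDefaultIdeals` helper are hypothetical and do not reflect Safari's actual implementation.

```typescript
type RangeSketch =
  | number
  | { ideal?: number; exact?: number; min?: number; max?: number };

interface VideoConstraintSketch {
  width?: RangeSketch;
  height?: RangeSketch;
  frameRate?: RangeSketch;
}

// Ideal values assumed from the comment above (640x480 at 30 fps).
const DEFAULT_IDEALS = { width: 640, height: 480, frameRate: 30 } as const;

// Adds an ideal constraint for each property the page did not mention,
// leaving anything the page did specify untouched.
function withDefaultIdeals(
  video: VideoConstraintSketch = {}
): VideoConstraintSketch {
  return {
    width: video.width ?? { ideal: DEFAULT_IDEALS.width },
    height: video.height ?? { ideal: DEFAULT_IDEALS.height },
    frameRate: video.frameRate ?? { ideal: DEFAULT_IDEALS.frameRate },
  };
}
```

A page that asks only for `{ width: { exact: 1280 } }` would keep that width and pick up the ideal height and frame rate.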

youennf avatar Feb 26 '21 10:02 youennf

Chromium does something very similar to that as well for those 3 properties.

guidou avatar Feb 26 '21 10:02 guidou

Could we do the same for channelCount without having to revert the culprit CL?

henbos avatar Feb 26 '21 10:02 henbos

This would make all implementations work in a more predictable way, especially in the common cases where few or no constraints are passed to gUM

I'm sympathetic to web compat concerns, which we experience as well. However, I'm not sure this would be better for users. Constraints were purposely designed to balance control between two stakeholders: the user and the application, perhaps best illustrated by their two extremes:

  1. The user configures the camera and settings they want in their OS or browser, perhaps with per-site exceptions.
  2. Each application configures what camera and settings to use

1 is an unconstrained track. 2 is a 100% constrained track. This is why constraints were purposely designed to not specify defaults, and why constraints are distinct from settings in the API in the first place.

User agents conspiring on a fixed set of default settings forever for all devices in the interest of web compat for apps, would wreck 1.

For example: If a user inserts a stereo microphone they just bought, why shouldn't they get stereo on every site out there? One of the advantages of the native WebRTC stack is that this can just work without requiring each app to support it.

Browsers may not be doing much with their defaults atm, but defaulting to 640x480x30 mono forever doesn't seem very forward looking. I don't think we should lock ourselves down there.

jan-ivar avatar Feb 27 '21 16:02 jan-ivar

  1. The user configures the camera and settings they want in their OS or browser, perhaps with per-site exceptions.

I know users can change default devices from OS UI. Is Firefox (or any browser) providing such camera configuration UI?

If that is important, we can try to find wording that would still allow per-site default exceptions.

If a user inserts a stereo microphone they just bought, why shouldn't they get stereo on every site out there?

We could define a default rule so that, post device selection, channelCount would be set to 2 if feasible. I do not see how this use case describes the benefit of the current approach, especially if one browser would use stereo if available and another would stick to mono.

One of the advantages of the native WebRTC stack is that this can just work without requiring each app to support it.

I am not sure what 'just work' means. Some native clients do not support stereo and may break if the default suddenly changes. Some web clients will not benefit from stereo and will get a suboptimal experience (due to bandwidth increase).

Browsers may not be doing much with their defaults atm, but defaulting to 640x480x30 mono forever doesn't seem very forward looking.

The idea is not to stick to 640x480x30 + mono forever; web specs do change every day. The idea is to document these defaults, if we can agree on these defaults. Then, to change these defaults progressively and consistently.

I do not know what default rules Firefox is using. If it does something similar (or is willing to do something similar) to Chrome and Safari, why not document it?

youennf avatar Feb 27 '21 18:02 youennf

I know users can change default devices from OS UI. Is Firefox (or any browser) providing such camera configuration UI?

Well, we have the camera and microphone picker in the Firefox permission prompt. 🙂 But even without that, all major browsers respect OS defaults, which are often user configurable. E.g. going to Audio MIDI Setup and choosing my Logitech BRIO makes it the default microphone on macOS and Firefox, and now {audio: true} gives you channelCount 2 instead of 1.

I also could have sworn an earlier version of Audio MIDI Setup let me change the 2 ch to 1 ch, but I might be misremembering there, or it did with a different device I no longer own. I believe other OSes allow this though.

This means there is no one default channelCount. That's what we should document. If your app assumes there is, then your app is broken. This is why we have constraints and not a simpler settings API: if an app requires mono to not break, constrain it to mono. The model is: if you care about it, constrain it. Otherwise you get what you get.
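
The "if you care about it, constrain it" model can be sketched as an app that needs mono saying so explicitly. `AudioConstraintSketch` and `monoAudioConstraints` are hypothetical names; the `exact` and `ideal` forms follow the constraint syntax of the spec.

```typescript
interface AudioConstraintSketch {
  channelCount?: number | { ideal?: number; exact?: number };
}

// Builds audio constraints for an app that cannot handle stereo.
// `required = true` uses `exact`, so the request is rejected if it cannot
// be satisfied; `false` uses `ideal`, which is only a preference.
function monoAudioConstraints(required: boolean): AudioConstraintSketch {
  return { channelCount: required ? { exact: 1 } : { ideal: 1 } };
}

// In a browser:
//   navigator.mediaDevices.getUserMedia({ audio: monoAudioConstraints(true) });
```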

jan-ivar avatar Feb 27 '21 20:02 jan-ivar

In theory, reading the spec should be sufficient to implement it and get interoperability with other implementations. I do not think the spec is there yet; implementors have to study what other browsers are doing to actually get to that point :(

As an example, the spec does not say whether, if echoCancellation is supported, echoCancellation should be on or off. I would guess browsers turn echo cancellation on by default, and many applications are relying on it.

I generally disagree with "If your app assumes there is, then your app is broken", good and reliable defaults are extremely important.

youennf avatar Feb 28 '21 17:02 youennf

E.g. going to Audio MIDI Setup and choosing my Logitech BRIO makes it the default microphone on macOS and Firefox, and now {audio: true} gives you channelCount 2 instead of 1.

Very interesting. Thanks for checking.

I think the user specifying default device makes sense because you don't want to default to that old camera or microphone that is collecting dust behind your computer, you want your shiny new device that is in your face.

However when it comes down to what "technical details" to open a particular device in, like which resolution or number of channels to use when recording, I'm not sure I see the value in letting the user/OS override what the browser does by default. The browser and/or application should know better than a normal user, especially when the application will in most use cases be streaming the content to a VC server.

This means there is no one default channelCount. That's what we should document. If your app assumes there is, then your app is broken.

If the concern is that deciding defaults now is not forward-looking, we can always revisit what the defaults should be later. At the end of the day, if an implementation changes their defaults from one version to the next that is a change in behavior, whether or not there's a spec change behind it.

Jan-Ivar, you've said in the past that "predictability trumps usefulness", but in this case it seems that unspecified defaults are neither useful nor predictable.

Or am I missing something, what is the usefulness in not knowing what you get?

henbos avatar Mar 01 '21 08:03 henbos

For what it's worth, Chromium's change in default channel count was accidental and there is a revert in progress. This will make implementing https://github.com/w3c/webrtc-extensions/issues/63 a smoother transition, whether or not we can agree on a standardized default channel count here.

henbos avatar Mar 01 '21 09:03 henbos

I don't see any contradiction in the spec having defaults and the user being able to override those defaults. The spec already says "User Agents are encouraged to default to using the user's primary or system default device for kind (when possible).", so it is already encouraging a default for the deviceId property.

To define defaults, I think we should look at what browsers are currently doing and, where there is coincidence or near-coincidence, adopt those defaults in the spec. Also, defaults don't always need to be hard-coded constants. For deviceId we're using a system-defined constant.

That said, even with defaults we will probably not achieve full compatibility across browsers since properties are correlated and different implementations have different capabilities. Still, if we can significantly improve it for the most common cases, I think that would be beneficial.

Experience shows that real-world applications rely on browser defaults and that is not going to change just because we emphasize in the spec that they shouldn't. This channelCount issue in Chrome is a good example. Chrome didn't really have a universal default as channelCount is correlated with other properties. If echoCancellation, noiseSuppression or autoGainControl was enabled, channelCount was always 1 because Chrome's audio processing implementation only supported mono output. When no processing was used, the channel count was the same as the hardware configuration. When Chrome upgraded its processing implementation to support more channels, the default for stereo microphones automatically changed to 2 in all cases and that automatically caused regressions for some applications relying on output always being mono.
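
The before-and-after behaviour described above can be modelled as two small functions. This is a sketch of the description only, not Chrome's code; `AudioRequest`, `usesProcessing`, `channelCountBefore` and `channelCountAfter` are hypothetical names.

```typescript
interface AudioRequest {
  hardwareChannels: number; // channels the microphone is configured for
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

function usesProcessing(r: AudioRequest): boolean {
  return r.echoCancellation || r.noiseSuppression || r.autoGainControl;
}

// Before: processing only produced mono, so any processing forced 1 channel.
function channelCountBefore(r: AudioRequest): number {
  return usesProcessing(r) ? 1 : r.hardwareChannels;
}

// After: processing supports more channels, so the hardware count is used.
function channelCountAfter(r: AudioRequest): number {
  return r.hardwareChannels;
}
```

For a stereo microphone with echo cancellation on, the first function yields 1 and the second yields 2, which is the change that applications assuming mono tripped over.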

guidou avatar Mar 01 '21 11:03 guidou

So, to summarize, I agree with @youennf that "good and reliable defaults are extremely important" because experience has shown many times that when defaults change applications break, even if the default change was 100% spec compliant.

guidou avatar Mar 01 '21 11:03 guidou

To define defaults, I think we should look at what browsers are currently doing and, where there is coincidence or near-coincidence, adopt those defaults in the spec.

I'll file an issue specifically for this. It seems channelCount differs between browsers, we can keep this issue to see whether we can reach consensus.

youennf avatar Mar 01 '21 12:03 youennf

I don't see any contradiction in the spec having defaults and the user being able to override those defaults. ... experience has shown many times that when defaults change applications break, even if the default change was 100% spec compliant.

@guidou There's a contradiction right there: Applications that brittle won't work for users who override those defaults.

jan-ivar avatar Mar 01 '21 16:03 jan-ivar

Why do users have to be able to override defaults?

henbos avatar Mar 01 '21 16:03 henbos

(I mean, other than "which device")

henbos avatar Mar 01 '21 16:03 henbos

Why do users have to be able to override defaults? (I mean, other than "which device")

@henbos There's another contradiction: defaults may be device dependent, like channelCount in Firefox.

What's the point of constraints if the defaults are known?

jan-ivar avatar Mar 01 '21 16:03 jan-ivar

The browser and/or application should know better than a normal user,

Not to pile on the contradictions, but: applications should know better than the user, yet somehow don't know to constrain the settings they rely on to not break miserably?

jan-ivar avatar Mar 01 '21 17:03 jan-ivar

@henbos There's another contradiction: defaults may be device dependent, like channelCount in Firefox.

Do you know what Firefox is doing here? Is Firefox deciding to use channelCount = 2 if device allows it, like {channelCount: 2}? Or is it that Firefox is using whatever the OS default is, like {channelCount: 'default'}? These two options can be specced as well as {channelCount:1}.

As I said, I do not think the spec is precise enough for implementors to be able to interop with existing implementations. I believe this is one criterion for going to REC that the spec is not yet meeting.

Two questions may help me understand your position, which is still fuzzy to me: Are you ok with the spec describing where browsers do use the same defaults? Are you ok with the idea to converge on the same defaults for browsers? Or at least to flag where implementations may differ?

youennf avatar Mar 01 '21 18:03 youennf

What's the point of constraints if the defaults are known?

The point of constraints is to 1) not have to expose every possible device configuration, and 2) not have to write your own algorithm similar to constraints processing. It is perfectly reasonable to question these decisions, but that is a separate discussion to whether or not it would be useful for constraints to be more predictable and testable.

There's another contradiction: defaults may be device dependent, like channelCount in Firefox.

Not a contradiction. You can always downsample to 1 or have a default that is device-dependent like defaulting to the maximum of the device's capability. The default does not have to be "exact" either, it could be "give me what is closest to the default when not specified". For example if you have a 360p camera the "default" could still be 480p and you could open in what is closest to 480p, which would be 360p. I assume the browsers don't crash if such a camera exists? Similarly if we think stereo is the future, we could have the "default" be 2 channels but have mono devices open with 1 channel because 1 is the closest value to 2 that is possible with that device.
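
The "closest to the default" idea can be sketched as picking, from the values a device supports, the one nearest a preferred value. `closestSupported` is a hypothetical helper, and plain absolute difference is used here as a simplification of the spec's fitness distance.

```typescript
// Returns the supported value nearest to `preferred`.
// Ties go to the earlier entry in `supported`.
function closestSupported(supported: number[], preferred: number): number {
  if (supported.length === 0) {
    throw new Error("device supports no values");
  }
  return supported.reduce((best, v) =>
    Math.abs(v - preferred) < Math.abs(best - preferred) ? v : best
  );
}
```

With a preferred height of 480, a camera offering only 240 and 360 opens at 360; with a preferred channel count of 2, a mono-only microphone opens with 1 channel.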

Not to pile on the contradictions, but: applications should know better than the user, yet somehow don't know to constrain the settings they rely on to not break miserably?

Not a contradiction. This has more to do with testability. If an app developer writes code and tries it out with a couple of devices on a couple of browsers and consistently gets the same result, they might think that the behavior is well-defined and have no idea that some user in the wild is able to change this in OS settings. They might not go through the spec and specify defaults for every possible constraint available, like channel count; they'll probably fix it on a case-by-case basis if problems crop up.

Embarrassingly enough, Chrome shipped its stereo=1 hack and so by the sound of it has probably been upsampling to stereo when talking to Firefox. Not what was wanted, but slipped through the cracks. This illustrates that everything is not sufficiently tested.

henbos avatar Mar 01 '21 19:03 henbos

Please, do not spec a set of default constraints or channel counts. Give us whatever the underlying system gives us for default channel count, frame size, frame rate, etc. The OS knows better than the user agent does.

Perhaps a stereo audio input device switches to a mono mode when the user agent assumes mono with no constraints given. The web application doesn't know that this is undesirable as it just wants the default behavior of the audio device. The web application isn't going to request stereo, because it may end up with a wasteful upmix of mono audio devices to stereo. Setting default constraints rather than letting the underlying system determine behavior prevents us from using a sensible system-level default, perhaps even one that the user configured.

As an example, the spec does not say whether, if echoCancellation is supported, echoCancellation should be on or off. I would guess browsers turn echo cancellation on by default, and many applications are relying on it.

They do, and the effects have been very bad for audio quality. I feel strongly that audio on the web would be better off if these options were not enabled by default. If there is any doubt, you can watch any news broadcast from COVID times with remote guests behind some WebRTC-based call. Even if they use an IFB/IEM/earphone to prevent feedback, audio quality is damaged because these DSP algorithms were enabled by default and few developers know to turn them off.
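
A sketch of constraints a page might pass to opt out of this processing, using the three property names that appear in this thread. `ProcessingConstraintSketch` and `unprocessedAudioConstraints` are hypothetical names, and whether a given browser honours each setting on a given device is not guaranteed.

```typescript
interface ProcessingConstraintSketch {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Requests audio with the browser's DSP stages switched off.
function unprocessedAudioConstraints(): ProcessingConstraintSketch {
  return {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
  };
}

// In a browser:
//   navigator.mediaDevices.getUserMedia({ audio: unprocessedAudioConstraints() });
```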

I believe the base specifications should not assume use cases. (For example, echoCancellation by default assumes some sort of bi-directional audio communication where feedback could occur.) Specifying default constraints makes assumptions about the use case, as well as the hardware capabilities, user preferences, and application intent. This isn't good, in my opinion.

Ideally, the web layer is as thin as reasonable, giving us a cross-platform API that interferes as little as possible. To that end, I think the default channelCount, and other stream constraints, should not be put into the spec. If an implementer needs to set default constraints for some reason, such as the base system not supporting a default stream format from the capture device, then they should also be free to implement as they see fit.

bradisbell avatar Mar 01 '21 22:03 bradisbell

Is Firefox deciding to use channelCount = 2 if device allows it, like {channelCount: 2}? Or is it that Firefox is using whatever the OS default is, like {channelCount: 'default'}?

@youennf I'm not aware of a cross-device "OS default". Seems per device on mac and Windows (didn't try linux). My BRIO appears only settable to 2 (max?) channels, although sampleRate can be changed:

[image]

I believe Firefox takes the max channels offered by the specific device and caps it at 2, due to bug 1393401.

As I said, I do not think the spec is precise enough for implementors to be able to interop with existing implementations. I believe this is one criterion for going to REC that the spec is not yet meeting.

The spec is precise, leaving user agents in charge of defining the underlying system's "platform defaults".

Platforms vary, and devices vary, so having browsers vary in same-platform + same-device situations, seems more like a healthy reminder for apps not to assume every system and device will be the same, than a bug. User agents are allowed their own interpretation of a platform and its devices (e.g. a privacy-focused browser may offer a limited and unrecognizable view). So I think I'm rejecting your definition of "interop" here being that every browser must represent the underlying system's resources the same way.

Constraints are precise and interoperable, whether applying specific settings or min/max ranges around the underlying system's defaults.

Are you ok with the spec describing where browsers do use the same defaults?

No. I share @bradisbell's concerns that we may have been short-sighted in our clamping of default platform capabilities to accommodate one particular sink (RTCPeerConnection), and that we shouldn't cement them in the spec.

Are you ok with the idea to converge on the same defaults for browsers? Or at least to flag where implementations may differ?

No, I don't see the interop problem you're solving.

jan-ivar avatar Mar 02 '21 03:03 jan-ivar

Why do users have to be able to override defaults? (I mean, other than "which device")

There's another contradiction: defaults may be device dependent, like channelCount in Firefox.

Not a contradiction. You can ... have a default that is device-dependent like defaulting to the maximum ... capability

@henbos Sure, that works up to 2 if we pick channelCount 2, but not 1. But what's your goal? E.g. why restrict user overrides if we're allowing channelCount to change through device switching anyway? Where's the invariant?

If an app developer writes code and tries it out with a couple of devices on a couple of browsers and consistently gets the same result, they might think that the behavior is well-defined and have no idea that some user in the wild is able to change this in OS settings.

...or they may not have tested enough devices or not devices sold next week. It's the same problem (if you default to 2).

But I'd rather hear about the user in the wild: How were they able to configure their microphone in a way that is unique enough to break this app? Did their customization have absolutely no impact on any site they visited? Or did it work on the ones that mattered to them (e.g. audio sites) and had no change on others (web conference ones)? I think that's the use case.

Chrome shipped its stereo=1 hack and so by the sound of it has probably been upsampling to stereo when talking to Firefox. Not what was wanted, but slipped through the cracks. This illustrates that everything is not sufficiently tested.

Wrong spec. My understanding of the Chrome bug is it affected all audio input sources to RTCPeerConnection, not just stereo microphones, hence clearly a bug, and not related to defaults. Tests can use constraints and don't depend on defaults AFAIK.

jan-ivar avatar Mar 02 '21 04:03 jan-ivar

The web application isn't going to request stereo, because it may end up with a wasteful upmix of mono audio devices to stereo.

I don't think we should ever upsample, regardless of whether we're talking about the defaults or constraints. It should be a preference, capped by what the device is capable of, ensuring you never get upsampling. We don't upsample resolution so I don't see why we would upsample channel count.

I believe the base specifications should not assume use cases. (For example, echoCancellation by default assumes some sort of bi-directional audio communication where feedback could occur.) Specifying default constraints makes assumptions about the use case, as well as the hardware capabilities, user preferences, and application intent.

Even if we don't specify defaults in the spec they still have to be specified in implementation code. You can't get around defaults. The question isn't "defaults or no defaults?" the question is "well-defined defaults or unspecified defaults?". In cases where there are meaningful and configurable OS defaults we could talk about whether or not those should override the browser defaults, but I'm not sure there are meaningful OS/user choices beyond which device to pick.

One reason we might be disagreeing is me not buying the premise that there are meaningful defaults, so if we're going to pick arbitrary ones, we might as well all agree on what those arbitrary defaults are for the sake of predictability. I proposed defaulting to 1, but another option is defaulting to "maximum channels that the device is capable of".

Give us whatever the underlying system gives us for default channel count, frame size, frame rate, etc. The OS knows better than the user agent does.

On one hand I hear that the OS provides meaningful defaults...

I'm not aware of a cross-device "OS default". Seems per device on mac and Windows (didn't try linux). My BRIO appears only settable to 2 (max?) channels, although sampleRate can be changed:

... and on the other hand I hear that configurable defaults is only a subset of the capabilities and that it varies by device and platform and maybe you can't configure it at all because you'll only get the maximum? Which one is it?

It seems like the strongest case for not having defaults is being able to configure what the defaults are, either by the user knowing best or by the OS knowing best, but from this discussion I really can't tell if the OS or user does.

Platforms vary, and devices vary, so having browsers vary in same-platform + same-device situations, seems more like a healthy reminder for apps not to assume every system and device will be the same, than a bug.

Devices varying is inherent to the problem we are trying to solve. OS or OS settings varying is only a problem if we don't have well-defined defaults.

Wrong spec.

It was just an example proving the point about testability.

henbos avatar Mar 02 '21 08:03 henbos

Today you could have the same machine, same device, same OS, same OS settings and the only thing that is different is which browser you are running - and you might get different results. This hurts testability and predictability. This is what the discussion should be about. If we don't care about that, then so be it, but let's decide based on what we want to solve rather than fear of upsampling or changing our minds later about what the defaults should be.

henbos avatar Mar 02 '21 08:03 henbos

Let's take the example of sampleRate. Web Audio says the following: If contextOptions.sampleRate is specified, set the sampleRate of this AudioContext to this value. Otherwise, use the sample rate of the default output device.
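
The Web Audio rule quoted above amounts to a simple fallback. A sketch, where `resolveSampleRate` and its parameters are hypothetical names rather than anything defined by either spec:

```typescript
// Uses the requested sample rate when one is given, otherwise the rate of
// the default output device.
function resolveSampleRate(
  requested: number | undefined,
  defaultDeviceRate: number
): number {
  return requested ?? defaultDeviceRate;
}
```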

The mediacapture spec does not say anything, while exposing a similar API to Web Audio (an optional sampleRate value). As it is, this spec is not reaching the minimal amount of precision that other specs provide and that implementers need.

youennf avatar Mar 02 '21 09:03 youennf