
ResizeMode (crop-and-scale) is underspecified

Open henbos opened this issue 5 years ago • 19 comments

When downscaling a resolution, there are at least three ways you could do it if the new aspect ratio is not the same as the original one:

  • Stretch it.
  • Don't stretch it, achieve new resolution by "zooming in" to the middle of it.
  • Don't stretch it, achieve new resolution by making the picture smaller and adding black bars to fill out the part of the rectangle that doesn't have the picture.

The spec allows downscaling and/or cropping but does not say how this is achieved.

@guidou what does Chrome do?

henbos avatar May 02 '19 13:05 henbos

Chrome uses the largest centered rectangle with the same aspect ratio of the original frame that fits entirely inside of the target resolution. So, basically, first downscale the original (as little as possible) while maintaining the aspect ratio until the target width or height is matched, and then crop to match the other dimension if necessary.
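A minimal sketch of the scale-then-crop arithmetic described above, assuming the target is no larger than the source in either dimension. The names `Size`, `ScaleCropPlan` and `scaleThenCrop` are hypothetical and illustrate the description only; this is not Chrome's actual code.

```typescript
interface Size { width: number; height: number; }

interface ScaleCropPlan {
  scaled: Size;   // frame size after the aspect-preserving downscale
  cropX: number;  // pixels trimmed from the left edge (centered crop)
  cropY: number;  // pixels trimmed from the top edge (centered crop)
  output: Size;   // final frame size, equal to the target
}

function scaleThenCrop(source: Size, target: Size): ScaleCropPlan {
  // Taking the larger ratio downscales as little as possible while still
  // making one dimension match the target and the other cover it.
  const scale = Math.max(target.width / source.width,
                         target.height / source.height);
  const scaled: Size = {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
  // Crop the overflowing dimension symmetrically around the center.
  const cropX = Math.floor((scaled.width - target.width) / 2);
  const cropY = Math.floor((scaled.height - target.height) / 2);
  return { scaled, cropX, cropY, output: { ...target } };
}

// Example: a 1280x720 frame requested at 640x480.
const plan = scaleThenCrop({ width: 1280, height: 720 },
                           { width: 640, height: 480 });
```

In this example the height is matched first (720 to 480), and the width overflow is cropped equally from both sides.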

guidou avatar May 03 '19 10:05 guidou

Great, I think that was what @jan-ivar was hoping for, yes?

henbos avatar May 03 '19 10:05 henbos

WebKit doesn't trim, it uses letterbox scaling: the video's clean aperture rectangle is scaled to fit completely in the destination size. If the destination pixel aspect ratio is different, the remainder of the destination is filled with black.
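For comparison, a minimal sketch of the letterbox arithmetic described above. The names `Size`, `LetterboxPlan` and `letterbox` are hypothetical, and this only illustrates the description; it is not WebKit's implementation.

```typescript
interface Size { width: number; height: number; }

interface LetterboxPlan {
  scaled: Size;    // picture size after aspect-preserving scaling
  offsetX: number; // width of the black bar on the left (and right)
  offsetY: number; // height of the black bar on the top (and bottom)
}

function letterbox(source: Size, destination: Size): LetterboxPlan {
  // Taking the smaller ratio makes the whole picture fit inside the
  // destination; the leftover area is what gets filled with black.
  const scale = Math.min(destination.width / source.width,
                         destination.height / source.height);
  const scaled: Size = {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
  return {
    scaled,
    offsetX: Math.floor((destination.width - scaled.width) / 2),
    offsetY: Math.floor((destination.height - scaled.height) / 2),
  };
}

// Example: a 640x480 frame placed in a 640x360 destination.
const boxed = letterbox({ width: 640, height: 480 },
                        { width: 640, height: 360 });
```

Here the picture shrinks to 480x360 and is centered, leaving bars on the left and right.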

eric-carlson avatar May 03 '19 15:05 eric-carlson

@henbos Yes, that's what I expected, and complies with my "more modes" reading of the spec:

"The user agent MAY use cropping and downscaling to offer more resolution choices than this camera naturally produces. The reported sequence MUST list all the means the UA may employ to derive resolution choices for this camera."

In other words, the purpose is to simulate camera modes. Since I've never seen a camera naturally emit a track containing black bars, I would not expect that output.

I was even going to say that I thought black bars go against the "don't invent pixels" rule we have, but then I couldn't find such a rule. Turns out we only have it in WebRTC: "The media MUST NOT be upscaled to create fake data that did not occur in the input source".

An oversight? I kinda think that's implied. Do we want to add language to make this clearer?

jan-ivar avatar May 06 '19 17:05 jan-ivar

@eric-carlson Thanks for the info. The WebKit vs Chrome behavior here seems a bit of a web compat issue, which probably suggests that the spec could be clearer.

jan-ivar avatar May 06 '19 17:05 jan-ivar

In discussion in the WG, my memory says that we did not want to permit distortion, so stretching was right out. Crop-and-scale was intended to be "crop to desired aspect ratio, then scale to desired # of pixels". We also decided not to do black bars. For reference, the CSS object-fit property is described here: https://www.w3schools.com/css/css3_object-fit.asp - we decided not to do "fill" or "contain".

alvestrand avatar May 09 '19 13:05 alvestrand

crop-and-scale was supposed to do the same as CSS "object-fit: cover", except that we don't want to upscale ever.
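A rough sketch of that intent, "cover" semantics with upscaling forbidden: crop the source to the desired aspect ratio, then scale down only if the cropped region is larger than requested. The names `Size`, `CropScalePlan` and `cropThenScale` are hypothetical; the spec does not define this function.

```typescript
interface Size { width: number; height: number; }

interface CropScalePlan {
  crop: { x: number; y: number; width: number; height: number };
  output: Size;
}

function cropThenScale(source: Size, target: Size): CropScalePlan {
  const targetAspect = target.width / target.height;
  const sourceAspect = source.width / source.height;
  let cropWidth = source.width;
  let cropHeight = source.height;
  if (sourceAspect > targetAspect) {
    // Source is wider than requested: trim the sides.
    cropWidth = Math.round(source.height * targetAspect);
  } else {
    // Source is taller than requested: trim top and bottom.
    cropHeight = Math.round(source.width / targetAspect);
  }
  // Clamp the scale factor at 1 so pixels are never invented.
  const scale = Math.min(1, target.width / cropWidth);
  return {
    crop: {
      x: Math.floor((source.width - cropWidth) / 2),
      y: Math.floor((source.height - cropHeight) / 2),
      width: cropWidth,
      height: cropHeight,
    },
    output: {
      width: Math.round(cropWidth * scale),
      height: Math.round(cropHeight * scale),
    },
  };
}

// A 1280x960 source can be cropped and downscaled to 640x360.
const down = cropThenScale({ width: 1280, height: 960 }, { width: 640, height: 360 });
// A 640x480 source asked for 1280x720 is cropped to 16:9 but not upscaled.
const noUp = cropThenScale({ width: 640, height: 480 }, { width: 1280, height: 720 });
```

The second call shows the difference from plain CSS "cover": the output stays at 640x360 rather than growing to 1280x720.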

alvestrand avatar Sep 12 '19 14:09 alvestrand

I'll just make it say "The media MUST NOT be upscaled to create fake data that did not occur in the input source".

henbos avatar Sep 12 '19 14:09 henbos

Does this only affect MediaStreamTrack from getUserMedia()?

Chromium does provide the resizeMode constraint when the track is derived from HTMLVideoElement.captureStream(), though AFAICT no value set for it has any impact on the resulting video track.

Further, Chromium provides getSettings() for tracks derived from HTMLVideoElement.captureStream(), but it reports incorrect width and height when the underlying file is a WebM file with variable-resolution frames that was produced by MediaRecorder and is then played back in an HTML <video> element in Chromium: only the initial width and height are reported. For example, given a 2 second WebM video where the first second is 400x300 and the next second is 300x150, getSettings() reports a width and height of 400 and 300 for the duration of the video.

guest271314 avatar Sep 13 '19 23:09 guest271314

Sometimes it makes sense to have the same constraints for tracks sourced by other means than GUM, and sometimes it does not. It has not been clear in the past. Going forward: Each spec needs to explicitly say which constraints are applicable to its tracks. I.e. captureStream() has to explicitly say that it supports crop-and-scale if that is something it does.

Chromium assumed that constraints applied no matter the source as long as they "made sense", but different browsers may have made different assumptions.

I believe Chromium implements reading width and height by looking at frames and reporting what it has seen recently, but I don't know the details. The spec should be clear about what to do here; I would not be surprised if Chromium has a bug here.

henbos avatar Sep 16 '19 09:09 henbos

Reopen to update the sentence according to TPAC conclusion:

Issue 584 Safari has potential issues with downscale restriction in multiple apps accessing to camera. RESOLUTION: Remove upscale from the proposed sentence and merge

henbos avatar Sep 26 '19 14:09 henbos

To this day, if you ask for 640x360, which is a 16:9 aspect ratio, Chrome feels it's within the spec to take 640x480 and zoom/crop it to get the desired res (even when resizeMode is "none"). I think we are missing a "scale" resizeMode, or a way to say you would prefer downscaling the first encountered native resolution of the desired aspectRatio over cropping an unrequested aspect ratio. There is currently no way to express this. For instance: I would like it to downscale 1280x720 to 640x360 rather than crop 640x480 to 640x360. Having the host scale the image is a better use of resources than sending an unwanted resolution.

anthmFS avatar Jun 22 '20 22:06 anthmFS

If there are expected to be resizes should not the MediaStreamTrack have a resize event and handler?

If a MediaStreamTrack is captured the pixel dimensions can change multiple times during the first several seconds of the capture, whether constraints are set or not.

guest271314 avatar Jun 22 '20 23:06 guest271314

To this day, if you ask for 640x360, which is a 16:9 aspect ratio, Chrome feels it's within the spec to take 640x480 and zoom/crop it to get the desired res (even when resizeMode is "none").

@anthmFS This sounds like a bug in Chrome. Would you mind filing one?

The spec seems pretty clear: none = "This resolution is offered by the camera, its driver, or the OS."
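As a hedged illustration of how one might test this, the sketch below builds a request that requires resizeMode "none" while expressing 640x360 as an ideal size. `ResizeMode`, `VideoRequest` and `nativeOnlyRequest` are hypothetical names; only the constraint names (width, height, resizeMode) and the ideal/exact keywords come from the spec.

```typescript
type ResizeMode = "none" | "crop-and-scale";

// A plain-object shape mirroring the subset of constraints used here.
interface VideoRequest {
  video: {
    width: { ideal: number };
    height: { ideal: number };
    resizeMode: { exact: ResizeMode };
  };
}

function nativeOnlyRequest(width: number, height: number): VideoRequest {
  return {
    video: {
      width: { ideal: width },
      height: { ideal: height },
      // "exact" makes this a required constraint rather than a preference.
      resizeMode: { exact: "none" },
    },
  };
}

const request = nativeOnlyRequest(640, 360);
// In a browser, this object could be passed to
// navigator.mediaDevices.getUserMedia(request), and the resulting track's
// getSettings() compared against the camera's known native modes.
```

If the settings reported for such a track show a resolution the camera does not natively offer, that would be the case worth attaching to a bug report.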

jan-ivar avatar Jan 11 '21 15:01 jan-ivar

Reopen to update the sentence according to TPAC conclusion:

Issue 584 Safari has potential issues with downscale restriction in multiple apps accessing to camera. RESOLUTION: Remove upscale from the proposed sentence and merge

@youennf If this summary is correct, would adding the word "upscaling" to the following existing sentence suffice?

  • "The UA MAY disguise concurrent use of the camera, by cropping and/or downscaling to mimic native resolutions when "none" is used, but only when the camera is in use in another browsing context."

jan-ivar avatar Jan 11 '21 16:01 jan-ivar

@jan-ivar I did and they sent me here. https://bugs.chromium.org/p/chromium/issues/detail?id=1079052

All I want is a way to express a preference for a native aspect ratio and have it scaled, even if that means scaling down. Right now I have to ask for some strange res like 644x362 to trick it into doing this, which is horrible, bug-prone on some devices, and still doesn't always work. There are many use cases for 640x360 where you don't want to crop, because cropping is a zoom effect on the user and you end up with their head not fitting in the picture; scaling 1280x720 down at the client is better than sending a larger image for no reason.

The same applies to screen sharing. I would like to ask for 1920x1080 and have the client scale and letterbox it for me, but instead it crops it; there is not really a valid use case for cropping a screen share.

In the end, these params should allow the complete suite of flexibility for the programmer to make the camera do whatever they want.

anthmFS avatar Mar 25 '21 21:03 anthmFS

How about "scale-to-aspect" as a separate param as well?

The browser already downscales, preserving the aspect when it detects a network issue, so clearly, this is already implemented. There simply needs to be a way to do it on purpose so you have full control of the resolution you want to capture and send.

Never going backward should not be an overarching rule. The reality is that cameras often have very few native resolutions and browsers have plenty of power to resize images more than they have network resources for oversized images.

There should be at least one strategy to pick the best native resolution matching the requested native aspectRatio and scale it as desired up or down. Forcing an aspectRatio by zooming a mismatched aspectRatio causes nearly unusable circumstances because it zooms so much.
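One possible shape for such a strategy, as a sketch only: filter the camera's native modes by aspect ratio, then prefer the smallest one that still covers the request. The names `Size` and `pickNativeForAspect`, the tolerance value, and the fallback to the largest matching mode are all assumptions made for illustration; nothing like this exists in the spec.

```typescript
interface Size { width: number; height: number; }

function pickNativeForAspect(
  nativeModes: Size[],
  target: Size,
  tolerance: number = 0.01, // assumed slack when comparing aspect ratios
): Size | null {
  const targetAspect = target.width / target.height;
  const matching = nativeModes.filter(
    (m) => Math.abs(m.width / m.height - targetAspect) <= tolerance,
  );
  if (matching.length === 0) return null; // no native mode has this aspect
  // Sort a copy by pixel count, smallest first.
  const byArea = [...matching].sort(
    (a, b) => a.width * a.height - b.width * b.height,
  );
  // Prefer the smallest mode that covers the target, so scaling goes down.
  const covering = byArea.find(
    (m) => m.width >= target.width && m.height >= target.height,
  );
  // Otherwise fall back to the largest matching mode (would need upscaling).
  return covering !== undefined ? covering : byArea[byArea.length - 1];
}

const modes: Size[] = [
  { width: 640, height: 480 },
  { width: 1280, height: 720 },
  { width: 1920, height: 1080 },
];
const chosen = pickNativeForAspect(modes, { width: 640, height: 360 });
const none = pickNativeForAspect([{ width: 640, height: 480 }], { width: 640, height: 360 });
```

With these example modes, a 640x360 request would be served by downscaling 1280x720 instead of cropping 640x480.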

Regarding screen share. The black bars rule needs to have an exception for screen share since the window could have any resolution and just zooming the screen share makes no sense. If you apply no constraints it does letterbox of the entire window at its native resolution but as soon as you apply constraints it does the same undesired zooming. If I am sharing my window on a 4k monitor and I know it's going to be sent to people on a 1080p monitor it makes more sense to use constraints to lower the size of the image before I send it over the internet.

anthmFS avatar May 24 '21 15:05 anthmFS

there is not really a valid use case for cropping a screen share.

@anthmFS Are you sure it's cropping and not an artifact of container layout? Screen capture is a different spec, which says "The user agent MUST NOT crop the captured output."

jan-ivar avatar May 24 '21 22:05 jan-ivar

Didn't see this question for a whole year, sorry.

If Chrome gets a video params update applied with a specified width and height, it will crop and scale to that res, cutting off part of the image depending on the aspect ratio of the window. So it seems like that's breaking the rule.

If you don't send media params you continue to get the native resolution.

The request is for a way to scale to given constraints: if you supply a width and height, make the output small enough to fit inside that rectangle, but do not change the proportions or crop it at all.
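The requested behavior can be sketched as a pure size calculation: fit inside the given rectangle, keep proportions, never crop, never pad. The names `Size` and `fitInside` are hypothetical, and clamping the scale at 1 (no upscaling) is an assumption.

```typescript
interface Size { width: number; height: number; }

function fitInside(source: Size, bounds: Size): Size {
  // The smaller ratio guarantees both dimensions fit; the clamp at 1
  // (an assumption here) leaves already-small sources untouched.
  const scale = Math.min(
    1,
    bounds.width / source.width,
    bounds.height / source.height,
  );
  return {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
}

// A 4K 16:9 screen fits the 1920x1080 bounds exactly.
const fromUhd = fitInside({ width: 3840, height: 2160 }, { width: 1920, height: 1080 });
// A 16:10 window keeps its shape and ends up narrower than the bounds.
const fromWide = fitInside({ width: 2560, height: 1600 }, { width: 1920, height: 1080 });
```

Unlike letterboxing, the output frame itself is smaller than the bounds when the aspect ratios differ, so no black bars are sent.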

Chrome used to do just this, which seemed more right: I could say width: 1280, height: 720 and it would send the screen share at a size that fit within those params without altering anything other than the scale.

But I guess this is tangential and slightly off-topic.

anthmFS avatar Jul 12 '22 23:07 anthmFS

As part of triage, I've added a PR to address the TPAC resolution in https://github.com/w3c/mediacapture-main/issues/584#issuecomment-535544399.

jan-ivar avatar Sep 20 '23 23:09 jan-ivar

@anthmFS this issue will close with #971, so please open a new issue to keep discussing the 640x360 issue.

But FWIW, the spec says about crop-and-scale: "This resolution is downscaled and/or cropped from a higher camera resolution by the User Agent, ..." — which IMHO already allows any user agent to derive 640x360 from native resolutions higher than 640x480 to produce more desirable results.

That's my reading of: "For every possible settings dictionary of [unconstrained copy of track] compute its fitness distance", which I think says fitness distance operates on the full set of settings the UA (not the camera) supports. cc @guidou
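To make that reading concrete, here is a small sketch of fitness distance over a candidate set that includes a UA-derived 640x360 entry alongside native modes. The per-constraint formula for ideal numeric values, |actual - ideal| / max(|actual|, |ideal|), follows the spec; the candidate list and the names `Settings`, `numericDistance`, `fitnessDistance` and `pickBest` are hypothetical.

```typescript
interface Settings {
  width: number;
  height: number;
  resizeMode: "none" | "crop-and-scale";
}

// Distance for one ideal numeric constraint.
function numericDistance(actual: number, ideal: number): number {
  if (actual === ideal) return 0;
  return Math.abs(actual - ideal) / Math.max(Math.abs(actual), Math.abs(ideal));
}

// Sum of per-constraint distances for ideal width and height.
function fitnessDistance(s: Settings, idealWidth: number, idealHeight: number): number {
  return numericDistance(s.width, idealWidth) + numericDistance(s.height, idealHeight);
}

// Return the candidate with the smallest fitness distance.
function pickBest(candidates: Settings[], idealWidth: number, idealHeight: number): Settings {
  return candidates.reduce((best, c) =>
    fitnessDistance(c, idealWidth, idealHeight) <
    fitnessDistance(best, idealWidth, idealHeight) ? c : best);
}

// Assumed candidate set: two native modes plus one UA-derived mode.
const candidates: Settings[] = [
  { width: 640, height: 480, resizeMode: "none" },
  { width: 1280, height: 720, resizeMode: "none" },
  { width: 640, height: 360, resizeMode: "crop-and-scale" },
];
const best = pickBest(candidates, 640, 360);
```

The distance only ranks the resulting settings. It does not say which native mode the 640x360 entry is derived from, which is why a user agent appears free to downscale 1280x720 rather than crop 640x480.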

jan-ivar avatar Sep 20 '23 23:09 jan-ivar

@jan-ivar fully agree with you.

@anthmFS AFAICT, Chrome does not return non-native resolutions if resizeMode "none" is used. If you find such a case, please file a bug at crbug.com.

guidou avatar Sep 21 '23 13:09 guidou