WebXR Raw Camera Access API

Open bialpio opened this issue 3 years ago • 4 comments

Request for position on an emerging web specification

  • WebKittens who can provide input: @grorg as a WebKitten involved in Immersive Web CG/WG

Information about the spec

  • Spec Title: WebXR Raw Camera Access Module
  • Spec URL: https://immersive-web.github.io/raw-camera-access/
  • GitHub repository: https://github.com/immersive-web/raw-camera-access
  • Explainer (if not README.md in the repository): https://github.com/immersive-web/raw-camera-access/blob/main/explainer.md

Design reviews and vendor positions

  • TAG Design Review: https://github.com/w3ctag/design-reviews/issues/652
  • Mozilla standards-positions issue: https://github.com/mozilla/standards-positions/issues/667

Bugs tracking this feature

  • WebKit Bugzilla: https://bugs.webkit.org/show_bug.cgi?id=208988

Anything else we need to know

The WebXR Raw Camera Access Module extends the core WebXR Device API by allowing sites to request the "camera-access" feature when creating XR sessions. The feature gives sites access to camera pixels through an integration with WebGL, which exposes the pixels as WebGLTextures. Developers have been requesting this capability for a long time.
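
For readers unfamiliar with WebXR feature descriptors, here is a minimal TypeScript sketch of how a site might request the feature. The "camera-access" string comes from the text above; the helper names `buildCameraSessionInit` and `requestCameraSession`, and the `...Like` interfaces, are hypothetical stand-ins so the sketch compiles without browser typings. In a browser, `navigator.xr` would play the role of `XRSystemLike`.

```typescript
// Simplified stand-in for the XRSessionInit dictionary (assumption: only the
// two feature lists are modelled here).
interface XRSessionInitLike {
  requiredFeatures: string[];
  optionalFeatures: string[];
}

// Simplified stand-in for the parts of XRSystem (navigator.xr) used below.
interface XRSystemLike<Session> {
  isSessionSupported(mode: string): Promise<boolean>;
  requestSession(mode: string, init: XRSessionInitLike): Promise<Session>;
}

// Hypothetical helper: builds session options that require camera access.
function buildCameraSessionInit(optional: string[] = []): XRSessionInitLike {
  return { requiredFeatures: ["camera-access"], optionalFeatures: optional };
}

// Hypothetical helper: returns a session, or null when immersive AR is not
// supported. Session creation can still reject, for example if the user
// declines the permission prompt.
async function requestCameraSession<Session>(
  xr: XRSystemLike<Session>,
): Promise<Session | null> {
  if (!(await xr.isSessionSupported("immersive-ar"))) {
    return null;
  }
  return xr.requestSession("immersive-ar", buildCameraSessionInit());
}
```

Listing the feature under `requiredFeatures` means session creation fails outright when camera access is unavailable; a site that can degrade gracefully could list it under `optionalFeatures` instead.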

bialpio avatar Jul 15 '22 18:07 bialpio

Uninformed question: Why is this different than any other camera access API on the web?

litherum avatar Jul 15 '22 18:07 litherum

It allows the sites to obtain camera images that are synchronized with WebXR's XRPoses. If a site were to obtain camera images in some other manner, it would be unable to correlate them with spatial data that it can get by integrating with WebXR.
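
To make the synchronization point concrete, here is a rough TypeScript sketch of pairing each view's pose with its camera texture inside a single frame callback. It follows my reading of the draft module (`view.camera` and `XRWebGLBinding.getCameraImage()`), so check the names against the spec. The `...Like` interfaces and `collectPosedCameraImages` are hypothetical stand-ins, not part of any API.

```typescript
// Simplified stand-ins for XRCamera, XRView, XRViewerPose, XRFrame and
// XRWebGLBinding (assumption: only the members used below are modelled).
interface CameraLike { width: number; height: number; }
interface ViewLike {
  transform: { matrix: Float32Array };
  camera?: CameraLike | null;
}
interface ViewerPoseLike { views: ViewLike[]; }
interface FrameLike<Space> {
  getViewerPose(space: Space): ViewerPoseLike | null | undefined;
}
interface BindingLike<Texture> {
  getCameraImage(camera: CameraLike): Texture | null;
}

// A camera image together with the pose of the view it belongs to.
interface PosedCameraImage<Texture> {
  viewMatrix: Float32Array;
  width: number;
  height: number;
  texture: Texture;
}

// Hypothetical helper, meant to be called once per XR animation frame.
function collectPosedCameraImages<Space, Texture>(
  frame: FrameLike<Space>,
  refSpace: Space,
  binding: BindingLike<Texture>,
): PosedCameraImage<Texture>[] {
  const pose = frame.getViewerPose(refSpace);
  if (!pose) return []; // tracking lost: no pose for this frame

  const images: PosedCameraImage<Texture>[] = [];
  for (const view of pose.views) {
    const camera = view.camera;
    if (!camera) continue; // this view has no associated camera image
    const texture = binding.getCameraImage(camera);
    if (texture === null) continue;
    // Pose and texture come from the same frame, so they describe the same instant.
    images.push({
      viewMatrix: view.transform.matrix,
      width: camera.width,
      height: camera.height,
      texture,
    });
  }
  return images;
}
```

As I understand the draft, the returned texture is only meant to be used during the frame callback that produced it, so any computer vision work would need to read or copy it before the callback returns.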

bialpio avatar Jul 15 '22 18:07 bialpio

> It allows the sites to obtain camera images that are synchronized with WebXR's XRPoses. If a site were to obtain camera images in some other manner, it would be unable to correlate them with spatial data that it can get by integrating with WebXR.

Decently accurate poses can be deduced from arbitrary/in-the-wild monocular videos quite easily at this point. In a few months this may even be possible in real time, using open source code.


I do hope that we get this API soon. This point from the explainer sums up my needs:

> Run custom computer vision algorithms on the data obtained from the camera texture. It may for example enable applications to semantically annotate regions of the image, for example to provide features related to accessibility.

I am probably biased in my thinking due to my specific use case, but this seems essential for any AR experience that integrates AI capabilities.

The permission modal should obviously be very clear about what is being granted, and all the usual stuff to ensure ongoing consent/understanding (e.g. red dot recording indicator type stuff).

josephrocca avatar Jan 04 '24 18:01 josephrocca

After discussing this with colleagues, it's not clear to us why this needs to be a separate API from getUserMedia(). At the least, it seems that with some small changes getUserMedia() could work as well. In addition, some of the use cases this is meant for may be better served by new higher-level WebXR APIs.

AdaRoseCannon avatar Jan 16 '24 09:01 AdaRoseCannon

Given https://github.com/WebKit/standards-positions/issues/37#issuecomment-1893366444, it's my intention to mark this as "opposed" as WebKit's position. However, I'll give folks a week to provide more information or to object.

marcoscaceres avatar May 18 '24 04:05 marcoscaceres