Use cases for DOM Overlays in VR

Open AdaRoseCannon opened this issue 3 years ago • 31 comments

It would be useful to collect some use cases for how DOM Overlays are being used in VR, to inform potential implementations.

This example by @mrdoob is a great use case: https://twitter.com/mrdoob/status/1385184290867187715

AdaRoseCannon avatar Apr 22 '21 11:04 AdaRoseCannon

So for me, a major part of this is the ability to just make UI using web technology instead of trying to make UI components from scratch. I can create a Vue app with Vuetify and have that serve as a control panel for desktop and VR users alike.

Another major feature is the ability to display and interact with media, such as screen sharing or synced video watching. This allows us to rely on existing web implementations instead of trying to manually program these systems for use in virtual worlds.

For example, here is a web app for street videos being used in a VR world. We can and do also put movies up on similar screens.

[screenshot: the street-video web app shown on a screen inside a VR world]

digisomni avatar Apr 27 '21 00:04 digisomni

Making UI elements with Vue is a good use case. As for the other ones:

In this case, videos are best done using WebXR Layers for the best performance and user experience.

Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer? Playback is set on the video element using:

videoEl.srcObject = remoteStream;

AdaRoseCannon avatar Apr 27 '21 09:04 AdaRoseCannon

Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer?

Yes, any video element can become the source for a video layer.
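
For concreteness, here is a minimal sketch of that wiring with the WebXR Layers API, assuming a layers-enabled immersive session (session), the app's existing projection layer (projectionLayer), a reference space (refSpace), and the video element from above; the placement values are illustrative:

// Feed the WebRTC stream into a regular <video> element.
videoEl.srcObject = remoteStream;
await videoEl.play();

// Create a quad media layer directly from the video element.
const mediaBinding = new XRMediaBinding(session);
const videoLayer = mediaBinding.createQuadLayer(videoEl, {
  space: refSpace,
  // Place the quad 2 m in front of the viewer, roughly at eye height.
  transform: new XRRigidTransform({ x: 0, y: 1.5, z: -2 }),
  layout: 'mono'
});

// Hand the layer to the compositor alongside the app's projection layer;
// later entries in the array are composited on top of earlier ones.
session.updateRenderState({ layers: [projectionLayer, videoLayer] });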

cabanier avatar Apr 27 '21 18:04 cabanier

Some broad use cases:

  • forms (login, messaging UI...)
  • captions (painting titles, nametags above players...)
  • site navigation tools and information (current page, links...)
  • input tools (checkboxes, virtual keyboard...)
  • media display (images, video, iframes...)
  • large text display (IDE, ebook, blog...)
  • file system (import a file, save as...)

felixmariotto avatar Apr 28 '21 07:04 felixmariotto

A big issue with today's definition of DOM Overlays is that they are drawn on top of the VR scene. Unless we change that, I suspect that they can only be used for non-interactive content.

cabanier avatar Apr 28 '21 15:04 cabanier

Unless we change that, I suspect that they can only be used for non-interactive content.

@cabanier why? Because the controllers would be drawn behind the UI? One solution might be for browsers to provide a stencil buffer we could write to, marking which parts of the top-drawn UI should be trimmed out to show what's behind it. Blind guess; I don't know if it's feasible.
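
Purely to illustrate the idea (no such UA-consumed mask exists in the spec today, so the compositor hand-off is entirely hypothetical), the app-side half could be ordinary WebGL stencil writes; gl, drawControllerModel() and drawScene() are assumed helpers, and the XRWebGLLayer is assumed to have been created with stencil: true:

// Hypothetical: mark controller pixels so a future compositor step could trim
// the top-drawn DOM UI wherever the mask is set.
gl.enable(gl.STENCIL_TEST);
gl.clearStencil(0);
gl.clear(gl.STENCIL_BUFFER_BIT);

// Write 1 wherever the controller model covers the view.
gl.stencilFunc(gl.ALWAYS, 1, 0xff);
gl.stencilOp(gl.KEEP, gl.KEEP, gl.REPLACE);
drawControllerModel();

// Draw the rest of the scene without touching the mask.
gl.stencilMask(0x00);
drawScene();
gl.stencilMask(0xff);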

felixmariotto avatar Apr 28 '21 15:04 felixmariotto

@cabanier - https://immersive-web.github.io/dom-overlays/#xrsessioninit currently says "The DOM content MUST be composited as if it were the topmost content layer. It MUST NOT be occluded by content from the XRWebGLLayer or by images from a passthrough camera for an AR device."

This requirement intentionally does NOT apply to UI elements drawn by the UA directly. For example, it would be OK for the UA to provide a visible pointer ray, target reticle on the DOM layer, and/or a hand/controller model while the user appears to be interacting with the DOM overlay. This is also the only way to support interactions with cross-origin content in an iframe since the application isn't allowed to get poses in that case, so the application couldn't draw a pointer ray that intersects interactive cross-origin content. (See Event handling for cross-origin content. Note that the pose restriction doesn't apply if the UA treats cross-origin content as noninteractive.)
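
For readers less familiar with the API, the XRSessionInit opt-in referenced above looks roughly like this (the element id is illustrative):

// Request a session with a DOM overlay rooted at an existing page element.
const session = await navigator.xr.requestSession('immersive-ar', {
  requiredFeatures: ['dom-overlay'],
  domOverlay: { root: document.getElementById('overlay') }
});

// The UA reports how the overlay is being presented: "screen" on handheld AR,
// or "floating" / "head-locked" on headsets.
console.log(session.domOverlayState.type);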

For use cases, in general browsers provide a lot of functionality for HTML pages that would be difficult to replicate in applications, for example:

  • Accessibility features including text-to-speech support such as screen readers. (This can be helpful for visually impaired users, including someone who can't use their prescription lenses in a headset.)
  • Automatic text translation. (The application may not directly support the user's native language.)
  • Displaying third-party content that can't be accessed by the application due to cross-origin restrictions. (Example: live chat from a streaming site.)

klausw avatar Apr 28 '21 18:04 klausw

In general, the floating-screen use case isn't covered in much detail in the current version of the specification. This is open for suggestions and/or additional features; for example, it could be useful to provide a per-frame status value indicating whether the UA and application should cooperate on drawing pointer rays, or a stencil mask if appropriate.

klausw avatar Apr 28 '21 18:04 klausw

This requirement intentionally does NOT apply to UI elements drawn by the UA directly. For example, it would be OK for the UA to provide a visible pointer ray, target reticle on the DOM layer, and/or a hand/controller model while the user appears to be interacting with the DOM overlay.

I agree, but we would need to change the spec to allow that. There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

cabanier avatar Apr 28 '21 19:04 cabanier

I agree, but we would need to change the spec to allow that.

Changing the spec is definitely an option as long as it's not inherently incompatible with current usage. For example, the spec could easily be augmented to say how such elements should be handled for the "floating" type, and to clarify that the UA isn't expected to draw such affordances for the "screen" type, where there's no ambiguity about the touch location.

There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element / surface. Input disambiguation is already quite complex with just one element. As we've discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily placed surfaces seems quite tricky.
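
As a sketch of the single-element input handling, the module's beforexrselect event lets a page keep taps on the overlay from also firing as XR select events (the element ids are illustrative):

const overlayRoot = document.getElementById('overlay'); // same element as domOverlay.root

overlayRoot.addEventListener('beforexrselect', (event) => {
  // Cancelling suppresses the XR selectstart/select/selectend events for this
  // interaction, so a tap on the overlay is handled purely as DOM input.
  event.preventDefault();
});

overlayRoot.querySelector('#mute-button')?.addEventListener('click', () => {
  // Ordinary DOM click handling; the 3D scene never sees this input.
});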

klausw avatar Apr 28 '21 20:04 klausw

A big issue with today's definition of DOM Overlays is that they are drawn on top of the VR scene. Unless we change that, I suspect that they can only be used for non-interactive content.

More than that, it then becomes impossible to make apps and other nifty things and toss them into the world, for example movie screens and radios.

One of the greatest things about being able to add websites into a scene is that you can take traditional (WebXR-unrelated) web content and make it relevant to the world, from livestreams and screen share to things like internet radio, e.g. http://radio.garden/

digisomni avatar Apr 28 '21 20:04 digisomni

There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element / surface. Input disambiguation is already quite complex with just one element. As we've discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily placed surfaces seems quite tricky.

No, DOM Layers will allow for fully interactive content as long as it's same-origin. Input and the drawing of the controllers are handled by the UA.

cabanier avatar Apr 28 '21 20:04 cabanier

In general, the floating-screen use case isn't covered in much detail in the current version of the specification. This is open for suggestions and/or additional features; for example, it could be useful to provide a per-frame status value indicating whether the UA and application should cooperate on drawing pointer rays, or a stencil mask if appropriate.

If it helps any: I will say the floating screen use case is extremely important. It allows us to make interesting interactive UI out of 2D web elements.

Something that has always been quite good and interesting is Microsoft's pioneering work with the Cliff House, which took HoloLens-style interactivity and brought it into WMR. I think more UIs should have the ability to snap and place windows, but also to follow you, letting the user pick and choose what they would like to make a UI element. Think of the power of being able to arrange your screen space on Windows or Linux, and then imagine that level of control in VR. Spatial web windows would let us build something akin to a window manager for VR.

digisomni avatar Apr 28 '21 20:04 digisomni

There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element / surface. Input disambiguation is already quite complex with just one element. As we've discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily placed surfaces seems quite tricky.

No, DOM Layers will allow for fully interactive content as long as it's same-origin. Input and the drawing of the controllers are handled by the UA.

Thanks for clarifying. Would it be more accurate to say that an implementation or API can basically choose two out of these three features for DOM content, but not all at once?

  • interactive
  • cross-origin
  • arbitrary layer placement and occlusion

klausw avatar Apr 28 '21 21:04 klausw

Thanks for clarifying. Would it be more accurate to say that an implementation or API can basically choose two out of these three features for DOM content, but not all at once?

  • interactive
  • cross-origin
  • arbitrary layer placement and occlusion

Yes :-)

cabanier avatar Apr 28 '21 21:04 cabanier

There's also the issue that people will want more than one surface, which is problematic because of DOM Overlay's use of the Fullscreen API.

DOM Overlay doesn't require using the Fullscreen API, but it is restricted to a single element / surface. Input disambiguation is already quite complex with just one element. As we've discussed on previous occasions, it seems feasible to add a DOM surface type to the Layers API for noninteractive elements, but fully interactive content on multiple arbitrarily placed surfaces seems quite tricky.

No, DOM Layers will allow for fully interactive content as long as it's same-origin. Input and the drawing of the controllers are handled by the UA.

Thanks for clarifying. Would it be more accurate to say that an implementation or API can basically choose two out of these three features for DOM content, but not all at once?

  • interactive
  • cross-origin
  • arbitrary layer placement and occlusion

To summarize: It allows us to make interactive in-world panels given they're hosted by the same domain, but also import content in a non-interactive way if it's not "secured" by being on the same domain?

So far, that looks like quite a reasonable starting point, then. :)

digisomni avatar Apr 28 '21 23:04 digisomni

To summarize: It allows us to make interactive in-world panels given they're hosted by the same domain, but also import content in a non-interactive way if it's not "secured" by being on the same domain?

What do you mean by 'it'? These are 2 different APIs...

cabanier avatar Apr 28 '21 23:04 cabanier

"It" being web content, the two different contexts being a. on the same domain or b. not on the same domain.

What I gather is that if content is on the same domain, it can then be interactive and spatial at the same time. If it's not on the same domain, then it can only be interactive and on the top-most layer; or it can be spatial but non-interactive.

digisomni avatar Apr 28 '21 23:04 digisomni

Correct.

  • cross-domain / top rendering only = DOM Overlay
  • same-origin / arbitrary placement = DOM Layers

cabanier avatar Apr 29 '21 00:04 cabanier

Awesome. Well, I will say that I am interested in seeing both implementations for WebXR (for VR headsets). Existing web tech can save developers a lot of time where apps and UI are involved.

digisomni avatar May 25 '21 17:05 digisomni

YouTube players are cross-origin iframes, and there seems to be quite a lot of demand for getting those into WebXR experiences.

If I understand correctly, that would be possible with DOM Overlay, but it does not allow arbitrary placement. Since the player's playback is controlled via postMessage(), non-interactive would be fine, but from what I've read it's impossible to place the YT player against a wall, for example. Even without occlusion, it would be nice to be able to place the player arbitrarily in 3D space.

(See also the issue on the YouTube issue tracker https://issuetracker.google.com/issues/200299143)
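
As a hedged sketch of that postMessage() control: the snippet below uses the message format the official IFrame Player API sends under the hood, and assumes the player iframe was embedded with enablejsapi=1 (the selector is illustrative; the officially supported route is the YT.Player wrapper):

const playerFrame = document.querySelector('iframe#yt-player');

function ytCommand(func, args = []) {
  playerFrame.contentWindow.postMessage(
    JSON.stringify({ event: 'command', func, args }),
    'https://www.youtube.com'
  );
}

// Works even while the iframe sits in a DOM overlay and is not interactive.
ytCommand('playVideo');
ytCommand('setVolume', [50]);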

Squareys avatar Sep 19 '21 06:09 Squareys

CSS2DRendering/CSS3DRendering (such as https://threejs.org/docs/#examples/en/renderers/CSS2DRenderer) could be an interesting use case in combination with DOM Overlays, perhaps leveraging 2D HTML content with ARIA and everything else that comes with HTML. I am already using this concept to annotate 3D models and make them more accessible to users in desktop and mobile WebXR applications. It would be great to use this in VR/XR as well without having to write "legacy/alternative" components for VR annotations and submenus.
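
As a rough sketch of that annotation pattern with three.js (label text and position are illustrative, and scene, camera, and annotatedMesh are assumed to exist):

import { CSS2DRenderer, CSS2DObject } from 'three/addons/renderers/CSS2DRenderer.js';

// A DOM-based label renderer drawn on top of the WebGL canvas.
const labelRenderer = new CSS2DRenderer();
labelRenderer.setSize(window.innerWidth, window.innerHeight);
labelRenderer.domElement.style.position = 'absolute';
labelRenderer.domElement.style.top = '0';
labelRenderer.domElement.style.pointerEvents = 'none';
document.body.appendChild(labelRenderer.domElement);

// An ordinary, accessible HTML element becomes the annotation.
const labelEl = document.createElement('div');
labelEl.textContent = 'Front axle';
labelEl.setAttribute('role', 'note');

const label = new CSS2DObject(labelEl);
label.position.set(0, 0.5, 0); // offset relative to the annotated mesh
annotatedMesh.add(label);

// Call after the regular WebGL render in the animation loop.
labelRenderer.render(scene, camera);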

We also have iframe systems in combination with CSS3D for YouTube players (as mentioned by @Squareys), which is a nice use case for iframes among other things, but we have to fall back to MP4 / hls.js / Video.js / DASH video-to-texture implementations, as the iframing is not compatible with the VR browsers.

camelgod avatar Jan 20 '22 14:01 camelgod

Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer?

Yes, any video element can become the source for a video layer.

When I use a video layer for a WebRTC video track, the video rendering stutters badly. It seems I'm not able to control the frame pacing.

whatisor avatar Apr 04 '22 20:04 whatisor

Screenshare is interesting. @cabanier, do you think a WebRTC video can be put through a WebXR layer?

Yes, any video element can become the source for a video layer.

When I use a video layer for a WebRTC video track, the video rendering stutters badly. It seems I'm not able to control the frame pacing.

Can you link to an example?

We recently made some fixes in this area.

cabanier avatar Apr 04 '22 20:04 cabanier

It is a very big project, so I cannot share it. It just renders a stereo WebRTC video stream on a WebXR video layer (quad) on an Oculus headset.

If I render the stream the classic way (GL texture), it is fine, but the quality is worse.
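
For comparison, the "classic way" referred to here is roughly the per-frame texture upload below, which is what a video layer avoids; gl, videoEl, and session are assumed to exist, and the actual quad drawing is elided:

const videoTexture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, videoTexture);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

function onXRFrame(time, frame) {
  session.requestAnimationFrame(onXRFrame);

  if (videoEl.readyState >= videoEl.HAVE_CURRENT_DATA) {
    // Re-upload the current video frame every XR frame.
    gl.bindTexture(gl.TEXTURE_2D, videoTexture);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, videoEl);
  }

  // ...draw a textured quad into the projection layer as usual...
}
session.requestAnimationFrame(onXRFrame);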

whatisor avatar Apr 04 '22 20:04 whatisor

This is an issue on the Oculus browser side. Can you reach out to me on the WebXR discord? There are a couple of things I'd like you to try.

cabanier avatar Apr 04 '22 20:04 cabanier

who are you there?

whatisor avatar Apr 04 '22 20:04 whatisor

who are you there?

Rik Cabanier (Meta)

cabanier avatar Apr 04 '22 20:04 cabanier

Hi, I just saw this thread, and I will add the use case I am developing. Transitioning from the 2D web to a 3D universe should be possible through the WebXR DOM Overlay (I'm betting the farm on it...). I'm doing my R&D developing https://umniverse.com (upper-right 3D button). DOM Overlay was a great idea; I hope the W3C and browser developers explore its full potential.

franciscoreis avatar May 14 '22 09:05 franciscoreis

Incredibly useful for business: remote VR meetings where participants can see and interact with web pages inside a virtual conference room. We are also using Appetize.io to project real, functioning mobile devices inside browsers (and, with some limited success, inside Spatial.io from within an Oculus headset) so that visitors do not need to download the actual app to their phone to experiment with it. We want the future of work to be possible inside VR, using WebXR, so that native headset apps do not have to be installed.

VoiceOfSoftware avatar Nov 11 '22 05:11 VoiceOfSoftware