
Should requestSession resolving promise count as user activation?

Open klausw opened this issue 4 years ago • 13 comments

Apologies in advance in case this got discussed previously, I didn't find specific answers in searching the archive.

Currently, entering an immersive session, or an inline session with non-default features, requires user activation according to 13.5.1 Immersiveness, and a typical way to do this is to start the session from an "Enter XR" button or equivalent in an event handler that's treated as user activated.

Applications may also have other things they want to do that need user activation, for example starting media playback. In many cases they'd probably want this to happen when the session actually starts, but as far as I can tell it's not safe to assume that they can do so in the requestSession callback, since the spec doesn't specifically say that this should count as user activated.

If the app starts playback from the "Enter XR" button, the sound may be playing while the user is still looking at consent prompts or going through a headset startup sequence. The app can do a start/stop sequence and then resume playback again in the requestSession callback, but that's a bit clunky, and also specific to media playback.

The WhatWG user interaction spec has the option "The task in which the algorithm is running was queued by an algorithm that was triggered by user activation, and the chain of such algorithms started within a user-agent defined timeframe" which seems to roughly match this scenario, with a timeout left to the user agent's discretion.

Currently, it appears that Chrome uses a 5-second timeout for click activation, so the requestSession callback will count as user activated if the user quickly answers a consent prompt. That seems error-prone, and I think it would be preferable if this would either succeed or fail consistently.
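To make the inconsistency concrete, here is a toy model of time-based transient activation. It is only a sketch: the `TransientActivation` class and its method names are made up for illustration, and the 5000 ms default simply mirrors the Chrome timeout mentioned above. It is not how any browser actually implements activation.

```javascript
// Toy model of a time-based activation window (illustrative only).
// The class name and methods are hypothetical, not a browser API.
class TransientActivation {
  constructor(windowMs = 5000) {
    this.windowMs = windowMs;   // how long an activation stays valid
    this.activatedAt = null;    // timestamp of the last user gesture
  }

  // Record a user gesture, e.g. the click on an "Enter XR" button.
  notify(nowMs) {
    this.activatedAt = nowMs;
  }

  // True only while the last gesture is still inside the window.
  isActive(nowMs) {
    return this.activatedAt !== null &&
           nowMs - this.activatedAt < this.windowMs;
  }
}

const activation = new TransientActivation();
activation.notify(0); // user clicks "Enter XR" at t = 0 ms

// Same page, same code path; only the time spent on the consent prompt differs.
const fastConsent = activation.isActive(2000); // prompt answered after 2 s
const slowConsent = activation.isActive(8000); // prompt answered after 8 s
```

Under this model, whether the requestSession callback counts as user activated depends entirely on how long the user spends on the prompt, which is the error-prone behavior described above.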

Is the intent of the spec that user agents should consistently resolve the requestSession promise in a "triggered by user activation" context, and if so, should this be stated more explicitly?

klausw avatar Jul 29 '19 17:07 klausw

Related to #316

toji avatar Jul 31 '19 16:07 toji

Broadly the editors agree that we probably want a way to carry through the user activation that allowed an immersive session to be created. Using the requestSession() promise resolution as that mechanic is slightly problematic in that a .then() can be attached to the promise at any time, including long after the initial resolve. We can't allow the returned promise to become a user activation token dispenser, so we'd have to do something else to manage it. Off the top of my head I can think of two routes:

  1. Come up with some sensible and predictable way of expressing that it only counts as a user activation if your .then() was attached prior to the initial resolve.
  2. Add something like a sessionstart event to XR, which feels like a duplication of effort when the promise handles 99% of the same needs. At least it would pair nicely with the end event.

😒

Not really keen on either of those routes, but it's a topic that deserves wider discussion.
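For readers less familiar with the "token dispenser" concern: it follows from ordinary promise behavior, nothing WebXR-specific. The sketch below uses a hypothetical helper, `attachLate`, and a plain resolved promise standing in for the one returned by requestSession(), to show that a handler attached long after resolution still runs.

```javascript
// Generic promise behavior: a .then() handler attached after the promise
// has already resolved still runs. `attachLate` is a hypothetical helper
// used only for this illustration.
function attachLate(promise, delayMs) {
  return new Promise((resolve) => {
    setTimeout(() => {
      // By now the original promise resolved long ago, yet this handler
      // is still invoked with the resolved value.
      promise.then((value) => resolve(value));
    }, delayMs);
  });
}

// Stand-in for an already resolved requestSession() promise.
const sessionPromise = Promise.resolve("session");
const lateResult = attachLate(sessionPromise, 50);
```

If every such handler ran with user activation, a page could keep the promise around and attach a new .then() whenever it wanted a fresh activation.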

toji avatar Aug 01 '19 16:08 toji

Paging @rafaelcintron as this is related to a topic we once discussed that you had good insight on.

NellWaliczek avatar Aug 01 '19 16:08 NellWaliczek

Using a sessiongranted event would tie in with https://github.com/immersive-web/navigation#api-proposal, which is the proposed solution for intra-immersive navigation (or other contexts where user activation is provided by something other than the web page itself).

asajeffrey avatar Aug 01 '19 16:08 asajeffrey

Edit: sorry, the strict scoping described below doesn't seem to work after all, looks like I accidentally tested within the 5-second window of the "enter XR" button press when I saw it working. A pre-existing chained promise runs after the promise->Resolve(session)'s scope exits. So this would require at least a short timeout in the existing Chromium implementation, but I may be missing something.

Original text following.

This seems to be reasonably straightforward on the implementation side, but I'm not sure what the right way would be to phrase this for the spec. The current requestSession algorithm queues a task that ends with "resolve promise". Would it be consistent to say that this task is treated as "The task in which the algorithm is running was queued by an algorithm that was triggered by user activation" as per the activation spec, and that pre-existing chained promises count as the same task, but ones added later would be a different task?

As far as implementation is concerned, Chromium's onRequestSessionReturned implementation calls query->Resolve(session) for the success case, and I've tested that providing a scoped user activation gesture there (for immersive sessions) seems to work as expected. Code in a pre-existing then runs with activation active, but a then added later doesn't inherit this status.

FWIW, the Chromium implementation of user activations by default seems to be a bit more lenient: activations are a LocalFrame property set via LocalFrame::NotifyUserActivation, i.e. from onClick, and this transient activation state by default stays valid for 5 seconds and isn't strictly limited to a specific scope. So within this time limit, the app could chain a new then or run code from a setTimeout and it would still count as user activation. I've tested disabling this timeout by explicitly clearing the activation state after running query->Resolve(session), and then I see the strict behavior that only a pre-existing then counts as user activation.

klausw avatar Aug 01 '19 17:08 klausw

Sorry, the strict scoping described in my previous comment doesn't seem to work after all, looks like I accidentally tested within the 5-second window of the "enter XR" button press when I saw it working, so it was reusing that user activation. A pre-existing chained promise runs after the promise->Resolve(session)'s scope exits. So this would require at least a short timeout in the existing Chromium implementation, but I may be missing something.

klausw avatar Aug 01 '19 18:08 klausw

FWIW, I like the idea of sending the event. As @asajeffrey says, this event will be needed for any of the proposals that allow an immersive session to be triggered externally. Beyond navigation, that could include

  • creating WebXR-enabled PWAs (that the user has pre-approved for immediately entering immersive mode),
  • UAs that might want to present a UI to users for entering immersive mode in parallel to whatever API the page offers, or
  • UAs that might offer an "open page in immersive mode" menu entry when "right clicking" on links in a 2D page view of an immersive browser

While these are unrelated to this issue, figuring out how to make that event work would make these use cases possible.

blairmacintyre avatar Aug 01 '19 18:08 blairmacintyre

I thought we had worked out the user activation situation as part of the fallout from #424. Can we solve media playback and others in the same manner?

RafaelCintron avatar Aug 06 '19 01:08 RafaelCintron

@RafaelCintron wrote:

I thought we had worked out the user activation situation as part of the fallout from #424. Can we solve media playback and others in the same manner?

I don't think that's quite the same thing. #424 talks about permissions/consent needed for starting the session, including sensors used for the session itself. This issue is about apps trying to do things that are not part of core session functionality, for example starting audio playback, where doing so requires user activation but the application would like to do the action at the time the session actually starts, and the current API doesn't provide a "has user activation" context when the session is actually starting.

In other words, the user activation used to start the session, i.e. pressing an "enter XR" button, happens at an earlier time, and there may be a gap of several seconds until the session actually starts, for example while the user reads consent prompts or similar. By the time the session starts, this user activation is likely to have timed out (if the UA uses time-based activation attached to the browsing context), or isn't scoped correctly (if the UA ties the activation more strictly to a JS execution scope).

klausw avatar Aug 06 '19 17:08 klausw

/facetoface

(Not sure if there was consensus on how to proceed - seems like people agree this would be useful, but the mechanism is unclear, i.e. should there be a new event?)

klausw avatar Sep 15 '19 01:09 klausw

Currently sitting in a session by Mustaq Ahmed on their plans for a User Activation v2 spec, I put forth this use case and it seems like we'll have an easy way to spec this if this spec happens. At the moment this seems to be somewhat in flux across browsers, though, so doing this early might be a bad idea. That said, I'm not sure what the timeline for this is, perhaps if v2 is going to take a lot of time then we might want to have a chat with them and design stopgap spec text that works until v2 happens.

https://mustaqahmed.github.io/user-activation-v2/

Personally I feel like we should defer this to post-1.0

Manishearth avatar Sep 18 '19 05:09 Manishearth

Recapping a bit more of the conversation from TPAC:

As Klaus mentioned, the primary use case where this matters, and the one we've heard the most requests for, is automatically starting media playback when the session is started. It turns out that there is a fairly straightforward workaround for doing so today that appears to be compatible with every browser. For both audio and video, if the playback is started within the original user activation event and then paused, it can be resumed by the page at any time afterwards. This allows the following pattern to provide autoplay on session start (using video playback as an example):

  1. Listen to a user-activation event, such as click
  2. In the event handler, first start video playback and attach a .then() to the returned promise that pauses the video.
  3. Request an XR session.
  4. When the session resolves, start video playback again.

This behavior is now demonstrated in the Stereo Video Sample when the "Autoplay" checkbox is checked. If any vendors think this solution will not function in their browser, please reply to let us know!
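The four steps above can be sketched roughly as follows. This is not the code from the Stereo Video Sample; the function name `enterXRWithAutoplay` and the element ids in the usage comment are made up for illustration. `video` is assumed to be an HTMLMediaElement (or anything with `play()` returning a promise and `pause()`), and `xr` is assumed to be `navigator.xr`; both are passed in so the function can be exercised outside a browser.

```javascript
// Sketch of the play / pause / resume workaround (names are illustrative).
// Must be called synchronously from a user-activated handler such as click.
async function enterXRWithAutoplay(video, xr) {
  // Step 2: start playback while the click activation is valid, and pause
  // as soon as the play() promise resolves.
  const primed = video.play().then(() => video.pause());

  // Step 3: request the session in the same handler, before any await.
  const sessionPromise = xr.requestSession("immersive-vr");

  // Wait until the session exists and the video has been primed and paused.
  const [session] = await Promise.all([sessionPromise, primed]);

  // Step 4: resume playback now that the session has actually started.
  await video.play();
  return session;
}

// Browser usage (hypothetical element ids):
// document.querySelector("#enter-xr").addEventListener("click", () => {
//   enterXRWithAutoplay(document.querySelector("#video"), navigator.xr);
// });
```

Note that both `video.play()` and `xr.requestSession()` run before the first `await`, so they still execute inside the original click handler.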

Worth noting that this doesn't solve every possible use-case for user activation on session start, but it's not clear what other needs developers have for that kind of behavior. Regardless, the "correct" more general solution is likely something along the lines of User Activation v2 and not a one-off solution for WebXR.

toji avatar Sep 19 '19 06:09 toji

Recapping some further TPAC side discussion:

To a large degree, activation is mainly used as a way to prevent sites from annoying the user. Activation isn't in and of itself a security measure, many such things should also have permissions prompts associated with them.

What activation is trying to do is determine "is the user actually paying attention to this page?". Page focus isn't nearly enough for this, you could be tabbing through (perhaps to close a popup or a misclicked link), and hence we look for forms of interaction which satisfy "the user cares about this page".

When it comes to immersive UI (and also fullscreen!), there really is no alternative to "the user is paying attention", aside from "the user isn't using the device/browser at all" (at which point it doesn't matter if annoying things happen, imo). It feels like once you've entered an immersive session, we shouldn't need activation at all, entering the immersive session is already granting consent over your entire view space.

There's a bit of a wrinkle: this stops working consistently if we start supporting immersive navigation for non-declarative XR, since navigation may happen without consent. Filed https://github.com/immersive-web/navigation/issues/7.

Manishearth avatar Sep 27 '19 22:09 Manishearth