Add Scene Description API for automation and a11y
Right now rendered scenes are pretty opaque: they are hard for machines to parse to extract information about what is being shown and where it is in 3D space.
I would like to propose a solution where the user creates an object graph and attaches it to an entry point on the session. Each object is assigned a colour, and those colours are rendered into a stencil buffer so that the device knows what is in the scene and where.
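To make the shape of the idea concrete, here is a minimal sketch of what authoring such a graph could look like. Every name in it (`XRSemanticObject`, `semanticGraph`, the `color` field) is invented for illustration and is not proposed API surface.

```ts
// Illustrative only: none of these names exist in any spec.
interface XRSemanticObject {
  label: string;                   // human-readable description (a11y)
  role?: string;                   // e.g. "button", "door", "exit"
  color: [number, number, number]; // colour the app renders for this object
  children?: XRSemanticObject[];
}

// The author builds the graph...
const graph: XRSemanticObject = {
  label: "Kitchen scene",
  color: [0, 0, 0],
  children: [
    { label: "Fridge door", role: "button", color: [255, 0, 0] },
    { label: "Light switch", role: "button", color: [0, 255, 0] },
  ],
};

// ...and attaches it to some entry point on the session (hypothetical).
declare const xrSession: any; // a live XRSession, typed loosely here
xrSession.semanticGraph = graph;
```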
- Does this sound useful?
- Does it sound interesting?
- The spec side is pretty light but we should describe what is expected in the graph
- What kind of information would you want exposed, e.g. visibility, bounding box?
- What kind of description is useful? We should make this extensible
- Should the user pick the colours or should they be generated by a hash? (One option is sketched after this list.)
- What carrots should we provide to get developers to actually use it?
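On the colour question, one option is deriving a stable colour from some object identifier with a hash, so authors never manage a palette by hand. A rough, purely illustrative sketch:

```ts
// Illustrative only: derive a stable 24-bit colour from an object identifier
// (a label, a UUID, ...) so colours do not have to be hand-picked.
function colorFromId(id: string): [number, number, number] {
  let hash = 0;
  for (let i = 0; i < id.length; i++) {
    hash = (hash * 31 + id.charCodeAt(i)) >>> 0; // simple 32-bit rolling hash
  }
  return [hash & 0xff, (hash >>> 8) & 0xff, (hash >>> 16) & 0xff];
}

console.log(colorFromId("Fridge door")); // stable RGB triple in 0..255
```

A hash keeps authoring trivial but can collide in a 24-bit colour space; user-picked colours avoid collisions but push the bookkeeping onto the developer.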
/facetoface
Mentioned in an editors' meeting: there's a possibility that this information could also be used as a generic input assist, where select events could surface which semantic object the target ray intersected. This could make some types of input easier for developers.
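From the developer's side, that assist might look something like the sketch below; the `semanticTarget` field is an invented name, not a proposed one.

```ts
declare const xrSession: any; // a live XRSession, typed loosely here

// Hypothetical: the UA resolves the target ray against the ID/stencil buffer
// and attaches the matching semantic object to the select event.
xrSession.addEventListener("select", (event: any) => {
  const hit = event.semanticTarget; // invented field for illustration
  if (hit) {
    console.log(`Selected: ${hit.label} (${hit.role ?? "no role"})`);
  }
});
```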
/facetoface with an update
See also: https://github.com/immersive-web/proposals/issues/86
Todo: Compare using a JavaScript object vs. using an HTML tree inside the canvas element.
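For that comparison, the two options might look roughly as follows; option B reuses the canvas element's fallback content, and the `data-scene-color` attribute is invented for illustration.

```ts
// Option A: a plain JavaScript object attached to the session
// (as in the earlier sketch).

// Option B (illustrative): an HTML tree inside the canvas element, so
// existing DOM and accessibility tooling can see the scene description.
const canvas = document.querySelector("canvas")!;
const fridgeDoor = document.createElement("div");
fridgeDoor.setAttribute("role", "button");
fridgeDoor.setAttribute("aria-label", "Fridge door");
fridgeDoor.dataset.sceneColor = "#ff0000"; // invented attribute: the ID-buffer colour
canvas.appendChild(fridgeDoor);
```

The HTML route gets accessibility tooling and event dispatch essentially for free; the plain-object route avoids DOM overhead and keeps the graph purely descriptive.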
Stereo buffers probably needed.
If the buffers are the same size you can in theory do it all in one pass; however, in XR we multisample, and we would not want any multisampling on this buffer, so it would be tricky.
Simplified geometry and fragment shaders should make it cheaper.
Analogous to velocity buffers
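To illustrate why the pass should be cheap, a minimal sketch of the flat-colour shaders such an ID pass could use (WebGL2, illustrative only, error handling omitted):

```ts
// Illustrative: the ID pass only needs position and a flat per-object colour,
// so the shaders are much simpler than the main scene shaders.
const idVertexSrc = `#version 300 es
  in vec4 position;
  uniform mat4 viewProjection;
  void main() { gl_Position = viewProjection * position; }`;

const idFragmentSrc = `#version 300 es
  precision mediump float;
  uniform vec3 objectColor;   // the colour assigned to this semantic object
  out vec4 outColor;
  void main() { outColor = vec4(objectColor, 1.0); }`;

function compileIdProgram(gl: WebGL2RenderingContext): WebGLProgram {
  const make = (type: number, src: string) => {
    const shader = gl.createShader(type)!;
    gl.shaderSource(shader, src);
    gl.compileShader(shader);
    return shader;
  };
  const program = gl.createProgram()!;
  gl.attachShader(program, make(gl.VERTEX_SHADER, idVertexSrc));
  gl.attachShader(program, make(gl.FRAGMENT_SHADER, idFragmentSrc));
  gl.linkProgram(program);
  return program;
}
```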
If the semantic objects were HTML elements, firing click events on them would give developers hit testing for free.
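Combined with the ID buffer, that could be as simple as reading back the pixel under the target ray's projected position and dispatching a click on the matching element; a hypothetical sketch (reusing the invented `data-scene-color` attribute from above):

```ts
// Illustrative: read the ID-buffer pixel at (x, y) and click the matching
// element. Note WebGL's bottom-left pixel origin.
function clickSceneObjectAt(gl: WebGL2RenderingContext, x: number, y: number): void {
  const pixel = new Uint8Array(4);
  gl.readPixels(x, y, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, pixel);
  const hex = "#" + [pixel[0], pixel[1], pixel[2]]
    .map((c) => c.toString(16).padStart(2, "0"))
    .join("");
  document
    .querySelector<HTMLElement>(`[data-scene-color="${hex}"]`)
    ?.dispatchEvent(new MouseEvent("click", { bubbles: true }));
}
```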