
bugfix: Send remote runtime orthogonal view poses, reproject received views in the client if XrCompositionLayerProjectionView is implemented poorly in headset runtime

Open · shinyquagsire23 opened this pull request 8 months ago · 3 comments

Summary: Fixes rendering on PFD MR OSv3.2 and other canted headsets, without FoV reductions. tl;dr: ensure the orthogonal views rendered by SteamVR are converted back to the headset's canted view transforms via reprojection.

Detailed Information: SteamVR requires that the two render views be orthogonal (uncanted) and spaced by the user's IPD; some OpenVR games also expect this and don't render correctly otherwise.

OpenXR does not have this requirement, and some headsets take advantage of that (in the PFD MR's case, I believe this is done so that the displays are aligned with the passthrough cameras, but that's my best guess). To correctly compensate for this difference, this PR does the following:

  • When the client sends the view transforms, they are altered to create an orthogonal, no-rotation view config which inscribes the real view transform and maintains its aspect ratio (see the sketch after this list). This new view transform is sent to SteamVR, but is transparent to client_core API users.
  • When reading the frame data, the client receives the pose and altered view transforms back, and is expected to present the frame, rendered with the altered SteamVR view parameters, using the real OpenXR view parameters.
    • On correctly implemented runtimes, such as the Quest's, the altered parameters can just be passed to XrCompositionLayerProjectionView.
    • On incorrectly/poorly implemented runtimes, we now have to handle reprojection ourselves, which means storing the previous frame's buffer pointer and reusing it until a new frame arrives.
  • If view transform receiving is moved to the streamer, it should still function correctly.
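
For illustration, here is a rough sketch of the view-transform alteration in the first bullet. This is not the actual implementation; the function name, the use of glam, and the [left, right, up, down] angle layout are assumptions:

```rust
use glam::{Quat, Vec3};

// Hypothetical sketch (not the actual ALVR code): given a canted view's
// head-relative orientation and OpenXR-style FoV angles, compute the FoV of
// an identity-rotation view whose frustum encloses the canted one while
// keeping the original aspect ratio.
fn enclosing_uncanted_fov(orientation: Quat, fov: [f32; 4]) -> [f32; 4] {
    let [left, right, up, down] = fov; // radians; left/down are negative

    // Corner rays of the canted frustum at unit depth (-Z is forward).
    let corners = [
        Vec3::new(left.tan(), up.tan(), -1.0),
        Vec3::new(right.tan(), up.tan(), -1.0),
        Vec3::new(left.tan(), down.tan(), -1.0),
        Vec3::new(right.tan(), down.tan(), -1.0),
    ];

    // Rotate each corner into head space and re-project onto the z = -1
    // plane, tracking the bounding tangents (valid for moderate cant angles).
    let (mut l, mut r, mut u, mut d) = (f32::MAX, f32::MIN, f32::MIN, f32::MAX);
    for corner in corners {
        let v = orientation * corner;
        let (x, y) = (v.x / -v.z, v.y / -v.z);
        l = l.min(x);
        r = r.max(x);
        u = u.max(y);
        d = d.min(y);
    }

    // Grow width or height (never shrink) so the enclosing tangent box keeps
    // the original aspect ratio and the streamed frame isn't stretched.
    let aspect = (right.tan() - left.tan()) / (up.tan() - down.tan());
    if (r - l) / (u - d) < aspect {
        let grow = ((u - d) * aspect - (r - l)) * 0.5;
        l -= grow;
        r += grow;
    } else {
        let grow = ((r - l) / aspect - (u - d)) * 0.5;
        d -= grow;
        u += grow;
    }

    [l.atan(), r.atan(), u.atan(), d.atan()]
}
```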

Tested on: Quest Pro, Play for Dream MR

Other notes for the future:

  • I added some for-the-future scaffolding for my eye-gaze-assisted reprojection and for porting the visionOS FoV view comfort slider to OpenXR/streamer-side sliders. The slider is not implemented yet; it might take some extra legwork, since send_view_params has had odd side effects in the past wrt restarting the stream, and I'd like to make sure it's well tested.
  • I might double-check whether we also need to reconcile the view transform positions themselves, since they may contain eye-in-the-gasket tracking information for pupil swim; that information could instead be added to the headset pose so that view transforms remain constant. It seems probable that runtimes just do this themselves, because Tobii is obnoxious about licensing (Vision Pro also does this, at a 1-2 Hz interpolated interval).

shinyquagsire23 · Apr 30 '25

@zmerp I noticed something odd when removing the last_buffer stuff: we report the 50+ ms old frame timestamp to OpenXR as the display time. Is there a particular headset that really wants reported poses and timestamps to match up, or should we just always report the real display timestamp? To me it makes more sense that the xrCompositorLayer would sorta "absorb" the fact that we report an older pose (on headsets with working compositor layers), and going forward we'd want the display time to be a real display time.

It might even result in incorrect reprojections if the runtime tries to extrapolate a last-ms pose from that old timestamp, which we already applied our own velocities to.

shinyquagsire23 · May 06 '25

Technically the timestamp is not that useful to the runtime; I expect most runtimes to ignore it. I guess it would be useful in case the runtime wants to apply some optical-flow-based correction to the content of the image, like animations (not the user's head position!), in which case using the old original timestamp is the correct thing to do. But of course vendors may mess up and do strange stuff when, e.g., the timestamp doesn't align to vsyncs. So the only thing to do is test.

zmerp · May 06 '25

Yeah, for now I have it set up so that re-rendered FoVs are considered new frames and get a vsync timestamp reported; otherwise, if the input and output views are the same and xrCompositorLayer does the work, we report the older timestamp, same as before.
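
Roughly like this (a sketch only; the function and parameter names are illustrative, not the real identifiers):

```rust
use std::time::Duration;

// Hypothetical sketch of the display-time choice described above.
fn choose_display_time(
    views_were_rerendered: bool,         // true if we reprojected the frame ourselves
    predicted_vsync_time: Duration,      // next real display vsync
    original_frame_timestamp: Duration,  // timestamp the frame was rendered for
) -> Duration {
    if views_were_rerendered {
        // A reprojected frame is effectively new content: report a real
        // upcoming vsync so the runtime treats it as fresh.
        predicted_vsync_time
    } else {
        // The runtime's composition layer does the work: keep the older
        // timestamp so pose and frame content stay consistent, as before.
        original_frame_timestamp
    }
}
```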

shinyquagsire23 · May 06 '25

@shinyquagsire23 What's the state of this PR? Does it avoid protocol breaks? If so, we can merge (after a rebase to fix the merge conflicts). In the next release (v20.14) we will have new-version notification popups to alert users, and after that we can more freely make breaking changes.

zmerp · Jun 24 '25

No protocol breaks; it was ready to merge, but I guess there are merge conflicts now.

shinyquagsire23 · Jun 24 '25

May I ask, what was the point of increasing the quad distance to 1000 and then doing extra scaling by it?

```rust
let quad_depth = 1000.0;
```

To me, this entire calculation appears trivial. At least the way it was (before custom reprojection), the entire FoV calculation and the projection by it could be removed. With a minor tweak in the vertex shader stream.wgsl (flip Y and scale by 2, by setting W=0.5), the entire transform matrix could be eliminated. After all, we are (were) projecting a pre-rendered texture onto a trivial quad covering the view (X,Y = ±1, Z = 0; I tried, and it worked).

Even with reprojection, we'll need view_mat = output_mat4.inverse() * input_mat4; (which would be identity for 'good' runtimes, and I would make an effort to avoid this trivial but expensive calculation in such cases), but projection (proj_mat)? Does the FoV, by the way, ever change between frames?
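
Roughly what I have in mind (a sketch, with names assumed from the snippets in this thread rather than taken from the actual code):

```rust
use glam::Mat4;

// Sketch: skip the inverse + multiply entirely when the input and output
// views match, which is the common case on 'good' runtimes.
fn compute_view_mat(input_mat4: Mat4, output_mat4: Mat4) -> Mat4 {
    if input_mat4.abs_diff_eq(output_mat4, 1e-6) {
        Mat4::IDENTITY
    } else {
        output_mat4.inverse() * input_mat4
    }
}
```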

C6H7NO2 · Jul 08 '25

@C6H7NO2 I had an experimental branch based on this one which altered the depth to match the convergence distance of the user's eyes, so that in theory minor frame drops would scale roughly correctly (i.e., what the user is looking at has its scaling stereoscopically correct even if the periphery isn't; yes, depth buffers would be better, but SteamVR doesn't provide any, and yes, I'm also interested in motion vector interpolation).

In any case, the depth having a 'physicality' made that slightly easier. It was also partly because I based the math on the visionOS client, which had to use worldspace transformations by necessity for a few reasons. But I also find the worldspace calculations a bit easier to intuit and debug.

shinyquagsire23 · Jul 08 '25

@shinyquagsire23 I guessed it would be something like that... But for 'good' runtimes which perform reprojection themselves (I only have Oculus), this shouldn't make a difference, and having come from programming microcontrollers, I freak out at doing extra calculations when they're not absolutely necessary... More pertinently, I'm experimenting with adding more stuff in this area, and these calculations make it more difficult.

Have you ever observed FoV changing between calls? If not, things could be simplified significantly.

Also, is there any hidden reason you are doing

```rust
let output_mat4 = Mat4::from_translation(view_params.output_view_params.pose.position)
    * Mat4::from_quat(view_params.output_view_params.pose.orientation);
```

instead of just using from_rotation_translation()?
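
For reference, a quick sanity check with made-up pose values showing the two glam forms agree (rotation applied first, then translation):

```rust
use glam::{Mat4, Quat, Vec3};

fn main() {
    // Made-up pose values, for illustration only.
    let position = Vec3::new(0.032, 0.0, 0.0);
    let orientation = Quat::from_rotation_y(0.1);

    let a = Mat4::from_translation(position) * Mat4::from_quat(orientation);
    let b = Mat4::from_rotation_translation(orientation, position);

    assert!(a.abs_diff_eq(b, 1e-6));
}
```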

C6H7NO2 · Jul 08 '25

@C6H7NO2 View tangents/FoV definitely can change during a session, for instance when the IPD position is changed on a Quest 2. The view transforms can technically change as well: with eye tracking you'd set the view transforms to the eye's position in the gasket (ideally at least; in practice that's maybe not the best idea with streaming). On some future headsets I'd expect the client side to update view transforms, and possibly even tangents, live, so you'd have to handle that mismatch, since the streamer side will be mostly static.

But otherwise a lot of it is just "idk, it worked and I was tired of fiddling with it" lol. This path is also just a fallback for bad headset runtimes anyhow; ideally OpenXR handles all this with more optimized routines.

shinyquagsire23 · Jul 12 '25