breezy-desktop
Avoid minor movements of the screen in XR glasses during small head movements
When wearing the glasses (Viture One), the projected monitor constantly moves by minor amounts, even if I keep my head still. This is due to minor muscle movements and even the pulsating of the skin from my heartbeat. This is distracting and straining on my eyes, because the image gets slightly blurry when the glasses move it on the screen to compensate for those minor involuntary movements. It would be good if the projected image remained fixed unless I move my head by more than a predefined threshold.
I would like to try implementing this myself, but I have no experience with this and do not know where to start. Is this something to be implemented in the driver or in the GNOME Extension? Could someone point me to a place to start?
Filtering out small movements can produce a pretty bad drift, since all movements are important when it comes to understanding your current pose. What I typically recommend is to look into the Movement Look-ahead slider in the advanced settings.
This feature attempts to predict where your head will be based on your current velocity: it takes at least 10-20ms, sometimes more, to render a frame based on your last pose, so my driver has to predict where you'll be that far ahead to make the screen appear fixed. The downside of this prediction is that it amplifies movements, so when the prediction is wrong, as in the case of shaking, it becomes pretty obvious.
Try putting the slider at the second notch (the lowest value above the default); you should see the least amount of shaking, but also a less responsive screen. Move the slider up until you find a balance that works for you. Let me know how it goes.
Hello, thank you for your suggestion. Alas, the setting does not help at either value. For example, as I type these words, each keystroke makes the image momentarily blurry because the image moves on the glasses' screen. My problem is not keeping the virtual monitor fixed at a position in space. My problem arises when the physical position of the image on the glasses' screen changes, which causes blur and slight stuttering. For example, I have an easier time writing this text in 0dof mode, because in any other mode (either with breezy-desktop or with the native 2dof mode, where head tilt is ignored) each movement of the image on the glasses' screen causes blur. So with each keystroke my vision becomes temporarily blurry.
I don't mind if it moves in virtual space as long as it does not move on the projected area. Drift could be accounted for in a similar way to follow mode: for small movements there would be no attempt to correct the screen location, but above a certain threshold it could then move back to its original location.
I do not know if this is enough for me to be able to use this setup productively, that's why I wanted to modify the code to see if it improves things.
I'm not sure if this is related, but I'm seeing that my screen drifts from time to time. I'm trying to use the Viture XR Pros while coding, and I find myself looking at a certain portion of the screen for a while and then noticing that the whole screen has shifted. I find myself re-centering quite often.
I'm using GNOME 46 on Ubuntu 24, and I have follow mode off.
PS: Thank you for building this. I always wanted to try AR glasses as a productivity tool, and Breezy is making it possible. So far I'm enjoying using it, and with a bit more improvement it will be a game changer!
I've noticed this issue with my pulse pushing the glasses around. I think it'd be possible to fix with the right filter, and without drift, by separating the pose estimation into two parts:
- The pose of the glasses in 3D space (rotation matrix R)
- The small rotation between the glasses and the skull (rotation matrix S)
The rendering currently uses R, but to remove periodic motion due to the wearer's pulse in the skin on the nose, it should use R*inv(S) (or inv(S)*R, depending on convention).
We can estimate R using whatever low-drift filter, just as you currently do.
The trick, then, is to somehow estimate S. I expect some kind of band-pass filter in the 1-2Hz range could work for this, with an update step like S, state = filter(R, state), keeping some amount of state in the filter?
It seems likely this scheme would have some tradeoffs, because estimating S vs R is underdetermined. But I think it's worth trying. If you can point me at the right piece of code, I can try to make some time to look at this :)
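To make this concrete, here's a rough sketch of the filter structure I have in mind (not based on the project's code; the 60Hz sample rate, the 1-2Hz band, and the use of Euler angles are all assumptions):

```python
# Rough sketch of the proposed split: estimate the glasses-vs-skull wobble S
# by band-passing the pose stream R in the pulse band, then render with
# R * inv(S). Sample rate, band, and Euler angles are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfilt, sosfilt_zi
from scipy.spatial.transform import Rotation

FS = 60.0  # assumed IMU sample rate in Hz
sos = butter(2, [1.0, 2.0], btype="bandpass", fs=FS, output="sos")

class SkullWobbleEstimator:
    """S, state = filter(R, state): band-pass each Euler channel of R."""
    def __init__(self):
        self.zi = None  # per-channel filter state, carried between samples

    def update(self, R: Rotation) -> Rotation:
        angles = R.as_euler("ZYX")  # yaw, pitch, roll in radians
        if self.zi is None:
            # Steady-state initialization so the estimate starts at "no wobble".
            self.zi = [sosfilt_zi(sos) * a for a in angles]
        wobble = np.empty(3)
        for i in range(3):
            y, self.zi[i] = sosfilt(sos, np.array([angles[i]]), zi=self.zi[i])
            wobble[i] = y[0]
        return Rotation.from_euler("ZYX", wobble)  # estimated S

# Per frame: S = estimator.update(R); R_render = R * S.inv()
```

Band-passing Euler angles is only reasonable while the wobble stays small and yaw stays away from the wrap-around point; a rotation-vector formulation would be more robust, but this should be enough to test the idea.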
Using the Viture XR Pro as well. I can actually 'see' my heartbeat based on the screen shaking/jittering slightly with each beat. To be clear, I'm not exaggerating: it corresponds exactly to my actual heart rate. The legs of the glasses are relatively tight on my ears.
Adjusting the look-ahead slider doesn't really change much between 'default' and '40ms.' Adjusting the glasses to be slightly higher (above my ears) changes the movement to be mostly vertical rather than horizontal, which helps keep text readable. I agree that some kind of filtering would be helpful for reducing strain.
I just picked up a pair of Viture Pros after using the xreal airs, and while the experience is overall WAY better (virtually no drift!), I am also running into a similar effect to the one @phooji mentioned.
I was just thinking, if filtering introduces drift, what about smoothing?
I wonder about an approach where the smallest-magnitude movements are filtered out on a frame-by-frame level, but not discarded.
Instead, add the difference between the filtered and unfiltered value to a vector that is maintained between frames. This is the amount of error the filtering has introduced.
At the same time, when calculating the pose during frame rendering, you can transfer some portion of this error vector into the calculated head pose, relative to the magnitude of the change in head pose (the reasoning being that the faster the movement, the easier it is to smuggle in some of this error correction without it being obvious to the wearer).
What I am thinking, and what would be interesting to see tested, is that these sorts of micro-movements from a heartbeat or small tremors are likely to be self-canceling over some period of time: the heartbeat rises, the heartbeat falls.
So by maintaining this error vector, and slowly correcting for drift over a time period longer than a single frame, there is an opportunity for the micro-movements to cancel each other out, resulting in less error that needs to be corrected for.
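A rough sketch of what I mean, in Python for readability (the deadband threshold and release gain are numbers I made up, and a real implementation would presumably work on quaternions rather than yaw/pitch/roll vectors):

```python
# Illustrative deadband-with-error-accumulator smoother, not project code.
import numpy as np

DEADBAND = np.radians(0.05)   # made-up threshold for a "smallest magnitude" movement
RELEASE_GAIN = 2.0            # made-up factor: how aggressively stored error is released

class DeadbandSmoother:
    def __init__(self):
        self.prev_raw = None           # last raw pose (yaw/pitch/roll) from the IMU
        self.output = None             # last pose we actually rendered
        self.error = np.zeros(3)       # suppressed movement we still owe the render

    def update(self, raw_pose: np.ndarray) -> np.ndarray:
        if self.prev_raw is None:
            self.prev_raw = raw_pose.copy()
            self.output = raw_pose.copy()
            return self.output
        frame_delta = raw_pose - self.prev_raw
        self.prev_raw = raw_pose.copy()
        magnitude = float(np.linalg.norm(frame_delta))
        if magnitude < DEADBAND:
            # Hold the screen still, but remember what we ignored; heartbeat
            # wobble should mostly cancel itself out inside this accumulator.
            self.error += frame_delta
        else:
            # Real head motion: follow it, and fold in a share of the stored
            # error proportional to how large the visible movement already is.
            correction = self.error * min(1.0, RELEASE_GAIN * magnitude)
            self.output = self.output + frame_delta + correction
            self.error -= correction
        return self.output
```

One nice property: the rendered pose plus the stored error always equals the latest raw pose, so nothing is ever permanently discarded; the screen just catches up during larger movements.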
For reference, if anyone wants to try to contribute one of the fixes being discussed here, this function is where all IMU quaternion values are handled by the driver, directly from the device integrations.
Note that this function pushes the quaternions into a buffer that allows it to track quaternion values from about 10ms and 20ms ago, so it can use the 3 snapshots to compute a velocity and do a look-ahead adjustment. Because of the timing of the VITURE IMU data, the buffer stages actually represent about 17 ms (~60Hz), so the 3 snapshots are roughly at 0ms, 17ms, and 33ms. You could change the VITURE buffer size (in viture.c) to allow for more timely snapshots for glasses that support 240Hz (the pros do), just be aware that doing so would also make the look-ahead more vulnerable to noise (possibly shakier).
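As a standalone illustration of that velocity-based look-ahead (this is simplified Python, not the driver's actual C code; the fixed sample spacing and two-snapshot velocity estimate are simplifications):

```python
# Illustration only: estimate angular velocity from a small quaternion
# snapshot buffer and extrapolate the newest pose a few milliseconds ahead.
from scipy.spatial.transform import Rotation

SAMPLE_DT_MS = 1000.0 / 60.0   # ~17 ms between buffer stages at the default 60Hz

class SnapshotBuffer:
    def __init__(self, size: int = 3):
        self.size = size
        self.snapshots = []            # newest first: ~0 ms, ~17 ms, ~33 ms old

    def push(self, q: Rotation):
        self.snapshots.insert(0, q)
        del self.snapshots[self.size:]

    def look_ahead(self, ahead_ms: float) -> Rotation:
        if len(self.snapshots) < 2:
            return self.snapshots[0]
        newest, oldest = self.snapshots[0], self.snapshots[-1]
        elapsed_ms = SAMPLE_DT_MS * (len(self.snapshots) - 1)
        # Rotation that carried us from the oldest snapshot to the newest,
        # expressed as a rotation vector so it can be scaled in time.
        step = (newest * oldest.inv()).as_rotvec()
        rate = step / elapsed_ms                       # radians per millisecond
        return Rotation.from_rotvec(rate * ahead_ms) * newest
```

Spanning the oldest and newest snapshots effectively averages the angular velocity over the whole buffer, which is part of what keeps the prediction from being too noisy.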
Interesting discussion. I notice that I filed my bug as a partial duplicate of this, as I only focused on the heartbeat. The reason I did that is that I believe that part is reasonably easy to predict and cancel. I'd be happy to help as an AI scientist; I am, however, only skilled in Python... c422f seems to know the signal processing better than I do :-)
Did I understand it right that the glasses send out relative yaw, pitch, and roll, i.e. three streamed data channels? It would be tempting to just regularly run a Fourier transform over the last 5 seconds of each stream, find the largest 0.5-3 Hz component in each (which hopefully is the pulse), and remove it.
It would be tempting to just regularly run a Fourier transform over the last 5 seconds of each stream, find the largest 0.5-3 Hz component in each (which hopefully is the pulse), and remove it
Yes, this is basically what I have in mind. You could do it with a Fourier transform, though I guess a DSP-style band-stop filter would be a lot more computationally efficient, and as far as I understand it would be more amenable to stream-based processing than redoing the whole windowed Fourier transform at each time step.
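Something along these lines is what I'd try first, as a rough sketch (the 60Hz sample rate and the stop band are guesses on my part, and this works on plain angle streams rather than the driver's quaternions):

```python
# Streaming band-stop ("notch") filter over one orientation angle channel,
# keeping filter state between samples instead of re-running an FFT window.
# Sample rate and stop band are assumptions, not values from the driver.
import numpy as np
from scipy.signal import butter, sosfilt, sosfilt_zi

FS = 60.0                     # assumed IMU sample rate in Hz
STOP_BAND = (0.8, 2.5)        # Hz range covering a resting-to-elevated pulse

sos = butter(2, STOP_BAND, btype="bandstop", fs=FS, output="sos")

class StreamingNotch:
    """Removes the pulse band from one angle channel, one sample at a time."""
    def __init__(self):
        self.zi = None

    def process(self, sample: float) -> float:
        if self.zi is None:
            # Initialize the filter memory to the steady-state response for a
            # constant input equal to the first sample, avoiding a start-up jump.
            self.zi = sosfilt_zi(sos) * sample
        y, self.zi = sosfilt(sos, np.array([sample]), zi=self.zi)
        return float(y[0])

# One filter per channel (yaw, pitch, roll):
filters = [StreamingNotch() for _ in range(3)]
# cleaned = [f.process(x) for f, x in zip(filters, (yaw, pitch, roll))]
```

Keeping the filter state (zi) between samples is what makes this stream-friendly compared to re-running a windowed Fourier transform every step.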
I was just thinking, if filtering introduces drift, what about smoothing?
I wonder about an approach where the smallest-magnitude movements are filtered out on a frame-by-frame level, but not discarded.
Instead, add the difference between the filtered and unfiltered value to a vector that is maintained between frames. This is the amount of error the filtering has introduced.
This is more or less what I'm proposing. The question is how to design the 'smoothing' filter correctly; the answer is probably that you want a band-stop filter that rejects only frequencies in the roughly 1 Hz range of a heartbeat. Any smoothing has the tradeoff that it introduces some amount of lag to the head tracking, so we'd just have to try it and see whether that's a good tradeoff. (In practice, this might be equivalent to a low-pass smoothing filter, since higher frequencies are unlikely to be hugely important for head motion in non-gaming applications, but it may not be; we just need to try it.)
I've also noticed quite a lot more drift when shaking the glasses at higher frequency due to head movements. I do wonder whether there's some coning/sculling integral correction missing from the firmware, but if so we can't do much about that, except to increase the sample rate if possible (I should try that; I do have the Viture Pros).
You could change the VITURE buffer size (in viture.c) to allow for more timely snapshots for glasses that support 240Hz (the pros do), just be aware that doing so would also make the look-ahead more vulnerable to noise (possibly shakier)
Ok, this is really useful and interesting, thanks.
I guess it won't be shakier, but only if the correct look-ahead is computed by averaging multiple angular velocity samples from the buffer (using a single sample would definitely be bad).
I'm having a bit of a look at the code. Couple of questions:
- What's the correct way to dump raw IMU values to a file for offline experimentation with filtering strategies? I can see the driver writes to shared memory inside handle_imu_update() and, following a few function calls, I think this eventually ends up in /dev/shm/breezy_desktop_imu and is picked up by the breezy-desktop JavaScript code here: https://github.com/wheaney/breezy-desktop/blob/dbd1b88e18f6d36ec30f6ac9bfab7d000aa48973/gnome/src/devicedatastream.js#L280 ?? So I guess I can just read the stream from the shared memory in /dev/shm/breezy_desktop_imu, which is great, though I'm not sure how to poll this file at the right frequency?
- Where is the look-ahead adjustment actually computed? I'm assuming it's in the driver rather than breezy-desktop, but I didn't see exactly where yet. Would this be the right place to add a more sophisticated filter?
What's the correct way to dump raw IMU values
For VITURE specifically, this is the callback function triggered by their SDK when a sample is ready. It's the closest you'll get to the raw data, but keep in mind that VITURE is actually doing the sensor fusion stuff on-device, so these "raw" values aren't actually the raw sensor values like you'll see from other devices.
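If your goal is just to capture values for offline filter experiments, one option (nothing official, just a quick hack) is to poll the shared-memory file mentioned above and log its raw bytes with timestamps, then decode the dump offline using the same offsets the GNOME extension reads in devicedatastream.js:

```python
# Naive logger: repeatedly snapshot /dev/shm/breezy_desktop_imu and append the
# raw bytes with a wall-clock timestamp. The file's binary layout is NOT
# assumed here; decode the dump offline using the offsets from devicedatastream.js.
import struct
import time

SHM_PATH = "/dev/shm/breezy_desktop_imu"
OUT_PATH = "imu_dump.bin"
POLL_HZ = 240          # poll faster than the IMU rate so fewer updates are missed

with open(OUT_PATH, "ab") as out:
    last = None
    while True:
        with open(SHM_PATH, "rb") as shm:
            data = shm.read()
        if data != last:                           # only log when the contents change
            out.write(struct.pack("<dI", time.time(), len(data)))
            out.write(data)
            last = data
        time.sleep(1.0 / POLL_HZ)
```

Polling like this isn't synchronized with the writer, so it can miss or tear an occasional sample, but for exploring filter ideas offline it should be good enough.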
Where is the look ahead adjustment actually computed?
It's done in the shader (different shaders depending on whether you're looking for Breezy Vulkan/XR Gaming or for Breezy GNOME) by taking the most recent 2 IMU snapshots, applying them both to the vector for the current pixel, computing a rate of change between them, and then applying that velocity to the latest snapshot for the number of milliseconds we want to predict ahead.
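In plain Python, that per-pixel step looks roughly like this (a paraphrase for readability, not the actual shader code; the names and millisecond units are just illustrative):

```python
# Paraphrase of the shader's per-pixel look-ahead step, not the real shader code.
import numpy as np
from scipy.spatial.transform import Rotation

def look_ahead(pixel_vec: np.ndarray,
               q_latest: Rotation, q_previous: Rotation,
               snapshot_dt_ms: float, ahead_ms: float) -> np.ndarray:
    """Rotate the pixel vector with both snapshots, take the rate of change,
    and extrapolate the latest result ahead_ms into the future."""
    p_latest = q_latest.apply(pixel_vec)
    p_previous = q_previous.apply(pixel_vec)
    velocity = (p_latest - p_previous) / snapshot_dt_ms   # change per millisecond
    return p_latest + velocity * ahead_ms
```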