wgpu
wgpu copied to clipboard
[metal] Improve layer initialization and resizing
Description
Overriding the layer on NSView, as is currently done, makes the view "layer-hosting", see wantsLayer.
This prevents Winit from emitting the RedrawRequested event at the desired frame, because overwriting the layer implicitly disables the drawRect:/updateLayer mechanisms that we need to use. Instead, Winit is currently forced to emit the event after all other events have been processed, which is hacky, and does not integrate well with redraws.
In this PR, I've changed it so that wgpu, when it is given a view without a CAMetalLayer, instead creates a new sublayer that it renders into (as is already done on iOS).
Note that this might theoretically have a slight performance impact, since the system now has to do more state-tracking, but I suspect it's miniscule (wasn't able to measure it with bunnymark), and I'd argue that it's a small price to pay for correctness.
I guess that another solution would be to properly implement the CALayerDelegate, and call drawRect:/updateLayer in there ourselves - though that's a can of worms I'm not really prepared to open.
Connections
Fixes https://github.com/gfx-rs/wgpu/issues/1168, mostly by using kCAGravityResize, but the situation will also be properly fixed in the future, since Winit can clean up its current redrawing hacks.
Related similar issues: https://github.com/gfx-rs/wgpu/issues/249, https://github.com/gfx-rs/wgpu-rs/issues/536, https://github.com/rust-windowing/glutin/issues/1340, https://github.com/rust-windowing/winit/pull/1901 and https://github.com/rust-windowing/winit/issues/2640.
Also makes the situation better for https://github.com/gfx-rs/wgpu/issues/3756, by enabling present_with_transaction, resizing becomes perfectly smooth.
Finally, it removes our dependence on having access to NSView, paving the way for a future where raw-window-handle may only provide CALayer.
Testing
See this repo which contains roughly the "minimal" setup needed to get wgpu working on macOS and iOS without Winit. The crate renders a triangle whose top corner is at a fixed location from the right edge of the window, which is useful for testing that redrawing works as it should. It also has a few different features that you can try out to see how things look using some of the different methods for rendering that macOS/iOS provides. Use the patch key in Cargo.toml to test it with and without this PR.
I've also tested this with Winit on macOS and Mac Catalyst, by resizing the hello_triangle window, and moving it between an external monitors.
Current behaviour:
https://github.com/user-attachments/assets/2cf9b55c-ea1f-4b9e-9e90-34c70b7176fc
With this PR:
https://github.com/user-attachments/assets/2588ce7e-375b-4fb9-b8f9-3caccbe8138a
Checklist
- [x] Run
cargo fmt. - [x] Run
cargo clippy. If applicable, add:- [x]
--target wasm32-unknown-unknown - [x]
--target wasm32-unknown-emscripten
- [x]
- [x] Run
cargo xtask testto run tests. - [x] Add change to
CHANGELOG.md. See simple instructions inside file.
TODO (though can still be reviewed in the meantime)
- [ ] Test drawing in non-fullwindow views (and maybe scaled / transformed / rotated / weird views?).
- [ ] Test passing a MTKView to
wgpu. - [x] Test with SDL and other such "window providers" in the Rust ecosystem? I'll need some help with this!
sdl2's example seems to work fine (they're using a custom view with aCAMetalLayeras the root layer, so shouldn't be affected by most of this).
Another solution than this would be to make wgpu handle when to render internally (e.g. provide a "render" callback). Not sure how that'd work on other platforms though, which is why I've gone this route instead.
I'm a bit short on time right now, but I'll try to have a deeper look later this week! Looks really well researched and documented, looking forward to it :)
So, I've run into a macOS bug that has been driving me insane, because it took me so long to figure out that it's actually that, an OS bug, and not because I was doing something wrong. It happens when setting an autoresizingMask to something that auto-sizes (like we do with kCALayerWidthSizable | kCALayerHeightSizable), using a contentsGravity that auto-resizes (like we do with kCAGravityResize), and using auto layout on the owning view. I believe that it is an OS bug because I can reproduce it in a virtual machine on macOS 14.6 and on macOS 15 beta 3, but not on macOS 13.5.
I am fairly sure that what's happening is that AppKit's auto layout algorithm, when calculating the size on the containing NSView, sets the same size on the inner CALayer, and this size happens to contain a fractional value on high DPI screens (on macOS 13 and below, the auto layout algorithm did not create fractional view sizes, so it is not a problem there). Layers do not like fractional values in their width/height (documented here, though unknown why this is the case), and so the auto-resize mechanism ends up rounding these values to not be fractional. This, in turn, causes there to be a difference between the layer's bounds, and the superlayer's bounds, and kCAGravityResize interprets it as-if we we want it to maintain this difference, and then things blow up.
This is a problem when using Wgpu inside more complicated layouts, since the bounds of the layer will slowly drift out of sync with reality the more you resize, as can be seen in this repo by using the "two-triangles" Cargo feature, or in the video below:
https://github.com/user-attachments/assets/e13a67db-ab21-41ab-b7cd-367acb12edb8
I would argue this is not a blocker for this PR, as it still improves things so far, but it's something that I'll try to work on resolving in any case.
Hm, it looks like I was less familiar with the AppKit view and layer management than I thought.
By not making it layer-hosting, does that mean that the layer should be only updated in the drawRect:/updateLayer?
I'm just slightly concerned that there is no undue restrictions when doing it like that. I'm thinking of maybe a case of using another thread to do nextDrawable in a custom loop, to do some fancy framepacing tricks (or maybe slow rendering without interfering other UI). Its just a bit unclear from the docs that what is required or allowed for layer-backed vs layer-hosted in the particular case of CAMetalLayer, though I bet it skips most of AppKit requirements.
I'm fairly sure I'm just paranoid but wanted to air out my concerns.
By not making it layer-hosting, does that mean that the layer should be only updated in the
drawRect:/updateLayer?
No no, layers can be updated safely anywhere, from any thread, regardless of whether the view they're contained in is layer-hosting or just layer-backed. Only issue is that you will get tearing and such unless you use Wgpu's present_with_transaction, or restrict your drawing inside drawRect:/updateLayer.
The difference between layer-hosting and layer-backed is only in terms of how the NSView talks to its CALayer - and with layer-hosting views, NSView functionality that Winit uses (i.e. drawRect:/updateLayer) is disabled, which is the primary motivation for this PR's changes.
Does that make sense?
I've toiled further away on this, and I I've come to the conclusion that the only proper way to fix the issue I talked about in https://github.com/gfx-rs/wgpu/pull/6107#issuecomment-2319599762 is to use observers, that is, make the new layer that we create observe the contentsScale and bounds of the super layer, and update it's properties when those change.
You can see implementations of this idea https://github.com/rust-windowing/raw-window-metal/pull/19 and https://github.com/rust-windowing/softbuffer/pull/234, it's fairly tricky stuff since the Key-Value-Observing APIs are just very terrible outside of Objective-C.
I then tried to actually implement this in Wgpu, but since it's still using the objc crate, I had to resort to all sorts of hacks to get it to work. In the end, I threw the code away; it's doable, but it's not pretty, and I would rather have Wgpu depend on raw-window-metal so that we can maintain this logic in a central place. That's a discussion for a separate PR though, I think this PR is good to go now.
sounds very reasonable to me. Thanks again for documenting everything so well both in comments as well as the thought process here on the PR!
@madsmtm Thank you so much for the comments in wgpu_hal::metal::Surface::get_metal_layer! They're very helpful.
Not sure if anyone has noticed, yet for me it's quite obvious even on the video demonstration provided in the PR that after the fix the resizing is more jittery and laggy. It does fix the white blanks at the edges, yet introduces some delay.
I've spent about an hour just to find what was causing this undesirable behavior after updating dependencies in my personal project. I think that it's important to emphasize that the laggy resizing introduced by this PR is not what some users might want.
Not sure if anyone has noticed, yet for me it's quite obvious even on the video demonstration provided in the PR that after the fix the resizing is more jittery and laggy. It does fix the white blanks at the edges, yet introduces some delay.
I've spent about an hour just to find what was causing this undesirable behavior after updating dependencies in my personal project. I think that it's important to emphasize that the laggy resizing introduced by this PR is not what some users might want.
Hmm, that's probably because we're using kCAGravityResize, which might be forcing the system to do more buffering or something? I'm unsure!
Could I get you to try https://github.com/gfx-rs/wgpu/pull/6210? EDIT: Nvmd, we raced.
And separately, also try the following diff:
diff --git a/wgpu-hal/src/metal/surface.rs b/wgpu-hal/src/metal/surface.rs
index b35c73c91..5f0fda802 100644
--- a/wgpu-hal/src/metal/surface.rs
+++ b/wgpu-hal/src/metal/surface.rs
@@ -23,7 +23,7 @@ use parking_lot::{Mutex, RwLock};
#[link(name = "QuartzCore", kind = "framework")]
extern "C" {
#[allow(non_upper_case_globals)]
- static kCAGravityResize: *mut Object;
+ static kCAGravityTopLeft: *mut Object;
}
extern "C" fn layer_should_inherit_contents_scale_from_window(
@@ -243,7 +243,7 @@ impl super::Surface {
// Unfortunately, it also makes it harder to see changes to
// `width` and `height` in `configure`. When debugging resize
// issues, swap this for `kCAGravityTopLeft` instead.
- let _: () = msg_send![new_layer, setContentsGravity: unsafe { kCAGravityResize }];
+ let _: () = msg_send![new_layer, setContentsGravity: unsafe { kCAGravityTopLeft }];
// Set initial scale factor of the layer. This is kept in sync by
// `configure` (on UIKit), and the delegate below (on AppKit).
After applying the patch the result seems worse. Noticeable lag remains and now the window sometimes flashes at the top.
https://github.com/user-attachments/assets/1b6b9971-a7ea-4e0b-a08a-7e0771df5b94
After applying the patch the result seems worse
Huh!
Could you try with the following (should effectively reverse this PR), possibly in combination with the diff before:
diff --git a/wgpu-hal/src/metal/surface.rs b/wgpu-hal/src/metal/surface.rs
index b35c73c91..854b6deff 100644
--- a/wgpu-hal/src/metal/surface.rs
+++ b/wgpu-hal/src/metal/surface.rs
@@ -204,10 +204,10 @@ impl super::Surface {
// Create a new sublayer.
let new_layer: *mut Object = msg_send![class!(CAMetalLayer), new];
- let () = msg_send![root_layer, addSublayer: new_layer];
+ let () = msg_send![view.as_ptr(), setLayer: new_layer];
#[cfg(target_os = "macos")]
- {
+ if false {
// Automatically resize the sublayer's frame to match the
// superlayer's bounds.
//
for me it's quite obvious even on the video demonstration
Still not entirely sure I see the issue btw, could you explain further what you mean? As in, how would I verify whether the issue is present or not? The distance between the window corner and the cursor?
Wow, this seems like the perfect solution. Applying this patch atop of master gave the best results: no delay/lag when resizing and also no blanks/artifacts. However, when applied with the previous patch you sent the window has some artifacts at the top when resizing.
To determine whether the problem is still present, pay close attention to the triangle edges, or the window border. In the case of videos in the PR, it looks like the window itself is being resized at a "lower framerate". As for the triangle, its edges are updated to match the new size at a slower rate than in the other video.
To reiterate, applying the most recent patch atop of master fixes everything.
Okay, so I think I see what you mean (and I think the effect is more visible when you actually do it, than in the videos).
But I couldn't see the effect when using https://github.com/gfx-rs/wgpu/pull/6210?
Tested with cargo run --bin wgpu-examples -- hello_triangle (no optimizations, though I found the same when enabling optimizations).
Current trunk (3016c56db674fb39e79327b52e85105c1020e766):
https://github.com/user-attachments/assets/a615dca4-5422-49f2-9d90-479e506a8c65
With this patch applied:
https://github.com/user-attachments/assets/44f02b92-4685-4195-8bf3-3444e0c7c511
After https://github.com/gfx-rs/wgpu/pull/6210:
https://github.com/user-attachments/assets/c214394a-9206-4ab8-aafa-ac262987cb19
After testing it again with cargo run --bin wgpu-examples -- hello_triangle I concluded that patched and https://github.com/gfx-rs/wgpu/pull/6210 looks almost exactly the same. The main difference arises when the app constantly draws to the screen.
I've also tested it in my application which schedules redraws according to DisplayLink and https://github.com/gfx-rs/wgpu/pull/6210 appears to have undesirable lag when resizing, yet after the patch https://github.com/gfx-rs/wgpu/pull/6107#issuecomment-2499760574 applied to master it works flawlessly.
I'm not an expert in terms of how the rendering should work on macOS. However, through a lot of testing it seems that the Display Link approach minimizes the latency for applications that must redraw in sync with the display refresh rate, such as games. There may be a better approach available that I'm not aware of.
You are right that Display Link is the "best" approach to rendering on macOS (with probably a bunch of caveats there), but it's not really a viable approach for Wgpu + Winit because it's not cross-platform.
I'd be interested in seeing your code, I really would've thought that #6210 wouldn't have such issues - unless the issue just is that it's more expensive to register that observer than I thought, and the cost ends up causing delays in your rendering code? I'd be interested in seeing a flamegraph of the two!
@madsmtm, I've pushed the code to https://github.com/polina4096/wgpu-pr6107-demo. There are two binaries, one is using the patched (https://github.com/gfx-rs/wgpu/pull/6107#issuecomment-2499760574) trunk, other is using #6210. For both, the rendering is scheduled using Display Link.