toga Layout root nodes once per loop

Fixes #2938

Prior discussion on #2955; I've followed the design outline suggested by @mhsmith in https://github.com/beeware/toga/pull/2955#issuecomment-2484015978 and @freakboy3742 in https://github.com/beeware/toga/pull/2955#pullrequestreview-2447263067.

I think I have this mostly cracked, but I'm currently stumped.

At the moment I haven't dug into how to update the tests/probes yet — I'm only "testing" by launching toga-demo and seeing what happens. Most of the time it works, and I see this:

Screenshot with correct layout

Log:

Adding SplitContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set

Loop(
    #1. Laying out OptionContainer
    #2. Laying out ScrollContainer
    #3. Laying out SplitContainer
        #4. Laying out OptionContainer
        #5. Laying out ScrollContainer
)

Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding Table to dirty set
Adding Tree to dirty set
Adding Box to dirty set
Adding Table to dirty set
Adding Tree to dirty set
Adding Box to dirty set

Loop(
    #6. Laying out Box
    #7. Laying out Table
    #8. Laying out ScrollContainer
    #9. Laying out Tree
    #10. Laying out OptionContainer
)

Adding Box to dirty set
Adding Table to dirty set
Adding Tree to dirty set

Loop(
    #11. Laying out Box
    #12. Laying out Table
    #13. Laying out Tree
)

That's less than a quarter of the layout calculations done by the non-deferred behavior, which is 55.

However, sometimes (maybe one in five or six), the SplitContainer is smaller on the left side:

Screenshot with smaller left half of SplitContainer

The log is almost the same:

Adding SplitContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set

Loop(
    #1. Laying out OptionContainer
    #2. Laying out ScrollContainer
    #3. Laying out SplitContainer
        #4. Laying out OptionContainer
        #5. Laying out ScrollContainer
)

Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set

# In the working version, the above pair of additions are generated an additional three times each.

Adding Table to dirty set
Adding Tree to dirty set
Adding Box to dirty set
Adding Table to dirty set
Adding Tree to dirty set
Adding Box to dirty set

Loop(
    #6. Laying out Box
    #7. Laying out Table
    #8. Laying out ScrollContainer
    #9. Laying out Tree
    #10. Laying out OptionContainer
)

Adding Box to dirty set
Adding Table to dirty set
Adding Tree to dirty set

Loop(
    #11. Laying out Box
    #12. Laying out Table
    #13. Laying out Tree
)

All the same layouts are being performed, in the same order. (Interestingly, while popping from the set of dirty widgets is "random", it appears to be deterministically the same every time.) But the OptionContainer and ScrollContainer aren't getting marked dirty as many times. Of course that doesn't matter in and of itself — presumably it's a matter of the events causing those refresh requests. I'm having trouble tracking those down; these ObjCMethod.__call__ and send_message layers in the debugger stack trace are pretty opaque.

What I do know is that all of these refresh requests come, in the immediate sense, fromTogaSplitView.splitViewDidResizeViews_. For the two always-present pairs as well as the first "ephemeral" pair, that's all I can make sense of — the rest is Objective C, and basic event loop stuff. However, for the last two pairs, behind splitViewDidResizeViews_ in the stack is TogaSplitView.applySplit.

So it appears that when things go awry, it's because (somehow) applySplit isn't being called.

PR Checklist:

[ ] All new features have been tested
[ ] All new features have been documented
[x] I have read the CONTRIBUTING.md file
[x] I will abide by the code of conduct

Dec 29 '24 01:12 HalfWhitt

All the same layouts are being performed, in the same order. (Interestingly, while popping from the set of dirty widgets is "random", it appears to be deterministically the same every time.)

That's not entirely surprising; especially for small sets, it's highly likely for the order to end up the same when driven by the same insertion order. That's not something you should rely on, but it's the sort of thing that is likely reliable enough that you think you can rely on it... right up until you can't.

So it appears that when things go awry, it's because (somehow) applySplit isn't being called.

My guess is that this will be related to the exact order of event firing, and in particular, the chain of set_content and set_bounds calls. It's not clear to me from your debug output if you're looked into set_bounds in particular; but historically, that's where this sort of bug will manifest in my experience. If the widget is laid out with a size allocation of 0x0, then the two sub-containers of the split will be set to minimum size, and the split will be set at the smallest viable position. When the "real" widget size is then applied, the previous split position will be preserved, leading to the wrong placement of the split that you're seeing. IIRC, this is why the call to applySplit is a message passed over ObjC, rather than an explicit Python call - it explicitly needs to be deferred so that the "real" widget size can take effect before any split is applied.

Also - is applySplit not being called at all, or is it being invoked as a no-op because _split_location has been cleared? I'm If it's not being called at all, then I wonder if there might be something raising an exception in the set_bounds() preventing the applySplit message from being sent, but then swallowing the exception for some reason.

Another possibility - does introducing a delay into the applySplit call fix things? That's not a good long-term fix, but adding delays like this might help isolate which sequence of events is the source of the problem.

Jan 06 '25 03:01 freakboy3742

Thank you for the insights, those are some good leads to look into. (I'm still learning how the nuts and bolts of the layout mechanism fit together.) Will update with findings once I've prodded some more.

Jan 06 '25 05:01 HalfWhitt

Hey y'all!!!

Right now with this PR I have no idea why but when I merge main in it works perfectly!!! And the log output produced is:

Log

Adding SplitContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set

Loop(
    #1. Laying out SplitContainer
        #2. Laying out OptionContainer
            #3. Laying out OptionContainer
                #4. Laying out Table
                #5. Laying out Tree
            #6. Laying out ScrollContainer
                #7. Laying out Box
            #8. Laying out OptionContainer
                #9. Laying out Table
                #10. Laying out Tree
            #11. Laying out ScrollContainer
                #12. Laying out Box
            #13. Laying out OptionContainer
                #14. Laying out Table
                #15. Laying out Tree
            #16. Laying out ScrollContainer
                #17. Laying out Box
            #18. Laying out Table
            #19. Laying out Tree
        #20. Laying out ScrollContainer
            #21. Laying out Box
    #22. Laying out ScrollContainer
        #23. Laying out Box
    #24. Laying out OptionContainer
        #25. Laying out Table
        #26. Laying out Tree
)

Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set

Loop(
    #27. Laying out ScrollContainer
        #28. Laying out Box
    #29. Laying out OptionContainer
        #30. Laying out Table
        #31. Laying out Tree
)

However when I revert #3793 I'm seeing the same thing as your non-working screenshot every single time, with this log --

Log

Adding SplitContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set

Loop(
    #1. Laying out ScrollContainer
    #2. Laying out OptionContainer
    #3. Laying out SplitContainer
        #4. Laying out OptionContainer
        #5. Laying out ScrollContainer
)

Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding Box to dirty set
Adding Table to dirty set
Adding Tree to dirty set
Adding Table to dirty set
Adding Tree to dirty set
Adding Box to dirty set

Loop(
    #6. Laying out Table
    #7. Laying out Box
    #8. Laying out Tree
    #9. Laying out ScrollContainer
    #10. Laying out OptionContainer
)

Adding Box to dirty set
Adding Table to dirty set
Adding Tree to dirty set

Loop(
    #11. Laying out Table
    #12. Laying out Box
    #13. Laying out Tree
)

So I think #3793 has unblocked (pun intended because the event loop is being blocked there) progress for this PR!

However I'm not sure if this might be a bit more of a side effect... in that PR I changed the set_bounds code to

    def set_bounds(self, x, y, width, height):
        super().set_bounds(x, y, width, height)

        # Setting the bounds changes the constraints, but that doesn't mean
        # the constraints have been fully applied. Force realization of the
        # new layout, and then refresh the content.
        self.interface.window._impl.native.layoutIfNeeded()
        self.native.refreshContent()

for ScrollContainer and OptionContainer; however, if I comment self.interface.window._impl.native.layoutIfNeeded() out it no longer works properly. But if I comment out that layout if needed line it somehow works properly on main.

My guess is that it always performs extra layout passes before (nearly typed bee-fore but was informed that this got old...) this PR so on the second pass refreshContent always works properly because of fully applied constraints from the first pass. If we could get this claim verified, then I think we should accept this PR sooner because self.interface.window._impl.native.layoutIfNeeded() is easy to miss and this PR will have the side effect of making us catch thesse bugs faster.

Oct 09 '25 22:10 johnzhou721

No... I just accidentally referenced like a huge bunch of issues because I missed a blank line in the details html5 tag and the code got parsed as regular text :facepalm: so things like #1 auto-backlinked.

Is there any way those backlinks from # 1 can be deleted?

Oct 09 '25 22:10 johnzhou721

Right now with this PR I have no idea why but when I merge main in it works perfectly!!! And the log output produced is:

That's awesome that fixing the containers-don't-live-update problem makes it work — I assume you mean that the app looks correct and doesn't open with any weird sizes. The issues I was having were definitely connected to the way Toga's CSS layouts "break" at containers, starting with a new root hierarchy as the content. So it makes sense that your fix would improve matters. Things (at least appear to) ultimately work correctly, and the overall number of layout calculations is dramatically cut down, compared to the existing codebase.

However, based on that log, it seems this PR still isn't working as intended. The intention is that no widget will get laid out more than once per event loop cycle. But in that log, there are lots of repeats — the OptionContainer alone is laid out five times, one of which appears to somehow be a sub-layout of its own layout. Honestly, there's a lot about the OS level of widget management and rehinting that I'm not yet familiar with, and I suspect a large part of the answer lies there.

I do intend to get back to this when I can, but if you feel like tinkering and can figure it out, please be my guest.

Is there any way those backlinks from # 1 can be deleted?

I only see one backlink on that issue... were there multiple? I took the liberty of editing your comment just now to make that not a link, to see if perhaps the backlink would disappear. No luck, unfortunately. Not the end of the world though.

Oct 10 '25 00:10 HalfWhitt

@HalfWhitt

The issue here is that while applying layout, the split location of the split container is changed which invokes the native Objective-C handler which will call refresh on the container contents, however since a refresh is in progress at this point those refresh calls are treated as literal relayouts.

This diff works after merging main into this and resolving a conflict caused by the layout debug mode thing:

diff --git a/cocoa/src/toga_cocoa/widgets/splitcontainer.py b/cocoa/src/toga_cocoa/widgets/splitcontainer.py
index 8ce5296d3..982261072 100644
--- a/cocoa/src/toga_cocoa/widgets/splitcontainer.py
+++ b/cocoa/src/toga_cocoa/widgets/splitcontainer.py
@@ -4,9 +4,12 @@ from travertino.size import at_least
 from toga.constants import Direction
 from toga_cocoa.container import Container
 from toga_cocoa.libs import NSSplitView
+import asyncio
 
 from .base import Widget
 
+async def refresh_interface(container):
+    container.content.interface.refresh()
 
 class TogaSplitView(NSSplitView):
     interface = objc_property(object, weak=True)
@@ -17,7 +20,7 @@ class TogaSplitView(NSSplitView):
         # If the split has moved, a resize of all the content panels is required.
         for container in self.impl.sub_containers:
             if container.content:
-                container.content.interface.refresh()
+                asyncio.create_task(refresh_interface(container))
 
         # Apply any pending split
         self.performSelector(

Note -- in the patch here refreshing interface MUST be done at the Python level else we'll hit the same issue as #3793.

(base) johnzhou@Johns-MacBook-Pro toga % python -m toga_demo
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set
Adding SplitContainer to dirty set

Loop(
    #1. Laying out SplitContainer
        #2. Laying out OptionContainer
            #3. Laying out Table
            #4. Laying out Tree
        #5. Laying out ScrollContainer
            #6. Laying out Box
)

Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set

Loop(
    #7. Laying out ScrollContainer
        #8. Laying out Box
    #9. Laying out OptionContainer
        #10. Laying out Table
        #11. Laying out Tree
)

Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set
Adding OptionContainer to dirty set
Adding ScrollContainer to dirty set

Loop(
    #12. Laying out ScrollContainer
        #13. Laying out Box
    #14. Laying out OptionContainer
        #15. Laying out Table
        #16. Laying out Tree
)

I do intend to get back to this when I can, but if you feel like tinkering and can figure it out, please be my guest.

Unfortunately I can't be your guest here since I can't think of anyway to put our service to the test (pun intended) but if someone can come up with a testing plan: be our guest.

EDIT (I got too excited about making the pun and forgot to finish the message) -- some remarks:

I realized that in the pun it isn't obvious that the test part is referring to like testing that these things don't regress. Therefore I apologize for this pun and I should probably make less of those even when there is actual signal in the message.
Instead of refreshing each container in _refresh_layout, would it help to deduplicate the needed-to-be-refreshed containers?
I wrote "since a refresh is in progress at this point those refresh calls are treated as literal relayouts." -- by literal relayouts I mean it's getting captured by line 351 in widget base file. But I have a question -- what's the usecase of that branch anyways? Thanks.

Sorry for being so assertive, I'm getting too excited and perhaps elated or gassy over this.

Oct 10 '25 01:10 johnzhou721

@HalfWhitt

Gtk3 situation (tested by manually uninstalling some testbed icons because they can't load properly on Fedora) is that there seems to already be some widget dirty-marking going on... this PR from my tests does not interfere with that; each container gets refreshed once per event loop (I mean the output is outside of the loop parenthesis because the layout code does not use core stuff, but still). A test where I made one of the buttons keep adding a glut of widgets to the right scroll also seems to cause 1 TOTAL container refresh for all the button insertions.

Winforms has the equivalent situtation. This patch works:

diff --git a/winforms/src/toga_winforms/widgets/splitcontainer.py b/winforms/src/toga_winforms/widgets/splitcontainer.py
index 812ef3aeb..b6998ad1b 100644
--- a/winforms/src/toga_winforms/widgets/splitcontainer.py
+++ b/winforms/src/toga_winforms/widgets/splitcontainer.py
@@ -10,6 +10,8 @@ from ..container import Container
 from ..libs.wrapper import WeakrefCallable
 from .base import Widget

+import asyncio
+

 class SplitContainer(Widget):
     def create(self):
@@ -23,9 +25,12 @@ class SplitContainer(Widget):
         self.panels = (Container(self.native.Panel1), Container(self.native.Panel2))
         self.pending_position = None

-    def winforms_splitter_moved(self, sender, event):
+    async def resize_content_async(self):
         self.resize_content()

+    def winforms_splitter_moved(self, sender, event):
+        asyncio.create_task(self.resize_content_async())
+
     def set_bounds(self, x, y, width, height):
         super().set_bounds(x, y, width, height)

I've pushed all these updates to johnzhou721/toga pr-3063 branch. Now that there's no mobile platforms' SplitContainers this should be completely fixed -- someone work out a testing plan!

@HalfWhitt My questions to you still applies -- I'm still curious about why line 351 of base widgets is there. Becuase if that branch is gone none of these patches would be needed anymore. Thank you!

Oct 10 '25 02:10 johnzhou721

Instead of refreshing each container in _refresh_layout, would it help to deduplicate the needed-to-be-refreshed containers?

I'm not entirely sure — I need to look more into how containers are actually implemented.

I wrote "since a refresh is in progress at this point those refresh calls are treated as literal relayouts." -- by literal relayouts I mean it's getting captured by line 351 in widget base file. But I have a question -- what's the usecase of that branch anyways? Thanks.

That's my mistake — I believe it's a holdover from a different approach I was trying.

Having looked back over this, okay, I do see how widgets are being added to the dirty set mid-loop. I remember now that I tried enforcing exactly one layout per root node per event loop, and doing a tree search so that it always goes from the highest/outermost root on down. Unfortunately, the interactions between containers and their contents go both ways; a container determines what size viewport its content root(s) get, but the content roots determine the container's minimum width and height. Therefore it's probably impossible to get a correct layout while laying out each root only once.

I'm not sure how to test this either... worst-case scenario, just making sure it doesn't break anything could be good enough if there's no good way to check what precisely is going on in the event loop when. (Obviously it's currently breaking things, at least in terms of tests.)

You've successfully gotten me thinking about this again, John, good job 😂

Oct 11 '25 04:10 HalfWhitt

@HalfWhitt So... you're welcome?

I see at widget base +362 you have

        # Is this needed here? Things seem to work fine without it.
        # self._impl.refresh()

So yes, this is needed. _impl.refresh is what does the "rehinting" which updates the preferred size of the widget.

Having looked back over this, okay, I do see how widgets are being added to the dirty set mid-loop. I remember now that I tried enforcing exactly one layout per root node per event loop, and doing a tree search so that it always goes from the highest/outermost root on down. Unfortunately, the interactions between containers and their contents go both ways; a container determines what size viewport its content root(s) get, but the content roots determine the container's minimum width and height. Therefore it's probably impossible to get a correct layout while laying out each root only once.

Yeah, I understand that, I'll have to think about it more tomorrow since it's getting late.

I'm not sure how to test this either... worst-case scenario, just making sure it doesn't break anything could be good enough if there's no good way to check what precisely is going on in the event loop when. (Obviously it's currently breaking things, at least in terms of tests.)

We aren't actually breaking anything; the tests needs to be updated -- I'm not so sure how though. We might introduce a tiny delay before asserting we've refreshed, and that might work; but then, all those tests aren't async. Are there any way to make the asyncio event loop run a single iteration to make sure the actions are performed within a single iteration?

I'd need guidance from, say, freakboy3742 on how to update the tests the best.

EDIT -- hold up. The testing issue might be tied to what I said here on the top of this message:

I see at widget base +362 you have

        # Is this needed here? Things seem to work fine without it.
        # self._impl.refresh()

Oct 11 '25 04:10 johnzhou721

I understand what it is (generally), I was just curious if it's somehow getting called elsewhere, because when I remove it, the demo, at least, still works fine.

Also the failures happen whether it's there or not. I'll try and take a look at it later.

Oct 11 '25 05:10 HalfWhitt