Prevent infinite recursion in first `_draw`
Caused by size changes fired in some cases, mainly by auto-translation, and due to that _redraw_callback is called from a flush causing any queued methods to be fired immediately.
Added pending_update = false; to the _exit_canvas method to ensure things behave reasonably in some edge cases like changing the world_2d or reparenting self during _draw which would otherwise never fire a redraw. This is not really reasonable use but I feel it's good to not introduce weird behavior in these cases.
The underlying cause here seems to be that when entering the tree a bunch of different calls are made including several that unconditionally resize the node. These are likely inefficient and could be improved but this offers a fix that ensures things behave reasonably in these cases.
So I'd say this is a functional fix, we can look at reducing unnecessary calls to queue_redraw and resizing calls on entering the tree etc. (it happens in things like NOTIFY_TRANSLATION_CHANGED and a bunch of others) but this should ensure things run smoothly.
MRP for testing: recursive_draw.zip
- Fixes: https://github.com/godotengine/godot/issues/95810
- Fixes: https://github.com/godotengine/godot/issues/95709
What seems to be happening here is that:
Nodereceives theNOTIFICATION_ENTER_TREEnotification, fires theNOTIFICATION_TRANSLATION_CHANGEDnotification, which queues a draw (and size changes as well, which might or might not be relevant to that part)CanvasItemreceives theNOTIFICATION_ENTER_TREEnotification, calls_enter_canvaswhich clearspending_updatewhich forces a second draw- When the message queue is flushed the two draw calls fire back-to-back
- The child node is added, this fires resizing in the child, which in turn fires resizing in the parent, when this happens in the second draw call
pending_updateis once againfalseas it was cleared at the end of the first draw, this allows a new draw to be queued immediately
If you remove the add_child call from the MRP and run it without this PR you will notice it draws twice, when it should only draw once, this removes that double call as well (which is the cause of the bug)
Unsure if this can be triggered in 4.2 but adding a cherry pick for it as the fix is limited in scope
Unsure if there's any weird behavior triggered by this same code in 3.x (it was added back before Godot was open source) but can make a dedicated fix for it if desired to eliminate any potential issues there too (don't have a 3.x setup to test myself right now)
The simplicity of this fix has me wonder if doing it this way has any unforeseen consequences.
I've dug through and tested various cases, the pending_update clearing in _enter_canvas only matters in the narrow edge case of when canvas operations happen during a draw call, as it will always be cleared otherwise, to be entirely safe (with the cost of losing the case of changing the world or re-parenting the node during a draw call) I could remove it in _exit_canvas, but there simply isn't any need for the pending_update clearing in _enter_canvas and it becoming a problem was really only revealed by the translation changes, it was always a risk just not caused by any existing condition (that I know of, there could be cases that cause this in older versions just haven't found any)
What is current status of this PR ?
It's awaiting review, it works well but needs some decision from the core team
The PR itself makes sense to me. But I am a little skeptical seeing as the behaviour changed between 4.2 and 4.3, but the code this PR changes hasn't been touched in many years.
I suspect that it would be best to solve this bug in the spot that introduced the regression otherwise we will be risking breaking other untested code paths in subtle ways
I agree, will look at pinpointing what changed and see if any changes to this fix are needed
I have an alternative solution and will push momentarily, will keep the current fix for a potential follow-up fix if we can confirm the changes are safe to future proof this code
Re-confirming this still fixes the linked bugs but this should be a conservative fix still resolving the problem
Edit: Confirmed, they all now work correctly
Great work tracking that down! This indeed seems like a much safer change to make. Pinging @YeldhamDev to confirm that the translation stuff still looks good
Thanks!
Thank you!