[BUG] Incorrect usage of mutex.unlock is causing deadlock
Godot version 4.4.2-mono (The only thing we changed for templates is our pack encryption key)
Plugin version 1.4.1
Platform Windows 11
Issue description This code keeps getting executed in a _process function:
https://github.com/dreadpon/godot_spatial_gardener/blob/77b822f50ded7211f91c0f0588d7b7c8bedac5f6/addons/dreadpon.spatial_gardener/arborist/arborist.gd#L716-L719
This means we might have thousands of calls to mutex_octree.unlock() while the mutex is not even locked!
The godot docs clearly state that this might cause deadlocks or crashes:
So to be able to find out if the hanging issue in my friend's project is actually fault of these calls to unlock, I replaced all of Mutex.new instances to MyCoolMutex.new (yes, I know this is wrong, but just for testing). Here is the code:
class MyCoolMutex:
var is_locked: bool = false
var _my_m: Mutex = null
func _init() -> void:
self._my_m = Mutex.new()
self.is_locked = false
func lock():
if self.is_locked:
print("OH NO WE ARE ALREADY LOCKED")
return
self.is_locked = true
self._my_m.lock()
func unlock():
if not self.is_locked:
print("NOT LOCKED AT ALL")
return
self.is_locked = false
self._my_m.unlock()
Then made sure all of them are using this:
var mutex_octree: MyCoolMutex = MyCoolMutex.new()
# Mutex that handles working with any meta variables relating to LOD updates
var mutex_LOD_update_meta:MyCoolMutex = MyCoolMutex.new()
func _init():
set_meta("class", "Arborist")
resource_local_to_scene = true
logger = Logger.get_for(self)
mutex_octree = MyCoolMutex.new()
mutex_LOD_update_meta = MyCoolMutex.new()
what did I observe?
Please note the return statement at the end of unlock, so this version of code does not hang, because it simply returns from the function and the mutex is not really being unlocked lots of times at all.
Now what if I comment out these three lines in MyCoolMutex?
func unlock():
#if not self.is_locked:
#print("NOT LOCKED AT ALL")
#return
self.is_locked = false
self._my_m.unlock()
https://github.com/user-attachments/assets/910f5852-1f6a-42df-9b9c-3af99d7ba731
Yup, that's a deadlock, on this line:
https://github.com/dreadpon/godot_spatial_gardener/blob/77b822f50ded7211f91c0f0588d7b7c8bedac5f6/addons/dreadpon.spatial_gardener/arborist/arborist.gd#L110-L115
Which will cause the whole game to hang, and when running inside of the editor, the moment we add gardener node to our scene...it will just hang.
NOTE: This issue seems like os-dependent (or maybe it depends on some other factors?), because my friend on linux cannot reproduce this issue at all and he says everything is working fine for him...so I guess Mutex.unlock behavior is different on different operating systems and sometimes it might not cause this issue...not sure
Minimal reproduction project
NOTE: The reproduction project might not result in a deadlock for everyone...I'm not sure if we can make that happen...but all I know is that calling unlock like this is wrong...
Proposed solution: To be honest I would prefer to not use mutexes like this...but one solution would be to convert this code to this:
func request_debug_redraw():
if debug_redraw_requested_managers.is_empty():
mutex_octree.try_lock() # adding this line would solve the issue
mutex_octree.unlock()
return
var requested_indexes := []
# rest of the code here...
Yes, this will fix the issue, at least on the sample project I provided. At least this is the temporary solution I came up with.