Deadlock in NSUserDefaults
Here's an annotated stack dump:
(gdb) thread apply all bt
Thread 2 (Thread 0x7f1c5767d700 (LWP 2242095)):
#0 0x00007f1c5c489170 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f1c5c481131 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
We are now trying to acquire classLock
#2 0x00007f1c5a7ae3fe in -[NSRecursiveLock lock] (self=0x1fff488, _cmd=0x7f1c5aae0cb0 <objc_selector_list+592>) at NSLock.m:422
#3 0x00007f1c5a83e724 in +[NSUserDefaults standardUserDefaults] (self=0x1fa67e0, _cmd=<optimized out>) at NSUserDefaults.m:830
#4 0x00007f1c5a8249c4 in +[NSTimeZone(Private) _notified:] (self=0x2020fc0, _cmd=<optimized out>, n=<optimized out>) at NSTimeZone.m:2454
#5 0x00007f1c5a7b6135 in -[NSNotificationCenter _postAndRelease:] (self=<optimized out>, _cmd=<optimized out>, notification=<optimized out>) at NSNotificationCenter.m:1198
We now hold _lock (acquired at NSUserDefaults.m:2311)
#6 0x00007f1c5a84269c in -[NSUserDefaults(Private) _changePersistentDomain:] (self=0x23e1808, _cmd=<optimized out>, domainName=0x25ec038) at NSUserDefaults.m:2328
At this point this thread holds _lock (acquired at NSUserDefaults.m:1800)
#7 0x00007f1c5a84150b in -[NSUserDefaults synchronize] (self=0x23e1808, _cmd=<optimized out>) at NSUserDefaults.m:1915
#8 0x00007f1c5a83e97a in +[NSUserDefaults standardUserDefaults] (self=0x1fa67e0, _cmd=<optimized out>) at NSUserDefaults.m:927
#9 0x00007f1c5a842130 in GSPrivateDefaultsFlag (type=GSLogThread) at NSUserDefaults.m:2162
#10 0x00007f1c5a7b03b9 in NSLogv (format=0x7f1c5b0e2380 <objc_str>, args=0x7f1c5767c960) at NSLog.m:356
#11 0x00007f1c5a7b0378 in NSLog (format=0x1fff490) at NSLog.m:299
#12 0x00007f1c5afa0b4f in -[UKGenericEventLoop run] (self=0x250d748, _cmd=0x7f1c5b0e2aa0 <objc_selector_list+336>) at UKGenericEventLoop.m:188
#13 0x00007f1c5a81dbec in nsthreadLauncher (thread=0x20514a8) at NSThread.m:1351
#14 0x00007f1c5c47e609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#15 0x00007f1c5a1de533 in clone () from /lib/x86_64-linux-gnu/libc.so.6
Thread 1 (Thread 0x7f1c5b8172c0 (LWP 2242026)):
#0 0x00007f1c5c489170 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f1c5c481131 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f1c5a7ae3fe in -[NSRecursiveLock lock] (self=0x2562708, _cmd=0x7f1c5aae0cb0 <objc_selector_list+592>) at NSLock.m:422
Trying to acquire _lock
#3 0x00007f1c5a841172 in -[NSUserDefaults synchronize] (self=0x23e1808, _cmd=<optimized out>) at NSUserDefaults.m:1800
We now hold classLock (acquired at NSUserDefaults.m:666)
#4 0x00007f1c5a83dfe2 in +[NSUserDefaults resetStandardUserDefaults] (self=0x1fa67e0, _cmd=<optimized out>) at NSUserDefaults.m:687
We are now holding gnustep_global_lock (acquired at NSPathUtilities.m:1611)
#5 0x00007f1c5a7caf8b in GSSetUserName (aName=<optimized out>) at NSPathUtilities.m:1620
#6 0x00007f1c5b4d97bc in -[AkamaiDaemon switchUidToAkamai] (self=0x216b408, _cmd=<optimized out>) at AkamaiDaemon.m:1177
#7 0x00007f1c5b4d9168 in -[AkamaiDaemon initWithArgc:argv:] (self=0x216b408, _cmd=<optimized out>, argc=3, argv=0x7ffd30e7c948) at AkamaiDaemon.m:1095
#8 0x00007f1c5bd41dfd in -[ConfigloaderDaemon initWithArgc:argv:] (self=0x216b408, _cmd=<optimized out>, argc=2, argv=0x7f1c5c489170 <__lll_lock_wait+48>) at ConfigloaderDaemon.m:175
#9 0x0000000000460bdd in -[MCLDaemon initWithArgc:argv:] (self=0x216b408, _cmd=<optimized out>, argc=2, argv=0x7f1c5c489170 <__lll_lock_wait+48>) at mcl/MCLDaemon.m:789
#10 0x000000000045d7e2 in main (argc=2, argv=0x7f1c5c489170 <__lll_lock_wait+48>) at mcl/mcl.m:15
(gdb)
I found and fixed a place where a notification was erroneously posted inside a lock-protected region. Does that resolve this deadlock?
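For anyone reading along, the general shape of that kind of fix can be sketched in plain C with pthreads. This is only an illustration, not the actual gnustep-base change: class_lock, instance_lock, observer, change_domain and the worker functions are hypothetical stand-ins for the names in the backtrace. The idea is to update state under the instance lock, release it, and only then call observers, so an observer that needs the class lock never runs while the instance lock is held.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in the backtrace. */
static pthread_mutex_t class_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t instance_lock = PTHREAD_MUTEX_INITIALIZER;

static int domain_version = 0;  /* guarded by instance_lock */
static int observer_calls = 0;  /* guarded by class_lock */

/* An observer that needs class_lock, much as the notification handler
 * in thread 2 ends up needing classLock. */
static void observer(int version)
{
    (void)version;
    pthread_mutex_lock(&class_lock);
    observer_calls++;
    pthread_mutex_unlock(&class_lock);
}

/* Change state under instance_lock, but notify only after releasing it. */
static void change_domain(void)
{
    int snapshot;

    pthread_mutex_lock(&instance_lock);
    domain_version++;
    snapshot = domain_version;
    pthread_mutex_unlock(&instance_lock);

    observer(snapshot);  /* no lock is held at this point */
}

/* Mirrors thread 1: takes class_lock first, then instance_lock. */
static void *reset_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&class_lock);
        pthread_mutex_lock(&instance_lock);
        domain_version++;
        pthread_mutex_unlock(&instance_lock);
        pthread_mutex_unlock(&class_lock);
    }
    return NULL;
}

/* Mirrors thread 2, with the notification moved outside the lock. */
static void *change_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        change_domain();
    return NULL;
}

/* Runs both workers to completion; returns the observer call count,
 * or -1 if a thread could not be created. */
static int run_demo(void)
{
    pthread_t a, b;

    if (pthread_create(&a, NULL, reset_worker, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, change_worker, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return observer_calls;
}
```

Because change_domain never holds instance_lock while it waits for class_lock, the two workers cannot form the lock cycle shown in the backtrace. If observer were called before the unlock instead, the same two workers could block each other exactly as threads 1 and 2 do.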
Your fix looks plausible. Unfortunately, the deadlock occurs very rarely, and our release process makes it harder to pick up a new version of gnustep-base quickly than to apply a simple workaround: call NSLog at least once before creating any threads.
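To make the reasoning behind the workaround concrete, here is a rough C model of it (pthreads, not Objective-C, and not our real code). It assumes, based on frames #5 to #11 of thread 2, that the first NSLog call is what triggers first-use setup of the shared defaults. The names ensure_defaults, log_message, worker and start_daemon are all hypothetical. The point is only that the first-use path runs while a single thread exists, so nothing else can be holding the other lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t class_lock = PTHREAD_MUTEX_INITIALIZER;

static int defaults_ready = 0;            /* guarded by class_lock */
static int first_time_inits = 0;          /* guarded by class_lock */
static int init_was_single_threaded = 0;  /* guarded by class_lock */
static int workers_started = 0;           /* written before any thread is created */

/* Lazily sets up the shared state. In this model the first call stands in
 * for the synchronize and notification path seen in thread 2. */
static void ensure_defaults(void)
{
    pthread_mutex_lock(&class_lock);
    if (!defaults_ready) {
        first_time_inits++;
        init_was_single_threaded = !workers_started;
        defaults_ready = 1;
    }
    pthread_mutex_unlock(&class_lock);
}

/* Stand-in for a logging call that consults the shared defaults first. */
static void log_message(const char *msg)
{
    ensure_defaults();
    fprintf(stderr, "%s\n", msg);
}

static void *worker(void *arg)
{
    (void)arg;
    log_message("worker running");  /* first-use path has already run */
    return NULL;
}

/* Returns 1 if first-use setup ran before any worker existed,
 * 0 if it did not, and -1 if the thread could not be created. */
static int start_daemon(void)
{
    pthread_t t;

    log_message("starting up");  /* the workaround: log once, single-threaded */

    workers_started = 1;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return init_was_single_threaded;
}
```

In this model, removing the first log_message call from start_daemon would move the first-use setup into the worker thread, which is the situation the backtrace shows.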
I will try removing the workaround and testing your fix in the future, but for now I have to go with the workaround.
Closing on the assumption that the fix I added addresses the cause of the problem. If it recurs, we can reopen this issue or open a new one.