store
store copied to clipboard
Flaky deadlock during shutdown
Environment Details
- EclipseStore Version: 2.1.3
- JDK version: 21.0.1
- OS: Windows 11
- Used frameworks: JUnit 5
Describe the bug
Calling EmbeddedStorageManager.close() sometimes hangs because of a deadlock.
To Reproduce
I have not been able to reliably reproduce this. I have a bunch of unit tests that shutdown and start a new storage manager to verify that objects were actually persisted. Occasionally the shutdown in a random test will hang indefinitely.
Expected behavior
Success or exception, but not deadlock.
Additional context
I was able to find some details of the deadlock using jstack:
Found one Java-level deadlock:
=============================
"Eclipse Serializer-StorageChannel-0":
waiting to lock monitor 0x00000257712e9f90 (object 0x000000043bfc4318, a org.eclipse.store.storage.types.StorageBackupItemQueue$Default$Item),
which is held by "Eclipse Serializer-StorageBackupHandler"
"Eclipse Serializer-StorageBackupHandler":
waiting to lock monitor 0x000002576d396ec0 (object 0x000000043b4caaa0, a org.eclipse.store.storage.types.StorageLiveDataFile$Default),
which is held by "Eclipse Serializer-StorageChannel-0"
Java stack information for the threads listed above:
===================================================
"Eclipse Serializer-StorageChannel-0":
at org.eclipse.store.storage.types.StorageBackupItemQueue$Default.internalEnqueueItem(StorageBackupItemQueue.java:96)
- waiting to lock <0x000000043bfc4318> (a org.eclipse.store.storage.types.StorageBackupItemQueue$Default$Item)
at org.eclipse.store.storage.types.StorageBackupItemQueue$Default.enqueueCopyingItem(StorageBackupItemQueue.java:67)
at org.eclipse.store.storage.types.StorageFileWriterBackupping$Default.writeTransactionEntryDelete(StorageFileWriterBackupping.java:187)
at org.eclipse.store.storage.types.StorageFileManager$Default.writeTransactionsEntryFileDeletion(StorageFileManager.java:1248)
at org.eclipse.store.storage.types.StorageFileManager$Default.deleteFile(StorageFileManager.java:1566)
at org.eclipse.store.storage.types.StorageFileManager$Default.deletePendingFile(StorageFileManager.java:1433)
at org.eclipse.store.storage.types.StorageFileManager$Default$$Lambda/0x000002572840f5a8.accept(Unknown Source)
at org.eclipse.store.storage.types.StorageLiveDataFile$Default.lambda$executeIfUnsuedData$0(StorageLiveDataFile.java:257)
at org.eclipse.store.storage.types.StorageLiveDataFile$Default$$Lambda/0x0000025728630978.accept(Unknown Source)
at org.eclipse.store.storage.types.StorageLiveFile$Abstract.executeIfUnsued(StorageLiveFile.java:118)
- locked <0x000000043b4caaa0> (a org.eclipse.store.storage.types.StorageLiveDataFile$Default)
at org.eclipse.store.storage.types.StorageLiveDataFile$Default.executeIfUnsuedData(StorageLiveDataFile.java:256)
at org.eclipse.store.storage.types.StorageFileManager$Default.internalCheckForCleanup(StorageFileManager.java:1465)
at org.eclipse.store.storage.types.StorageFileManager$Default.incrementalFileCleanupCheck(StorageFileManager.java:1397)
at org.eclipse.store.storage.types.StorageChannel$Default.lambda$performFileCleanupCheck$0(StorageChannel.java:315)
at org.eclipse.store.storage.types.StorageChannel$Default$$Lambda/0x0000025728425038.perform(Unknown Source)
at org.eclipse.store.storage.types.StorageChannelHousekeepingResult.create(StorageChannelHousekeepingResult.java:33)
at org.eclipse.store.storage.types.StorageChannel$Default.performFileCleanupCheck(StorageChannel.java:314)
at org.eclipse.store.storage.types.StorageHousekeepingBroker$Default.performFileCleanupCheck(StorageHousekeepingBroker.java:113)
at org.eclipse.store.storage.types.StorageChannel$Default.houseKeepingCheckFileCleanup(StorageChannel.java:412)
at org.eclipse.store.storage.types.StorageChannel$Default$$Lambda/0x00000257284103e8.perform(Unknown Source)
at org.eclipse.store.storage.types.StorageChannel$Default.houseKeeping(StorageChannel.java:248)
at org.eclipse.store.storage.types.StorageChannel$Default.work(StorageChannel.java:479)
at org.eclipse.store.storage.types.StorageChannel$Default.run(StorageChannel.java:536)
at java.lang.Thread.runWith([email protected]/Thread.java:1596)
at java.lang.Thread.run([email protected]/Thread.java:1583)
"Eclipse Serializer-StorageBackupHandler":
at org.eclipse.store.storage.types.StorageFile$Abstract.close(StorageFile.java)
- waiting to lock <0x000000043b4caaa0> (a org.eclipse.store.storage.types.StorageLiveDataFile$Default)
at org.eclipse.store.storage.types.StorageLiveFile$Abstract.unregisterUsageClosing(StorageLiveFile.java:217)
at org.eclipse.store.storage.types.StorageBackupItemQueue$Default.processNextItem(StorageBackupItemQueue.java:135)
- locked <0x000000043bfc4318> (a org.eclipse.store.storage.types.StorageBackupItemQueue$Default$Item)
at org.eclipse.store.storage.types.StorageBackupHandler$Default.run(StorageBackupHandler.java:270)
at java.lang.Thread.runWith([email protected]/Thread.java:1596)
at java.lang.Thread.run([email protected]/Thread.java:1583)
Found 1 deadlock.
And here's the relevant part of the main thread's stack trace:
org.eclipse.store.storage.types.StorageChannelTask$Abstract.waitOnCompletion(StorageChannelTask.java:166)
org.eclipse.store.storage.types.StorageSystem$Default.internalShutdown(StorageSystem.java:553)
org.eclipse.store.storage.types.StorageSystem$Default.shutdown(StorageSystem.java:647)
org.eclipse.store.storage.embedded.types.EmbeddedStorageManager$Default.shutdown(EmbeddedStorageManager.java:408)
org.eclipse.store.storage.types.StorageController.close(StorageController.java:70)
Many thanks for that stack-trace. This is a bug that must be investigated.
Looks like housekeeping and backup are not correctly synchronized if a obsolete file should be removed…