Race Condition and ABBA Deadlock
Checking whether a bundle is active in celix_bundleContext_useBundles callback suffers from check-then-act race condition.
This race condition is a minor one, but immediate uninstall of a bundle after framework startup may crash the program.
Codecov Report
Merging #645 (0c2c973) into master (91ff81c) will increase coverage by
0.02%. The diff coverage is100.00%.
:exclamation: Current head 0c2c973 differs from pull request most recent head beb910e. Consider uploading reports for the commit beb910e to get more accurate results
@@ Coverage Diff @@
## master #645 +/- ##
==========================================
+ Coverage 81.58% 81.61% +0.02%
==========================================
Files 260 260
Lines 34679 34678 -1
==========================================
+ Hits 28294 28303 +9
+ Misses 6385 6375 -10
| Files Changed | Coverage Δ | |
|---|---|---|
| libs/framework/src/dm_dependency_manager_impl.c | 83.04% <100.00%> (+2.09%) |
:arrow_up: |
... and 3 files with indirect coverage changes
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
This PR manifests a deadlock in test_components_ready.
~~More investigation is needed.~~
It turns out to be an insidious ABBA deadlock:
- When
celix_componentsReadyCheck_destroyis called, ready_check'sfsmMutexis locked, andcelix_componentsReadyCheck_destroywill try to use the Celix event loop. - When
celix_componentReadyCheck_checkcallscelix_dependencyManager_allComponentsActive, which in turn tries to check ready_check's readiness, the Celix event loop is occupied and the current thread tries to acquire ready_check'sfsmMutex.
We need to work out a fix for this deadlock before fixing the race condition. @pnoltes
~~A proper fix might be to exclude ready_check from readiness checking so that ready_check's fsmMutex don't have to be acquired for that purpose. For example, the following interface should allow us to deselect ready_check:~~
CELIX_FRAMEWORK_EXPORT size_t celix_framework_useActiveSelectedBundles(celix_framework_t* fw,
bool includeFrameworkBundle,
void* callbackHandle,
bool (*select)(void* handle, const celix_bundle_t* bnd),
void (*use)(void* handle, const celix_bundle_t* bnd));
~~More generally, useBundlesWithOptions could be added.~~
~~Such API changes should be postponed after 2.4.0.~~
The above proposed fix will not work, because the same ABBA deadlock could happen for other bundles when stopping them.
For now I'll leave this PR open as a reminder.