[Improvement] Fix race condition in loadRole
What would you like to be improved?
The loadRole method (in core/src/main/java/org/apache/gravitino/authorization/AuthorizationRequestContext.java) isn’t thread-safe as it has a race condition. Because hasLoadRole.get() and hasLoadRole.set(true) are separate operations, multiple threads can see false and run the runnable simultaneously.
Suppose two threads call loadRole() at roughly the same time:
- Both check hasLoadRole.get() before it’s set and both see false.
- Both enter the if body and run runnable.run().
- Then both set hasLoadRole.set(true) afterward.
So you can end up with double (or more) executions of the runnable.
Here's a unit test to illustrate:
@Test
public void testLoadRoleRunsOnceEvenWhenInvokedConcurrently() throws Exception {
AuthorizationRequestContext context = new AuthorizationRequestContext();
AtomicInteger counter = new AtomicInteger();
CountDownLatch firstStarted = new CountDownLatch(1);
CountDownLatch allowFinish = new CountDownLatch(1);
Thread firstInvocation =
new Thread(
() ->
context.loadRole(
() -> {
counter.incrementAndGet();
firstStarted.countDown();
try {
allowFinish.await(5, TimeUnit.SECONDS);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}));
firstInvocation.start();
try {
assertTrue(firstStarted.await(5, TimeUnit.SECONDS));
context.loadRole(() -> counter.incrementAndGet());
assertEquals(1, counter.get(), "loadRole should execute runnable only once");
} finally {
allowFinish.countDown();
firstInvocation.join();
}
context.loadRole(() -> counter.incrementAndGet());
assertEquals(1, counter.get(), "Subsequent loadRole calls should be ignored");
}
How should we improve?
AtomicBoolean makes single operations thread-safe, but not combinations of them, use compareAndSet() instead.
@HatshMehta112 you need to comment here so I can assign it to you.
Your fix does fix the race condition, but I think you may need to do a little more. There are two issues here I think 1) If runnable.run() fails, the flag stays true, even though the role didn’t load. 2) The meaning has changed from "has loaded" to "has started loading" this may have implications for the code that calls loadRole().
@justinmclean Good catch — you’re right that the current implementation sets the flag before running runnable.run(), which means it could remain true even if the load fails. It also changes the flag’s meaning from “has loaded” to “has started loading.”
Can we do it like
if (!hasLoadRole.compareAndSet(false, true)) {
return;
}
try {
runnable.run();
} catch (Exception e) {
hasLoadRole.set(false); // Reset flag if loading fails
throw e;
}
This ensures the flag still means “has loaded” and handles failure cases correctly.
That's better and fixes 1, but I think it's best to check how this is called to see if “has loaded” to “has started loading" has an impact.
@justinmclean, I observed the following issues:
- The method is called from
JcasbinAuthorizer.loadRolePrivilege(). - The runnable modifies shared state (
allowEnforcer,denyEnforcer). - It performs I/O operations (database reads).
- Multiple threads could corrupt the enforcer state if they run concurrently.
Recommendation
It’s better to implement double-checked locking as shown below:
public void loadRole(Runnable runnable) {
if (hasLoadRole.get()) {
return;
}
synchronized (this) {
// Double-check after acquiring lock (prevents race)
if (hasLoadRole.get()) {
return;
}
try {
runnable.run(); // Execute the loading
hasLoadRole.set(true); // Mark as loaded ONLY after success
} catch (Exception e) {
// Don't set flag on failure - allow retry
throw e;
}
}
}
If role loading failsfor any reason the flag remains false, allowing a retry.
I’d appreciate your feedback or suggestions on this.
Yes, that looks much better.
Yes, that looks much better.
@justinmclean Thank you for your feedback. I’ve updated the PR with the requested changes. Please review it at your convenience.