HikariCP icon indicating copy to clipboard operation
HikariCP copied to clipboard

Synchronizing System.getProperties in generatePoolName / Number creates a deadlock-able window.

Open kialambroca opened this issue 1 year ago • 7 comments

Since Hikari is a library, it should not rely on synchronizing on any objects not within it's own scope, Interactions between code either more shallow or deeper in the stack can create opportunities for a deadlock. It certainly shouldn't synchronize on on the System.getProperties() Properties object.

The code appears to facilitate sharing of the pool name across classloaders. This could be obviated by using the hashcode of a specific bootloader loaded class.

https://github.com/brettwooldridge/HikariCP/blob/2e07be7b325b05f7225611c60f421e253e1107a5/src/main/java/com/zaxxer/hikari/HikariConfig.java#L1169

kialambroca avatar Sep 05 '23 18:09 kialambroca

I can think of potential sequence of events that could lead to a deadlock:

  1. Thread A enters the synchronized block in the code to update the "com.zaxxer.hikari.pool_number" property. It acquires the lock on System.getProperties().
  2. Thread B, from another class, also needs to access a system property using System.getProperty(String key). Since System.getProperty is synchronized internally, Thread B attempts to acquire the same lock on System.getProperties().
  3. Thread A, while still inside the synchronized block, tries to execute System.getProperty("com.zaxxer.hikari.pool_number") to read the property value.
  4. Thread B is already holding the lock on System.getProperties() while executing its System.getProperty call. This means Thread A is blocked because it cannot acquire the lock held by Thread B.
  5. Thread B, while still inside the synchronized System.getProperty call, tries to acquire the lock for the synchronized block in your code to access or modify another property.
  6. Thread B cannot acquire the lock because Thread A is holding it while executing your synchronized block.

There is a possibility of both Thread A and Thread B are in a deadlock situation.

bvsvas avatar Sep 06 '23 04:09 bvsvas

That is exactly what's happening. There are several thousand third party classes that are being processed + initialized in another thread and some of them call System.getProperties. so even though we have a small window, in practice, in a slower, containerized environment with limited cores, we're seeing the deadlock occur approximately 1/4 times.

kialambroca avatar Sep 06 '23 04:09 kialambroca

If someone creates a pull request with a fix for this issue, I will gladly merge it.

lfbayer avatar Sep 15 '23 16:09 lfbayer

We can fix this in a variety of ways, but I want to be clear on the potential side effects for other users of the library other than my use case. There is always one single HikariCP per JVM, correct? two wars with different database configurations still have a single HikariCP ? If that's right, I'm happy to deliver a PR.

kialambroca avatar Sep 21 '23 23:09 kialambroca

@kialambroca There can be multiple HikariCP libraries loaded into a single JVM. Two wars is a perfect example of the use-case that the current code is attempting to account for.

lfbayer avatar Oct 22 '23 16:10 lfbayer

// AS-IS
// Pool number is global to the VM to avoid overlapping pool numbers in classloader scoped environments
threadPoolName = "HikariPool-1, 2, 3, ...";

// TO-BE
static UUID = system.getUUID(); 
AtomicInteger poolNum;

public generatePoolName() {
  threadPoolName = "HikariPool-{UUID}-{poolNum.incrementAndGet()}"; // HikariPool-{UUID}-1, 2, 3, ...

  // On same JVM, war1 pool name: HikariPool-{war1's UUID}-1, 2, 3, ..
  // war2 pool name: HikariPool-{war2's UUID}-1, 2, 3, ..
}

Hi @lfbayer ! I think you may already consider this way, but can't we just change threadPoolName format like above?

With System.setProperty("com.zaxxer.hikari.pool_number", next);, I think it's really hard to generate global unique, incremental poolNumber over JVM cause System.setProperty doesn't support locking over several wars well..

WDYT? :)

injae-kim avatar Mar 27 '24 09:03 injae-kim

Hello, @lfbayer

I've been investigating the implementation of a lock mechanism using system properties and found it quite challenging. In search of an alternative, I developed a port-based lock approach to address race conditions from classes loaded multiple times by different classloaders.

I've documented this approach in an example and test case here: PortLock Example. While it's still a work in progress, I'm keen to hear your thoughts on this direction or if we should consider refining the poolName format as an alternative.

Looking forward to your feedback.

Thank you.

owlur avatar Apr 04 '24 16:04 owlur