[Bug] [Worker Plugin] deadlock when using Multiple JDBC Drivers
Search before asking
- [x] I had searched in the issues and found no similar issues.
What happened
When a worker receives an SQL task for the first time, it loads the driver class. If there are many tasks of multiple datasource types concurrently for the first time, a deadlock may occur, causing the thread to hang.
see: https://medium.com/@priyaaggarwal24/avoiding-deadlock-when-using-multiple-jdbc-drivers-in-an-application-ce0b9464ecdf
What you expected to happen
Workers run normally.
How to reproduce
A number of SQL task instances from different data source types were already queued up and awaiting execution before the Worker started.
Here is a demo.
public class DeadLockTest {
public static void main(String[] args) throws InterruptedException {
ExecutorService es = Executors.newFixedThreadPool(2);
es.execute(() -> {
try {
Class.forName("com.mysql.cj.jdbc.Driver");
} catch (ClassNotFoundException e) {
throw new RuntimeException(e);
}
});
es.execute(() -> {
try {
Class.forName("com.clickhouse.jdbc.ClickHouseDriver");
} catch (ClassNotFoundException e) {
throw new RuntimeException(e);
}
});
es.shutdown();
while (!es.isTerminated()) {
Thread.sleep(500);
System.out.println("waiting");
}
System.out.println("end");
}
}
openjdk version: 1.8.0_412
jstack result:
Anything else
suggestion: remove Class.forName while getting connection.
Version
3.2.x
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
Using lock can avoid this?
Using lock can avoid this?
I don't think this is a good idea or that it's necessary. Based on the information I've reviewed, it's generally accepted that manually calling Class.forName to load the driver class is no longer required in Java 6 and later versions. Of course, I can't guarantee this applies to all data source types without issues. If you consider this to be a risk, I've also looked into Storm's approach for your reference. https://issues.apache.org/jira/browse/STORM-2527
Using lock can avoid this?
I don't think this is a good idea or that it's necessary. Based on the information I've reviewed, it's generally accepted that manually calling Class.forName to load the driver class is no longer required in Java 6 and later versions. Of course, I can't guarantee this applies to all data source types without issues. If you consider this to be a risk, I've also looked into Storm's approach for your reference. https://issues.apache.org/jira/browse/STORM-2527
OK, I got you, you are right, we don't need to call Class.forName here, it just registers the driver into DriverManager. This seems bug in driver, we should remove Class.forName here.
In additional, we might better initialize the driver and use the driver to getConnection, i am not sure if the DriverManager can find the suitable Driver, since multiple driver might be matched.
@ruanwenjun
After JDBC 4.0 (Java 6) DriverManager will invoke all the registered drivers' acceptsURL method to find out if it is the matched driver by the connection Url
jdbc:mysql://localhost:3306/dolphinscheduler
jdbc:postgresql://localhost:5432/dolphinscheduler
This is the driver match method in DriverManger
public static Driver getDriver(String url)
throws SQLException {
println("DriverManager.getDriver(\"" + url + "\")");
Class<?> callerClass = Reflection.getCallerClass();
// Walk through the loaded registeredDrivers attempting to locate someone
// who understands the given URL.
for (DriverInfo aDriver : registeredDrivers) {
// If the caller does not have permission to load the driver then
// skip it.
if(isDriverAllowed(aDriver.driver, callerClass)) {
try {
if(aDriver.driver.acceptsURL(url)) {
// Success!
println("getDriver returning " + aDriver.driver.getClass().getName());
return (aDriver.driver);
}
} catch(SQLException sqe) {
// Drop through and try the next driver.
}
} else {
println(" skipping: " + aDriver.driver.getClass().getName());
}
}
println("getDriver: no suitable driver");
throw new SQLException("No suitable driver", "08001");
}
So just delete Class.forName and let the DriverManager do the driver class loading job will be better.
I'm glad to fix this issue, you can assign it to me.
@ruanwenjun DriverManager will invoke all the registered drivers' acceptsURL method to find out if it is the matched driver by the connection Url
jdbc:mysql://localhost:3306/dolphinschedulerjdbc:postgresql://localhost:5432/dolphinschedulerThis is the driver match method in DriverManger
public static Driver getDriver(String url) throws SQLException {
println("DriverManager.getDriver(\"" + url + "\")"); Class<?> callerClass = Reflection.getCallerClass(); // Walk through the loaded registeredDrivers attempting to locate someone // who understands the given URL. for (DriverInfo aDriver : registeredDrivers) { // If the caller does not have permission to load the driver then // skip it. if(isDriverAllowed(aDriver.driver, callerClass)) { try { if(aDriver.driver.acceptsURL(url)) { // Success! println("getDriver returning " + aDriver.driver.getClass().getName()); return (aDriver.driver); } } catch(SQLException sqe) { // Drop through and try the next driver. } } else { println(" skipping: " + aDriver.driver.getClass().getName()); } } println("getDriver: no suitable driver"); throw new SQLException("No suitable driver", "08001"); }So just delete
Class.forNameand let the DriverManager do the driver class loading job will be better.
Some URLs may be matched by multiple drivers simultaneously, e.g., hive2 might be matched by hive/kyuubi driver, mysql might be matched by mysql/oceanbase driver?
Some URLs may be matched by multiple drivers simultaneously, e.g., hive2 might be matched by hive/kyuubi driver, mysql might be matched by mysql/oceanbase driver?
@ruanwenjun If multiple drivers are matched, the DriverManager selects the first one registered. This selection process remains the same even when drivers are loaded using Class.forName.
Some URLs may be matched by multiple drivers simultaneously, e.g., hive2 might be matched by hive/kyuubi driver, mysql might be matched by mysql/oceanbase driver?
@ruanwenjun If multiple drivers are matched, the DriverManager selects the first one registered. This selection process remains the same even when drivers are loaded using Class.forName.
So it would be better to get connection by driver rather than driver manager.
So it would be better to get connection by driver rather than driver manager.
@ruanwenjun Exactly!