HikariCP icon indicating copy to clipboard operation
HikariCP copied to clipboard

Not removing a dead connection from the pool

Open greenwoodct opened this issue 2 years ago • 17 comments

HikariCP Version: 3.4.2, 3.4.5, and DEV

Symptom: Log entries: Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7676) at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:9408) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185) Once getting into the above, connection never recovers.

Root cause: HikariCP 3.4.2 HikariPool.java line 185 HikariCP 3.4.5 HikariPool.java line 186 if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection)))

If the above condition is true, the connection will be removed. But it will never be removed because of the and condition (&&).

Did not test on dev branch. But we may face the same issue. See HikariPool.java line 170: if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && isConnectionDead(poolEntry.connection)))

The above and condition (&&) can cause issues.

Possible solution: For HikariCP 3.4.2 and HikariCP 3.4.5 (Did not try other versions): if (poolEntry.isMarkedEvicted() || elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs || !isConnectionAlive(poolEntry.connection))

For DEV branch: if (poolEntry.isMarkedEvicted() || elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs || isConnectionDead(poolEntry.connection))

Thanks

greenwoodct avatar Aug 01 '21 17:08 greenwoodct

@greenwoodct That still doesn't make sense to me. When looking at the 3.4.2 branch (but the same would be true in other branches), the exception thrown in isConnectionAlive() will be caught, and false returned, which should result in the connection being closed.

https://github.com/brettwooldridge/HikariCP/blob/ab2f7209cd6a28e9a3ceba5b86eaa69a09074870/src/main/java/com/zaxxer/hikari/pool/PoolBase.java#L178-L183

EDIT: Just to be clear, the code in HikariPool is:

if ( /*evicted*/ || (/*not recently used*/ && !isConnectionAlive(poolEntry.connection)) )

The logical breakdown of the code is:

  • Close the connection if it is evicted, or
  • Close the connection if it as not been used in the last 500ms AND it is no longer alive

The key here being !isConnectionAlive() -- i.e. NOT isConnectionAlive(). In the case of an exception in isConnectionAlive() a value of false is returned. Therefore !isConnectionAlive() evaluates to true, and the connection will be closed.

The purpose of the "if it as not been used in the last 500ms" test is a performance optimization. Testing aliveness is expensive. If the connection has been used within the past 500ms we assume it is still alive. Removing this check, and always checking aliveness is possible, but will definitely have a large impact on performance.

To completely disable the last-used bypass, you can set the system property com.zaxxer.hikari.aliveBypassWindowMs=0.

However, if you are seeing this exception, then clearly the last-used time is greater than 500ms, and thus the isConnectionAlive() method is being called (and returning false), so I still don't see how the connection fails to close.

brettwooldridge avatar Aug 02 '21 00:08 brettwooldridge

In spring cloud environment, consul health check leads to frequent connection usage: Consul Health Check --> Spring-Boot-Actuator : Aggregation of /health endpoints --> One of the metrics is DB connection

Due to the Cloud network maintenance needs, network glitches up to five minutes are expected.

Once the isConnectionAlive() returns false, and when elapsedMillis(poolEntry.lastAccessed, now) < aliveBypassWindowMs, next call is: return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);

Now connection returns to Spring-Boot-Actuator and leads to exception due to the connection is bad.

Next, Spring-Boot-Actuator will re-try in a tight loop. the lastAccessed gets updated and staled connections are trapped in the pool forever.

Possible solution: For HikariCP 3.4.2 and HikariCP 3.4.5 (Did not try other versions): if (poolEntry.isMarkedEvicted() || !isConnectionAlive(poolEntry.connection))

For DEV branch: if (poolEntry.isMarkedEvicted() || isConnectionDead(poolEntry.connection))

Or: Adding -Dcom.zaxxer.hikari.aliveBypassWindowMs=0

Thanks

greenwoodct avatar Aug 02 '21 04:08 greenwoodct

Ok, let's run through two scenarios.

This is the code (v3.4.2) in getConnection():

if (poolEntry.isMarkedEvicted() || (elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs && !isConnectionAlive(poolEntry.connection))) {
   closeConnection(poolEntry, poolEntry.isMarkedEvicted() ? EVICTED_CONNECTION_MESSAGE : DEAD_CONNECTION_MESSAGE);
   timeout = hardTimeout - elapsedMillis(startTime);
}
else {
   metricsTracker.recordBorrowStats(poolEntry, startTime);
   return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now);
}

Scenario 1

None of the connections have been used in the past 500ms (they're basically idle).

Flow:

  • Consul Heath Check calls Spring-Boot-Actuator : Aggregation of /health endpoints
  • Spring-Boot-Actuator calls HikariDataSource.getConnection()
  • poolEntry.isMarkedEvicted() is false, || condition continues
  • elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs is true, && condition continues
  • isConnectionAlive(poolEntry.connection) is false because java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected is thrown and caught
  • closeConnection(poolEntry, poolEntry.isMarkedEvicted() ...) is called
  • We're all good, right?

Scenario 2

All of the connections have been used in the past 500ms (very busy pool).

Flow:

  • Consul Heath Check calls Spring-Boot-Actuator : Aggregation of /health endpoints
  • Spring-Boot-Actuator calls HikariDataSource.getConnection()
  • poolEntry.isMarkedEvicted() is false, || condition continues
  • elapsedMillis(poolEntry.lastAccessed, now) > aliveBypassWindowMs is false, && condition is never tested
  • else branch taken, poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), now) is returned

Questions:

  • If the exception occurs after the Connection is returned from the pool, why does the stacktrace you included indicate that it occurred in isConnectionAlive()? As far as I can tell, there is no way for that exception to occur in isConnectionAlive() and for that Connection to be returned.
  • If the Spring-Boot-Actuator is where an exception is encountered, i.e. ignoring the exception you provided, what does Spring-Boot-Actuator do with that Connection?

This is an important point: Any interaction with a Connection API (or Statement, ResultSet, etc.) that results in a SQLException will result in the Connection being evicted immediately from the pool, if the SQLState starts with "08" or matches one of these codes. You can see an example of such an eviction here.

So still, I cannot see how a dead connection persists in the pool. I'm looking for the loophole, if there is one, that I seem to be missing.

brettwooldridge avatar Aug 02 '21 18:08 brettwooldridge

The issues have been lasted for at least a month. May be the issue is not that simple. Error log: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms. at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:324) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.getProduct(DataSourceHealthIndicator.java:122) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.doDataSourceHealthCheck(DataSourceHealthIndicator.java:107) at org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator.doHealthCheck(DataSourceHealthIndicator.java:102) at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82) at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43) at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108) at org.springframework.boot.actuate.health.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119) at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:105) at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:83) at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:70) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:81) at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:70) at jdk.internal.reflect.GeneratedMethodAccessor86.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:282) at org.springframework.boot.actuate.endpoint.invoke.reflect.ReflectiveOperationInvoker.invoke(ReflectiveOperationInvoker.java:77) at org.springframework.boot.actuate.endpoint.annotation.AbstractDiscoveredOperation.invoke(AbstractDiscoveredOperation.java:60) at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$ServletWebOperationAdapter.handle(AbstractWebMvcEndpointHandlerMapping.java:305) at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$OperationHandler.handle(AbstractWebMvcEndpointHandlerMapping.java:388) at jdk.internal.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:105) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:879) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006) at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898) at javax.servlet.http.HttpServlet.service(HttpServlet.java:634) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883) at javax.servlet.http.HttpServlet.service(HttpServlet.java:741) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:109) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343) at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:373) at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1594) at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms. at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128) at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158) at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116) at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79) ... 76 common frames omitted Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7676) at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:9408) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185) ... 81 common frames omitted Caused by: oracle.net.ns.NetException: Invalid Operation, NOT Connected at oracle.net.nt.TcpNTAdapter.setOption(TcpNTAdapter.java:344) at oracle.net.ns.NSProtocol.setOption(NSProtocol.java:501) at oracle.net.ns.NSProtocol.setSocketReadTimeout(NSProtocol.java:748) at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:7673) ... 85 common frames omitted

2021-08-02 14:26:03,590 WARN DataSource health check failed ...... 2021-08-02 14:26:03,353 WARN DataSource health check failed ...... 2021-08-02 14:26:01,301 WARN DataSource health check failed ...

You are right, it does not look like a HikariCP issue

greenwoodct avatar Aug 02 '21 18:08 greenwoodct

Can you enable debug logging for com.zaxxer.hikari? It is not noisy.

brettwooldridge avatar Aug 02 '21 19:08 brettwooldridge

Project owner provided a good explanation.

greenwoodct avatar Aug 02 '21 19:08 greenwoodct

Enabled the following for a few months: spring.datasource.hikari.leakDetectionThreshold=29000 logging.level.com.zaxxer.hikari.HikariConfig=DEBUG

Thanks

greenwoodct avatar Aug 03 '21 00:08 greenwoodct

We are trying to use: -Dcom.zaxxer.hikari.aliveBypassWindowMs=0

The error logs were shorter: 2021-08-02 14:00:45,395 WARN HikariPool-1 - Failed to validate connection oracle.jdbc.driver.T4CConnection@1d407818 (IO Error: Invalid Operation, NOT Connected). Possibly consider using a shorter maxLifetime value.

Thanks

greenwoodct avatar Aug 03 '21 00:08 greenwoodct

When -Dcom.zaxxer.hikari.aliveBypassWindowMs=0 Created a lot of log entries 996 events (8/2/21 11:00:00.000 AM to 8/2/21 11:01:00.000 AM)

After reviewing your getConnection function, I prefer to implement it as follow:

public Connection getConnection(final long hardTimeout) throws SQLException { suspendResumeLock.acquire(); final long startTime = currentTime();

  try {
     long timeout = hardTimeout;
     
     do {
        PoolEntry poolEntry = connectionBag.borrow(timeout, MILLISECONDS);
        if (poolEntry == null) {
           break; // We timed out... break and throw exception
        }
        final long current_step_start_time = currentTime();
        if (poolEntry.isMarkedEvicted()) {
        	closeConnection(poolEntry, EVICTED_CONNECTION_MESSAGE);
        }
    	else if (!isConnectionAlive(poolEntry.connection)) {
    		closeConnection(poolEntry, DEAD_CONNECTION_MESSAGE);
    		long time_left = aliveBypassWindowMs - elapsedMillis(current_step_start_time, currentTime());
    		if (time_left > 0 ) {
    			Thread.sleep(time_left);
    		}
        }
        else {
           metricsTracker.recordBorrowStats(poolEntry, startTime);
           return poolEntry.createProxyConnection(leakTaskFactory.schedule(poolEntry), current_step_start_time);
        }
        timeout = hardTimeout - elapsedMillis(startTime);
     } while (timeout > 0L);

     metricsTracker.recordBorrowTimeoutStats(startTime);
     throw createTimeoutException(startTime);
  }
  catch (InterruptedException e) {
     Thread.currentThread().interrupt();
     throw new SQLException(poolName + " - Interrupted during connection acquisition", e);
  }
  finally {
     suspendResumeLock.release();
  }

}

greenwoodct avatar Aug 03 '21 03:08 greenwoodct

@greenwoodct Thanks for the feedback. However, you have not demonstrated where and how dead connections are not being removed from the pool.

The log message 2021-08-02 14:00:45,395 WARN HikariPool-1 - Failed to validate connection indicates removal from the pool.

And this...

When -Dcom.zaxxer.hikari.aliveBypassWindowMs=0 Created a lot of log entries 996 events (8/2/21 11:00:00.000 AM to 8/2/21 11:01:00.000 AM)

What is going on? In normal operation there should be tens of logs per hour (when DEBUG is enabled). Can you provide an example of these logs?

brettwooldridge avatar Aug 03 '21 16:08 brettwooldridge

The reason for the 996 events in a minute when aliveBypassWindowMs=0 is: getConnection --> isConnectionAlive --> setNetworkTimeout See log: at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556) at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185)

setNetworkTimeout throws exception immediately as network has a glitch and the connection is dead.

Now it continue in the do {..} while (timeout > 0L) loop as aliveBypassWindowMs is zero.

Sample log: 2021-08-02 11:00:55,983 WARN HikariPool-1 - Failed to validate connection oracle.jdbc.driver.T4CConnection@6f96b69b (IO Error: Invalid Operation, NOT Connected). Possibly consider using a shorter maxLifetime value. 2021-08-02 11:00:55,622 WARN HikariPool-1 - ... 2021-08-02 11:00:55,602 WARN DataSource health check failed... 2021-08-02 11:00:55,601 WARN HikariPool-1 - ... 2021-08-02 11:00:55,578 WARN HikariPool-1 - ... 2021-08-02 11:00:55,576 WARN HikariPool-1 - ...

greenwoodct avatar Aug 04 '21 00:08 greenwoodct

Just to echo greenwoodct, we seems to be having similar issue as well on our application on Azure connecting to MariaDB. I will see if I can grab some additional debug logs and show here as well, thanks.

NgSekLong avatar Sep 01 '21 04:09 NgSekLong

Our issues are also related to Azure environment. Connections were from Azure to on-premise DB servers. Look like it is not a Hikari issue, as Hikari removed all the staled connections. Java TCP Stack failed to recover from the outage.

greenwoodct avatar Sep 01 '21 14:09 greenwoodct

@greenwoodct Can you expand on what you found? My team is seeing the same behavior (HikariCP losing connections until it runs down to 0 connecting to an Azure DB). By chance, was Microsoft able to provide any resolution, or were they able to pinpoint the issue?

danielmustafa avatar Oct 22 '21 18:10 danielmustafa

For our case, we mitigated it by adding a socket timeout on the JDBC connection string.

While the root caused is still unknown, at least this solve our issue, see if this applicable to your cases as well, cheers!

NgSekLong avatar Oct 23 '21 11:10 NgSekLong

We seems to be the same behavior when we met a network incident(network break 15 mins).

It's interesting about we monitor result after network back: Our Jetty service have 5 connection pools by HikariCP to handle different DB connections of Oracle(x 4) and MySQL(x 1). It's work well on MySQL connection pools after network back, and auto reconnect MySQL smoothly. But it cannot auto reconnect on Oracle connection pools after network back, and always got error as below. Finally, we restart Jetty and then all 5 connection pools work well.

We guess Oracle JDBC driver maybe some connection behavior cannot work smoothly on HikariCP.

Caused by: org.apache.ibatis.exceptions.PersistenceException: 
### Error querying database.  Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
### The error may exist in db/payment2/PayDataMapper.xml
### The error may involve com.***.ecpayment.internal_webapp.dao.payment2.db.mybatis.PsiPayDataMapper.get
### The error occurred while executing a query
### Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
	at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
	at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:153)
	at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:145)
	at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
	at org.apache.ibatis.session.defaults.DefaultSqlSession.selectOne(DefaultSqlSession.java:76)
	at jdk.internal.reflect.GeneratedMethodAccessor99.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:427)
	... 105 common frames omitted
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
	at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:83)
	at org.mybatis.spring.transaction.SpringManagedTransaction.openConnection(SpringManagedTransaction.java:80)
	at org.mybatis.spring.transaction.SpringManagedTransaction.getConnection(SpringManagedTransaction.java:67)
	at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:337)
	at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:86)
	at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:62)
	at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:325)
	at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
	at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:136)
	at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:151)
	... 112 common frames omitted
Caused by: java.sql.SQLTransientConnectionException: payment2 - Connection is not available, request timed out after 5003ms.
	at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
	at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:159)
	at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:117)
	at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
	... 121 common frames omitted
Caused by: java.sql.SQLRecoverableException: IO Error: Invalid Operation, NOT Connected
	at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:10041)
	at oracle.jdbc.driver.PhysicalConnection.setNetworkTimeout(PhysicalConnection.java:10237)
	at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:566)
	at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
	... 126 common frames omitted
Caused by: oracle.net.ns.NetException: Invalid Operation, NOT Connected
	at oracle.net.nt.TcpNTAdapter.setOption(TcpNTAdapter.java:747)
	at oracle.net.ns.NSProtocol.setOption(NSProtocol.java:725)
	at oracle.net.ns.NSProtocol.setSocketReadTimeout(NSProtocol.java:1040)
	at oracle.jdbc.driver.T4CConnection.doSetNetworkTimeout(T4CConnection.java:10038)
	... 130 common frames omitted

lib version:

BOOT-INF/lib/mysql-connector-java-8.0.28.jar
BOOT-INF/lib/ojdbc11-21.5.0.0.jar
BOOT-INF/lib/HikariCP-4.0.3.jar
BOOT-INF/lib/mybatis-3.5.11.jar
BOOT-INF/lib/mybatis-spring-2.0.7.jar
BOOT-INF/lib/mybatis-spring-boot-starter-2.2.2.jar

MySQL connection pool config:

spring.datasource.invoice.driver-class-name: com.mysql.cj.jdbc.Driver
spring.datasource.invoice.username: ***
spring.datasource.invoice.password: ***
spring.datasource.invoice.poolName: invoice
spring.datasource.invoice.minimumIdle: 10
spring.datasource.invoice.maximumPoolSize: 50
spring.datasource.invoice.idleTimeout: 20000
spring.datasource.invoice.maxLifetime: 30000
spring.datasource.invoice.connectionTimeout: 5000
spring.datasource.invoice.leakDetectionThreshold: 20000

one of Oracle connection pool config:

spring.datasource.payment2.driver-class-name: oracle.jdbc.OracleDriver
spring.datasource.payment2.username: ***
spring.datasource.payment2.password: ***
spring.datasource.payment2.poolName: payment2
spring.datasource.payment2.minimumIdle: 10
spring.datasource.payment2.maximumPoolSize: 50
spring.datasource.payment2.idleTimeout: 20000
spring.datasource.payment2.maxLifetime: 30000
spring.datasource.payment2.connectionTimeout: 5000
spring.datasource.payment2.leakDetectionThreshold: 20000

dickchung avatar Dec 21 '23 12:12 dickchung

We faced similar situation when our database(Oracle) is down for a while and then going up again, we use kuberneties in our producation enviroment and it calls readiness endpoint of actuator every 10 seconds when db is down, does anybody find solution for this issue?

amirzaman8088 avatar Feb 24 '24 11:02 amirzaman8088