HikariCP icon indicating copy to clipboard operation
HikariCP copied to clipboard

Failed to validate connection org.mariadb.jdbc.MariaDbConnection (Connection.setNetworkTimeout cannot be called on a closed connection).

Open duplxey opened this issue 5 years ago • 7 comments

Environment

HikariCP version: 3.3.1
JDK version     : 1.8.0_111
Database        : 5.5.60-MariaDB
Driver version  : 2.4.2 (org.mariadb.jdbc.mariadb-java-client)

Problem

HikariCP & database connection works well as long as it doesn't idle for more than ~10 minutes. If you try to execute a query after 10 minutes of idling the following warnings & errors will be thrown:

[JDA MainWS-ReadThread] WARN com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Failed to validate connection org.mariadb.jdbc.MariaDbConnection@2b6d3853 (Connection.setNetworkTimeout cannot be called on a closed connection). Possibly consider using a shorter maxLifetime value.
java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30025ms.
	at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:697)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161)
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100)
	at net.cubeverse.cvdiscordbot.database.Database.getConnection(Database.java:45)
	at net.cubeverse.cvdiscordbot.database.DiscordVerifyDAO.isVerified(DiscordVerifyDAO.java:81)
	at net.cubeverse.cvdiscordbot.command.discord.VerifyCommand.execute(VerifyCommand.java:25)
	at net.cubeverse.cvdiscordbot.core.command.manager.CommandManager.executeDiscordCommand(CommandManager.java:34)
	at net.cubeverse.cvdiscordbot.listeners.MessageListeners.onPrivateMessageReceived(MessageListeners.java:21)
	at net.dv8tion.jda.core.hooks.ListenerAdapter.onEvent(ListenerAdapter.java:397)
	at net.dv8tion.jda.core.hooks.InterfacedEventManager.handle(InterfacedEventManager.java:84)
	at net.dv8tion.jda.core.handle.MessageCreateHandler.handleInternally(MessageCreateHandler.java:111)
	at net.dv8tion.jda.core.handle.SocketHandler.handle(SocketHandler.java:37)
	at net.dv8tion.jda.core.requests.WebSocketClient.onDispatch(WebSocketClient.java:868)
	at net.dv8tion.jda.core.requests.WebSocketClient.onEvent(WebSocketClient.java:766)
	at net.dv8tion.jda.core.requests.WebSocketClient.handleEvent(WebSocketClient.java:745)
	at net.dv8tion.jda.core.requests.WebSocketClient.onBinaryMessage(WebSocketClient.java:903)
	at com.neovisionaries.ws.client.ListenerManager.callOnBinaryMessage(ListenerManager.java:368)
	at com.neovisionaries.ws.client.ReadingThread.callOnBinaryMessage(ReadingThread.java:270)
	at com.neovisionaries.ws.client.ReadingThread.handleBinaryFrame(ReadingThread.java:990)
	at com.neovisionaries.ws.client.ReadingThread.handleFrame(ReadingThread.java:749)
	at com.neovisionaries.ws.client.ReadingThread.main(ReadingThread.java:108)
	at com.neovisionaries.ws.client.ReadingThread.runMain(ReadingThread.java:64)
	at com.neovisionaries.ws.client.WebSocketThread.run(WebSocketThread.java:45)
Caused by: java.sql.SQLException: Connection.setNetworkTimeout cannot be called on a closed connection
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getSqlException(ExceptionMapper.java:277)
	at org.mariadb.jdbc.MariaDbConnection.setNetworkTimeout(MariaDbConnection.java:1700)
	at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:552)
	at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:168)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185)
	... 22 more

Establishing the connection

I establish the connection with the following code and return it via #getConnection(). I've tried experimenting around with maxLifetime, but I couldn't solve the issue. I've also checked MySQL timeout settings to make sure maxLifetime connection is shorter than wait_timeout.

MariaDB [(none)]> SHOW VARIABLES LIKE '%timeout';
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 10       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
| net_write_timeout          | 60       |
| slave_net_timeout          | 3600     |
| thread_pool_idle_timeout   | 60       |
| wait_timeout               | 28800    |
+----------------------------+----------+
public class Database {

    private final HikariDataSource hikari;

    public Database() {
        // Setting up the hikari config
        HikariConfig hikariConfig = new HikariConfig();
        hikariConfig.setJdbcUrl("jdbc:mysql://ip:port/database");
        hikariConfig.setUsername("xxxxx");
        hikariConfig.setPassword("xxxxx");
        hikariConfig.addDataSourceProperty("cachePrepStmts", "true");
        hikariConfig.addDataSourceProperty("prepStmtCacheSize", "250");
        hikariConfig.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
        hikariConfig.addDataSourceProperty("useServerPrepStmts", "true");
        hikariConfig.addDataSourceProperty("useLocalSessionState", "true");
        hikariConfig.addDataSourceProperty("rewriteBatchedStatements", "true");
        hikariConfig.addDataSourceProperty("cacheResultSetMetadata", "true");
        hikariConfig.addDataSourceProperty("cacheServerConfiguration", "true");
        hikariConfig.addDataSourceProperty("elideSetAutoCommits", "true");
        hikariConfig.addDataSourceProperty("maintainTimeStats", "false");
        hikariConfig.addDataSourceProperty("maxLifetime", "600000");
        // Source: https://github.com/brettwooldridge/HikariCP/wiki/MySQL-Configuration

        // Initialize a new hikari object
        hikari = new HikariDataSource(hikariConfig);
    }

    public Connection getConnection() {
        try {
            System.out.println(S.INFO.getPrefix() + "Successfully established the database connection.");
            return hikari.getConnection();
        } catch (SQLException e) {
            System.out.println(S.ERROR.getPrefix() + "Failed to establish the database connection.");
            e.printStackTrace();
        }
        return null;
    }
}

duplxey avatar Jul 01 '19 11:07 duplxey

Hi, similar problem here. Configuration:

HikariCP version: 2.7.9 (Springboot 2.0.5)
JDK version     : 1.8.0_202
Database        : MariaDB 10.2.23
Driver version  : org.mariadb.jdbc:mariadb-java-client:2.2.6

Connection is established through Spring as described in https://github.com/brettwooldridge/HikariCP/wiki/Spring-Hibernate-with-Annotations

MariaDB's wait_timeout is set to 31 minutes while HikariCP's maxLifetime is set to 30 minutes.

Initialization and usage works fine at first (pool of 10 connections). After leaving the connections idle for some time (in our testing, somewhere between 10 and 20 minutes), the same behavior will start to happen (Connection.setNetworkTimeout cannot be called on a closed connection).

This happens both on Azure-deployed MariaDB instances and local MariaDB instances. At first we thought this was caused by network agents in between Spring and MariaDB (such as Tomcat or nginx), but it does not seem to be the case (happens in various network configurations, independent of Tomcat/nginx stack).

Lowering maxLifetime below 10 minutes seems to work fine, but is a bit... unaesthetic... considering it was intended to be 30 minutes.

nbusseneau avatar Aug 05 '19 17:08 nbusseneau

Lowering maxLifetime below 10 minutes seems to work fine, but is a bit... unaesthetic... considering it was intended to be 30 minutes.

I can confirm that lowering the maxLifetime to about 5 minutes works fine.

duplxey avatar Aug 09 '19 08:08 duplxey

In my case, MariaDB's wait_timeout was set to 600 s (by the docker container which uses the Debian package default instead of the upstream default 28800 s). Setting the wait_timeout for the MariaDB docker container to something bigger than the maxLifetime fixed this for me.

I'm wondering, would the defaults be a good though? The readme says:

We strongly recommend setting this value, and it should be several seconds shorter than any database or infrastructure imposed connection time limit.

With the default values of MariaDB (28800 s = 480 min) and HikariCP (30 min) that would be a far bigger difference than a few seconds. Would it be better to set wait_timeout to something closer to 30 min?

darioseidl avatar Nov 03 '21 13:11 darioseidl

I'm experiencing this exact problem, with HikariCP 3.4.5 and MariaDB Java Client 2.7.4. I don't know whether it's a MariaDB Java Client bug or a HikariCP one, some input from HikariCP dev could help to shred light on this.

I know there are already clues about this at #1328 and #1594, for instance, but I see that very similar problems seem to exist with other dbms (like PostgreSQL or Oracle, see for instance #1705 for the former and #1855 for the latter), so I'm inclined to think the problem lies in HikariCP.

I would like to stress the fact that it's not just a warning: yes, I get the warning in the logs:

2021-11-10 20:51:15.827  WARN 12993 --- [   scheduling-1] c.z.h.p.PoolBase                         : main - Failed to validate connection org.mariadb.jdbc.MariaDbConnection@42b92fef ((conn=211900992) Connection.setNetworkTimeout cannot be called on a closed connection). Possibly consider using a shorter maxLifetime value.

but I also get an actual exception that prevents my code from getting a connection from the pool, in the end breaking higher level operations. For instance, here my application has failed to properly rollback a transaction because it couldn't get a connection from the pool: the exception was thrown at 2021-11-10 20:51:30.862, that is 15 seconds after the above warning was printed in the log:

Caused by: org.springframework.transaction.TransactionSystemException: Could not roll back JPA transaction; nested exception is org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
	at org.springframework.orm.jpa.JpaTransactionManager.doRollback(JpaTransactionManager.java:593)
	at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:835)
	at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:809)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:672)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:392)
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
	at com.sun.proxy.$Proxy198.handleMessage(Unknown Source)
	at org.springframework.integration.dispatcher.AbstractDispatcher.tryOptimizedDispatch(AbstractDispatcher.java:115)
	... 42 more
Caused by: org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
	at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:48)
	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:111)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:138)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionManagedImpl.java:276)
	at org.hibernate.resource.jdbc.internal.AbstractLogicalConnectionImplementor.rollback(AbstractLogicalConnectionImplementor.java:121)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.rollback(JdbcResourceLocalTransactionCoordinatorImpl.java:304)
	at org.hibernate.engine.transaction.internal.TransactionImpl.rollback(TransactionImpl.java:142)
	at org.springframework.orm.jpa.JpaTransactionManager.doRollback(JpaTransactionManager.java:589)
	... 51 more
Caused by: java.sql.SQLTransientConnectionException: main - Connection is not available, request timed out after 10000ms.
	at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:695)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
	at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
	at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:108)
	... 57 more
Caused by: java.sql.SQLSyntaxErrorException: (conn=211900992) Connection.setNetworkTimeout cannot be called on a closed connection
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:62)
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:176)
	at org.mariadb.jdbc.MariaDbConnection.setNetworkTimeout(MariaDbConnection.java:1679)
	at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:560)
	at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
	at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
	at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:108)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:138)
	at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.connection(StatementPreparerImpl.java:50)
	at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$1.doPrepare(StatementPreparerImpl.java:90)
	at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$StatementPreparationTemplate.prepareStatement(StatementPreparerImpl.java:176)
	at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareStatement(StatementPreparerImpl.java:81)
	at org.hibernate.hql.internal.ast.exec.BasicExecutor.doExecute(BasicExecutor.java:68)
	at org.hibernate.hql.internal.ast.exec.BasicExecutor.execute(BasicExecutor.java:50)
	at org.hibernate.hql.internal.ast.QueryTranslatorImpl.executeUpdate(QueryTranslatorImpl.java:458)
	at org.hibernate.engine.query.spi.HQLQueryPlan.performExecuteUpdate(HQLQueryPlan.java:377)
	at org.hibernate.internal.SessionImpl.executeUpdate(SessionImpl.java:1438)
	at org.hibernate.query.internal.AbstractProducedQuery.doExecuteUpdate(AbstractProducedQuery.java:1701)
	at org.hibernate.query.internal.AbstractProducedQuery.executeUpdate(AbstractProducedQuery.java:1683)
	at org.springframework.integration.jpa.core.DefaultJpaOperations.executeUpdateWithNamedQuery(DefaultJpaOperations.java:98)
	at org.springframework.integration.jpa.core.JpaExecutor.executeOutboundJpaOperation(JpaExecutor.java:440)
	at org.springframework.integration.jpa.outbound.JpaOutboundGateway.handleRequestMessage(JpaOutboundGateway.java:83)
	at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:134)
	at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:56)
	at jdk.internal.reflect.GeneratedMethodAccessor96.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
	at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:388)
	... 47 more

What surprises me is that this occurs on com.zaxxer.hikari.pool.PoolBase.isConnectionAlive() which I guess is supposed to check whether a connection is alive or not. So, this method is doing some kind of test on the connection, suppressing any Exception as an indicator of the fact that the connection is no longer valid and then returning false so that the sorrounding getConnection() knows it has to throw that connection away and create a new one. This is indeed the code:

boolean isConnectionAlive(final Connection connection)
   {
      try {
         try {
            setNetworkTimeout(connection, validationTimeout);

            final int validationSeconds = (int) Math.max(1000L, validationTimeout) / 1000;

            if (isUseJdbc4Validation) {
               return connection.isValid(validationSeconds);
            }

            try (Statement statement = connection.createStatement()) {
               if (isNetworkTimeoutSupported != TRUE) {
                  setQueryTimeout(statement, validationSeconds);
               }

               statement.execute(config.getConnectionTestQuery());
            }
         }
         finally {
            setNetworkTimeout(connection, networkTimeout);

            if (isIsolateInternalQueries && !isAutoCommit) {
               connection.rollback();
            }
         }

         return true;
      }
      catch (Exception e) {
         lastConnectionFailure.set(e);
         logger.warn("{} - Failed to validate connection {} ({}). Possibly consider using a shorter maxLifetime value.",
                     poolName, connection, e.getMessage());
         return false;
      }
   }

So, I would expect the exception on setNetworkTimeout to be caught by the surrounding catch clause, and be suppressed, but it somehow gets thrown instead...

mauromol avatar Nov 12 '21 11:11 mauromol

I had a look at com.zaxxer.hikari.pool.HikariPool.getConnection(long). It seems like it throws the lastConnectionFailure set by isConnectionAlive if:

  1. isConnectionAlive returns false (and this is the case when setNetworkTimeout throws an exception)
  2. however the borrow timeout has passed, so another attempt to get a new connection won't be made
  3. hence, the exception thrown by setNetworkTimeout will be thrown as the "next exception" of a SQLTransientConnectionException by com.zaxxer.hikari.pool.HikariPool.createTimeoutException(long)

So, this starts to be clearer now: the main problem is that we've exhausted the timeout to get a connection, perhaps because the stale connection is making the borrow operation take longer than required. I will then try to investigate on this further (perhaps I should tune the validation timeout vs the borrow timeout).

However, if I could give a suggestion: tracing the lastConnectionFailure on isConnectionAlive is really misleading. Since the "next" exception is set as the "cause" exception, I am led to think that the problem here is caused by setNetworkTimeout within isConnectionAlive. Instead, this is not the cause of the real problem, which is I'm not giving enough time to HikariCP to get a new connection instead, at least in a scenario in which my connection is not alive any more. So I must either increase the borrow timeout or reduce the validation timeout.

The current stack trace somewhat talks about the borrow timeout, but it's incorrectly saying the cause is due to what happens in isConnectionAlive. I suspect this misunderstanding is also the cause of so many bug reports about this scenario.

mauromol avatar Nov 12 '21 11:11 mauromol

what is the real reson of this issue?

wangchaoforever avatar Oct 09 '23 06:10 wangchaoforever

As I have mentioned here https://github.com/brettwooldridge/HikariCP/issues/1705#issuecomment-1985365270 it looks like there is a timeout for checking if connection is dead, and somehow stale connection prevents pool from creating a new one. In my case stale connections were created by hard resetting VM with database.

KazzmanK avatar Mar 08 '24 09:03 KazzmanK