helidon icon indicating copy to clipboard operation
helidon copied to clipboard

Helidon DBClient does not trigger an Exception when no sane DB connection can be obtained , post application initialization

Open msteffi opened this issue 2 years ago • 6 comments

In case the database server was up during the application initialization and for some reason the server went down ,and then when the application tries to insert a row , we expect a DBClientException to be thrown after the connection timeout , which does not seem to happen .

Environment Details

  • Helidon Version: 2.5.1
  • Helidon SE
  • JDK version: java 17
  • OS: Oracle Linux 7
  • Docker version (if applicable): NA

Problem Description

We are using DBClient in our Helidon SE project with the below configurations url: jdbc:mysql://127.0.0.1:3306/dbName?failOverReadOnly=false&characterEncoding=utf-8&useSSL=false&allowPublicKeyRetrieval=true username: root password: passwd poolName: mysql initializationFailTimeout: -1 connectionTimeout: 2000

So if the MySql server is up and running , we see no issues as such . But in case the database server was up during the application initialization and for some reason the server went down ,and then when the application tries to insert a row , we expect a DBClientException to be thrown after the connection timeout of 2000 ms , which does not seem to happen .

Snippet of the thread dump of the helidon application

"helidon-jdbc-dbclient-thread-pool-5" #23 daemon prio=5 os_prio=0 cpu=0.74ms elapsed=1453.67s tid=0x00007f90ecd635b0 nid=0x32c2 waiting on condition [0x00007f90b58f7000] java.lang.Thread.State: TIMED_WAITING (parking) at jdk.internal.misc.Unsafe.park([email protected]/Native Method) - parking to wait for <0x00000007198b57c0> (a java.util.concurrent.SynchronousQueue$TransferQueue) at java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:252) at java.util.concurrent.SynchronousQueue$TransferQueue.transfer([email protected]/SynchronousQueue.java:704) at java.util.concurrent.SynchronousQueue.poll([email protected]/SynchronousQueue.java:903) at com.zaxxer.hikari.util.ConcurrentBag.borrow(ConcurrentBag.java:151) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:164) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:146) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100) at io.helidon.dbclient.jdbc.HikariConnectionPool.connection(HikariConnectionPool.java:53) at io.helidon.dbclient.jdbc.JdbcDbClient$JdbcExecute$$Lambda$647/0x0000000800e9f790.get(Unknown Source) 7:48 "mysql connection adder" #59 daemon prio=5 os_prio=0 cpu=639.77ms elapsed=1127.21s tid=0x0000556c94ef3f50 nid=0x3f02 waiting on condition [0x00007f908b4fb000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep([email protected]/Native Method) at com.zaxxer.hikari.util.UtilityElf.quietlySleep(UtilityElf.java:53) at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:745) at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:712) at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264) at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136) at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635) at java.lang.Thread.run([email protected]/Thread.java:833)

Steps to reproduce

[//]: 1 Database server must be up and running when the application starts . Bring down or make the db server non-reachable . [//]: 2 Make your helidon application to do some operation on the table (insert) and then you will observe that the operation "execute" just hangs . dbClient .execute( exec -> exec.createNamedInsert("insert") .indexedParam(entity ) .execute()) .thenApply(e -> entity ) .exceptionally(throwable -> { logger.atWarning().log( "Failed to insert : %s", throwable); return null; });

msteffi avatar Aug 11 '22 04:08 msteffi

Update from customer: createNamedGet is throwing a DBClient Exception whereas other apis like createNamedInsert / update and delete are not .

tomas-langer avatar Aug 19 '22 16:08 tomas-langer

Hi Helidon Experts, Thanks for clarification on impact. We would like to have this fix/patch in Helidon 2.5.1/2.52. Is there an ETA on this story? Currently we are blocking our story until a fix/workaround is provided, does this sound reasonable? Thanks Chandra

chandrasrt avatar Aug 22 '22 14:08 chandrasrt

namedDml api as well has the same issue . It does not seem to throw a DBClientException if a query (in our case it was create table) is executed when the DB server is unavailable .

msteffi avatar Aug 23 '22 03:08 msteffi

Tested similar scenario with my old Pokemon example. Running curl -i -X POST -d '{"id":8,"name":"Raticate","idType":2}' http://localhost:8079/db/pokemon caused some warnings to be logged:

2022.08.23 13:45:21 WARNING com.zaxxer.hikari.pool.PoolBase Thread[helidon-jdbc-dbclient-thread-pool-4,5,jdbc-dbclient-thread-pool]: mysql - Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl@4b83f76 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022.08.23 13:45:21 WARNING com.zaxxer.hikari.pool.PoolBase Thread[helidon-jdbc-dbclient-thread-pool-4,5,jdbc-dbclient-thread-pool]: mysql - Failed to validate connection com.mysql.cj.jdbc.ConnectionImpl@372ace49 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.

But example insert code

        dbClient.execute(exec -> exec
                .createNamedInsert("insert-pokemon")
                .indexedParam(pokemon)
                .execute())
                .thenAccept(count -> response.send("Inserted: " + count + " values\n"))
                .exceptionally(throwable -> sendError(throwable, response));

hangs as described in the bug.

Tomas-Kraus avatar Aug 23 '22 11:08 Tomas-Kraus

Looks like one executorService().submit(...) call is eating exceptions in JdbcStatementDml. :)

Tomas-Kraus avatar Aug 23 '22 14:08 Tomas-Kraus

@msteffi PR https://github.com/oracle/helidon/pull/4772 should fix this for 2.x. You can verify it using https://github.com/Tomas-Kraus/helidon/tree/issue-4719-2.x branch build if you would like to.

Tomas-Kraus avatar Aug 23 '22 15:08 Tomas-Kraus