amoro icon indicating copy to clipboard operation
amoro copied to clipboard

[Bug]: Optimization state blocking in committing

Open xxubai opened this issue 1 year ago • 0 comments

What happened?

When an optimizing process completes and commits, a SQL exception (e.g. sql connection exhaustion, DB corruption) may cause the status to remain blocked in the db while committing and cannot be manually cancelled.

Affects Versions

master/0.7.x

What table formats are you seeing the problem on?

Iceberg

What engines are you seeing the problem on?

AMS

How to reproduce

No response

Relevant log output

2024-08-14 11:52:26,585 ERROR [async-optimizing-commit-executor-5] [org.apache.amoro.server.optimizing.OptimizingQueue] [] - iceberg.db.xxx(tableId=94156) Commit optimizing failed 
org.apache.amoro.server.exception.PersistenceException: org.apache.ibatis.exceptions.PersistenceException: 
### Error updating database.  Cause: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object, borrowMaxWaitDuration=PT1S
### The error may exist in org/apache/amoro/server/persistence/mapper/OptimizingMapper.java (best guess)
### The error may involve org.apache.amoro.server.persistence.mapper.OptimizingMapper.updateOptimizingProcess
### The error occurred while executing an update
### Cause: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object, borrowMaxWaitDuration=PT1S
        at org.apache.amoro.server.exception.AmoroRuntimeException.wrap(AmoroRuntimeException.java:141) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.persistence.PersistentBase.doAs(PersistentBase.java:53) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.optimizing.OptimizingQueue.access$600(OptimizingQueue.java:73) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.optimizing.OptimizingQueue$TableOptimizingProcess.lambda$persistProcessCompleted$11(OptimizingQueue.java:645) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_412]
        at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) ~[?:1.8.0_412]
        at org.apache.amoro.server.persistence.PersistentBase.doAsTransaction(PersistentBase.java:61) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.optimizing.OptimizingQueue.access$400(OptimizingQueue.java:73) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.optimizing.OptimizingQueue$TableOptimizingProcess.persistProcessCompleted(OptimizingQueue.java:643) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.optimizing.OptimizingQueue$TableOptimizingProcess.commit(OptimizingQueue.java:566) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.table.executor.OptimizingCommitExecutor.execute(OptimizingCommitExecutor.java:53) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.table.executor.BaseTableExecutor.executeTask(BaseTableExecutor.java:84) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.table.executor.BaseTableExecutor.lambda$scheduleIfNecessary$4(BaseTableExecutor.java:95) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_412]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_412]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_412]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_412]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_412]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_412]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_412]
Caused by: org.apache.ibatis.exceptions.PersistenceException: 
### Error updating database.  Cause: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object, borrowMaxWaitDuration=PT1S
### The error may exist in org/apache/amoro/server/persistence/mapper/OptimizingMapper.java (best guess)
### The error may involve org.apache.amoro.server.persistence.mapper.OptimizingMapper.updateOptimizingProcess
### The error occurred while executing an update
### Cause: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object, borrowMaxWaitDuration=PT1S
        at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:199) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:67) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperProxy$PlainMethodInvoker.invoke(MapperProxy.java:141) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:86) ~[mybatis-3.5.15.jar:3.5.15]
        at com.sun.proxy.$Proxy48.updateOptimizingProcess(Unknown Source) ~[?:?]
        at org.apache.amoro.server.optimizing.OptimizingQueue$TableOptimizingProcess.lambda$null$10(OptimizingQueue.java:648) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.persistence.PersistentBase.doAs(PersistentBase.java:49) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        ... 18 more
Caused by: java.sql.SQLException: Cannot get a connection, pool error Timeout waiting for idle object, borrowMaxWaitDuration=PT1S
        at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:147) ~[commons-dbcp2-2.9.0.jar:2.9.0]
        at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:731) ~[commons-dbcp2-2.9.0.jar:2.9.0]
        at org.apache.ibatis.transaction.jdbc.JdbcTransaction.openConnection(JdbcTransaction.java:145) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.transaction.jdbc.JdbcTransaction.getConnection(JdbcTransaction.java:67) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:348) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:89) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.SimpleExecutor.doUpdate(SimpleExecutor.java:49) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.CachingExecutor.update(CachingExecutor.java:76) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:197) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:67) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperProxy$PlainMethodInvoker.invoke(MapperProxy.java:141) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:86) ~[mybatis-3.5.15.jar:3.5.15]
        at com.sun.proxy.$Proxy48.updateOptimizingProcess(Unknown Source) ~[?:?]
        at org.apache.amoro.server.optimizing.OptimizingQueue$TableOptimizingProcess.lambda$null$10(OptimizingQueue.java:648) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.persistence.PersistentBase.doAs(PersistentBase.java:49) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        ... 18 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object, borrowMaxWaitDuration=PT1S
        at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:316) ~[commons-pool2-2.12.0.jar:2.12.0]
        at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:233) ~[commons-pool2-2.12.0.jar:2.12.0]
        at org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:141) ~[commons-dbcp2-2.9.0.jar:2.9.0]
        at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:731) ~[commons-dbcp2-2.9.0.jar:2.9.0]
        at org.apache.ibatis.transaction.jdbc.JdbcTransaction.openConnection(JdbcTransaction.java:145) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.transaction.jdbc.JdbcTransaction.getConnection(JdbcTransaction.java:67) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:348) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:89) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.SimpleExecutor.doUpdate(SimpleExecutor.java:49) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.executor.CachingExecutor.update(CachingExecutor.java:76) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:197) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:67) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperProxy$PlainMethodInvoker.invoke(MapperProxy.java:141) ~[mybatis-3.5.15.jar:3.5.15]
        at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:86) ~[mybatis-3.5.15.jar:3.5.15]
        at com.sun.proxy.$Proxy48.updateOptimizingProcess(Unknown Source) ~[?:?]
        at org.apache.amoro.server.optimizing.OptimizingQueue$TableOptimizingProcess.lambda$null$10(OptimizingQueue.java:648) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        at org.apache.amoro.server.persistence.PersistentBase.doAs(PersistentBase.java:49) ~[amoro-ams-server-0.7.0-incubating.jar:0.7.0-incubating]
        ... 18 more

Anything else

No response

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

xxubai avatar Aug 15 '24 06:08 xxubai