spring-batch icon indicating copy to clipboard operation
spring-batch copied to clipboard

OptimisticLockingFailureException updating step execution after commit failure

Open spring-projects-issues opened this issue 14 years ago • 51 comments
trafficstars

Quinton McCombs opened BATCH-1767 and commented

It appears that if the commit fails, spring batch will get an OptimisticLockingFailureException when it tries to revert the changes to the step execution. In my particular case, I have a callback through EclipseLink to update history tables before a transaction is committed. If a failure occurs during this callback, the commit fails.

From looking through the code and the attached log file, the step execution is updated and committed before the main transaction is committed. When the commit fails, the old values for the step execution (including version) are updated to the values before the chuck started. When control returns to AbstractStep.execute(), the OptimisticLockingFailureException is thrown when the step execution is updated with the failed status because the new version had already been committed to the database.

2011-07-06 17:40:29,494 ERROR SimpleAsyncTaskExecutor-1 [org.springframework.batch.core.step.AbstractStep] Encountered an error saving batch meta data. This job is now in an unknown state and should not be restarted.
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=3225 with wrong version (35), where current version is 36
   at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
   at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
   at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
   at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at $Proxy77.update(Unknown Source)
   at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:244)
   at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
   at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
   at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
   at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
   at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
   at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:91)
   at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:89)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at java.lang.Thread.run(Thread.java:662)

Affects: 2.1.7

Attachments:

Referenced from: pull request https://github.com/spring-projects/spring-batch/pull/591

23 votes, 31 watchers

spring-projects-issues avatar Jul 07 '11 11:07 spring-projects-issues

Dave Syer commented

You need to flush JPA changes before Spring Batch tries to commit its own meta data. The best way to do that is in the ItemWriter. Any reason you can't do that?

spring-projects-issues avatar Jul 11 '11 00:07 spring-projects-issues

Quinton McCombs commented

We are already using the JpaWriter supplied with Spring Batch which performs the flush. We have a requirement to audit all database changes and identify the changes made during a single transaction. For this reason, we implemented our auditing using an Eclipselink pre-commit hook (Hibernate can do this as well). While the transaction level auditing does not make much sense for a batch job, it is what is used by the rest of the application.

spring-projects-issues avatar Jul 11 '11 09:07 spring-projects-issues

Dave Syer commented

If you are using transaction hooks you are going to have to tread very carefully. You need to somehow ensure that the vendor hook runs before the Spring transaction synchronization, or else not use it. The easiest way would be to run your audit code in an ItemWriter or ItemWriteListener, but if you find a cleverer way and it can be exposed as a framework feature, please let us know. If you can provide a sample project (preferably self-contained, with an in-memory database, and built using maven, but we'll take whatever you have), that would help. Or if you want to make a suggestion about a patch you can fork on github and send a pull request (the README has more information).

spring-projects-issues avatar Jul 12 '11 01:07 spring-projects-issues

Ramkumar Kirhsnan commented

We've also got the same failure after the one of the steps failed to commit due to database exception thrown while writing into the database. Following is the stack trace when i tried to run the job for only one item

Jul 13 10:45:30,467 INFO  populateAccountSnapshotReportJob [SimpleAsyncTaskExecutor-2] [org.springframework.batch.core.job.SimpleStepHandler] 133 - Executing step: [processAccountsStep]
Jul 13 10:45:30,514 DEBUG populateAccountSnapshotReportJob [SimpleAsyncTaskExecutor-2] [com.om.dh.batch.core.partition.KeySetPartitioner] 79 - partition: partition-1, keys=[1000000069567]
Jul 13 10:45:30,576 DEBUG populateAccountSnapshotReportJob [workerThreadPool-1-1] [com.om.dh.batch.core.partition.PartitionKeySetPagingItemReader] 104 - Reading page 0
Jul 13 10:45:30,576 DEBUG populateAccountSnapshotReportJob [workerThreadPool-1-1] [com.om.dh.batch.item.adapter.AbstractDelegatingAdapter] 60 - processing started for 1000000069567
Jul 13 10:45:31,217 DEBUG populateAccountSnapshotReportJob [workerThreadPool-1-1] [com.om.dh.batch.item.adapter.AbstractDelegatingAdapter] 103 - processing complete for 1000000069567 elapsed=641ms
Jul 13 10:45:31,279 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.hibernate.util.JDBCExceptionReporter] 465 - Data truncation: Data too long for column 'calculation_details' at row 1
Jul 13 10:45:31,279 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.hibernate.event.def.AbstractFlushingEventListener] 543 - Could not synchronize database state with session
org.hibernate.exception.DataException: Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:100)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:266)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:167)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:50)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1027)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:365)
at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:137)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:655)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:732)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:701)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:263)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:248)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.sql.BatchUpdateException: Data truncation: Data too long for column 'calculation_details' at row 1
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1666)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1082)
at com.mysql.jdbc.jdbc2.optional.StatementWrapper.executeBatch(StatementWrapper.java:719)
at org.jboss.resource.adapter.jdbc.CachedPreparedStatement.executeBatch(CachedPreparedStatement.java:476)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:774)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
... 25 more
Jul 13 10:45:31,310 INFO  populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.tasklet.TaskletStep] 327 - Commit failed while step execution data was already updated. Reverting to old version.
Jul 13 10:45:31,357 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.tasklet.TaskletStep] 423 - JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=671991 with wrong version (1), where current version is 2
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy77.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:417)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:128)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:263)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:248)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Jul 13 10:45:31,357 INFO  populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.tasklet.TaskletStep] 327 - Commit failed while step execution data was already updated. Reverting to old version.
Jul 13 10:45:31,357 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.AbstractStep] 212 - Encountered an error executing the step
org.springframework.batch.core.step.FatalStepExecutionException: JobRepository failure forcing exit with unknown status
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:426)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:128)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:263)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:367)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:214)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:143)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:248)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=671991 with wrong version (1), where current version is 2
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy77.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:417)
... 15 more
Jul 13 10:45:31,404 ERROR populateAccountSnapshotReportJob [workerThreadPool-1-1] [org.springframework.batch.core.step.AbstractStep] 250 - Encountered an error saving batch meta data. This job is now in an unknown state and should not be restarted.
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=671991 with wrong version (1), where current version is 2
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor143.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy77.update(Unknown Source)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:244)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:107)
at org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler$1.call(TaskExecutorPartitionHandler.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Jul 13 10:45:31,404 ERROR populateAccountSnapshotReportJob [SimpleAsyncTaskExecutor-2] [org.springframework.batch.core.step.AbstractStep] 212 - Encountered an error executing the step
org.springframework.batch.core.JobExecutionException: Partition handler returned an unsuccessful step
at org.springframework.batch.core.partition.support.PartitionStep.doExecute(PartitionStep.java:110)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:281)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl.createAndExecuteJobRunnable(JobExecutionServiceImpl.java:361)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl.doExecuteJob(JobExecutionServiceImpl.java:325)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl.access$000(JobExecutionServiceImpl.java:51)
at com.om.dh.batch.service.impl.JobExecutionServiceImpl$1.run(JobExecutionServiceImpl.java:267)
at java.lang.Thread.run(Thread.java:619)

spring-projects-issues avatar Jul 14 '11 10:07 spring-projects-issues

Dave Syer commented

I think that problem is unavoidable if you don't flush the Hibernate session in your ItemWriter. You have made life complicated by using a partition step, but the same failure mode is easy to see in a single threaded normal step. Can you just do the flush?

spring-projects-issues avatar Jul 22 '11 02:07 spring-projects-issues

Kristof Buts commented

maven project for reproducing the bug

spring-projects-issues avatar May 31 '13 17:05 spring-projects-issues

Kristof Buts commented

In our current project, we are also experiencing this problem in some of our batches, but until yesterday we couldn't reproduce it. After two days trying to reproduce the problem, I finally managed to do it. I already uploaded my little (maven) project for reproducing the bug.

So, what did I figure out? Consider the following scenario for a batch running 3 threads: Thread # | line | What is going on? | oldVersion.version | stepExecution.version | DB version | comments th#1 | 380 | copy(stepExecution, oldVersion); | 1 | 1 | 1 | th#2 | 380 | copy(stepExecution, oldVersion); | 1 | 1 | 1 | th#3 | 380 | copy(stepExecution, oldVersion); | 1 | 1 | 1 | th#1 | 429 | getJobRepository().updateExecutionContext(stepExecution); | 1 | 1 | will become 2 | step execution to be committed th#1 | 430 | stepExecution.incrementCommitCount(); | 1 | 2 | will become 2 | th#1 | / | transaction is commited | 1 | 2 | 2 | th#2 | 429 | getJobRepository().updateExecutionContext(stepExecution); | 1 | 2 | will become 3 | step execution to be committed th#2 | 430 | stepExecution.incrementCommitCount(); | 1 | 3 | will become 3 | th#2 | / | transaction is rolled back | 1 | 2 | 2 | for some reason tx is rolled back th#2 | 344 | copy(oldVersion, stepExecution); | 1 | 1 | 2 | step version is reverted to 1 -> WTF? th#3 | 429 | getJobRepository().updateExecutionContext(stepExecution); | 1 | 1 | 2 | OptimisticLockException will be thrown note: -the line numbers are the ones of class org.springframework.batch.core.step.tasklet.TaskletStep and its inner class ChunkTransactionCallback -the oldVersion.version is the one which is assigned in the current thread. Since I assume all 3 threads to first go through this piece of code, they will be 1 for all 3 threads in this case. At least thread number 2 should have value 1 for his old version to produce the bug. In that case the wrong version (1 instead of 2) will be reverted later on. -the rollback of transaction 2 can be caused by: -a timeout -some trigger that still needs to be executed at commit-time -deferred db constraint checks -execution of some sql statements that were not flushed yet -own registered transaction synchronization that fails -...
-For the bug to occur, it is important that the rollback is triggered after line 422 ((stepExecutionUpdated = true), since that is the line which causes the TransactionSynchronization method to revert the stepExecution object (see if-statement at line 339 - if (stepExecutionUpdated) )

So, the problem is that copying the oldVersion variable is done outside the synchronized block. By putting it inside the block (say at line 407 instead of line 380, just after setting lock to true), I think the bug would be solved (I already tested this by debugging my test with a conditional breakpoint in eclipse which called this code at the right place and it worked).

Is it possible to get this fix in a patched version for spring-batch-core-2.1.8.RELEASE? We cannot upgrade to a higher spring version since we are only one month away from a big release.

Kind regards, Kristof

spring-projects-issues avatar May 31 '13 17:05 spring-projects-issues

Kristof Buts commented

Added the scenario in a more readable form (use monospace font to view it)

spring-projects-issues avatar May 31 '13 17:05 spring-projects-issues

Philippe Mouawad commented

Hello, I am facing the same issue with Spring Batch 2.1.8 and 2.1.9.

Is there some plan to take into account what Kristof Buts.

Stacktrace:

threadPoolTaskExecutorForBatch-12] 12:07:10,363 ERROR org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:438) - JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=8 with wrong version (217), where current version is 223
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor802.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at com.sun.proxy.$Proxy48.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:432)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
[threadPoolTaskExecutorForBatch-12] 12:07:10,366 ERROR com.xxx.dpas.log.Logger.error(Logger.java:89) - ----- doROLLBACK Transaction -----
[threadPoolTaskExecutorForBatch-12] 12:07:10,367 ERROR com.xxx.dpas.log.Logger.error(Logger.java:89) - ----- doCLEANUP Transaction -----
[threadPoolTaskExecutorForBatch-13] 12:07:10,372 ERROR org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:438) - JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=8 with wrong version (221), where current version is 223
at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:185)
at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:171)
at sun.reflect.GeneratedMethodAccessor802.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:110)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at com.sun.proxy.$Proxy48.update(Unknown Source)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:432)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:264)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:76)
at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:258)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)

spring-projects-issues avatar Aug 13 '13 05:08 spring-projects-issues

Philippe Mouawad commented

Please find attached the spring batch classes put in INFO and DEBUG fororg.springframework.batch.core.step.tasklet.

As you can see before failure we have this: [threadPoolTaskExecutorForBatch-2] 15:30:46,282 INFO org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.afterCompletion(TaskletStep.java:342) - Commit failed while step execution data was already updated. Reverting to old version.

spring-projects-issues avatar Aug 13 '13 06:08 spring-projects-issues

Philippe Mouawad commented

I have tested with last 2.2.1 and I have the same problem.

spring-projects-issues avatar Aug 13 '13 08:08 spring-projects-issues

Philippe Mouawad commented

Hello, Please find pull request here: https://github.com/SpringSource/spring-batch/pull/214

Hope you can take it into account soon. Regards Philippe

spring-projects-issues avatar Aug 13 '13 13:08 spring-projects-issues

Philippe Mouawad commented

Hello, After further analysis, I don't understand why copy method when restoring also copies the version number of oldVersion to stepExecution ?

Could you shed some light on this ?

Thank you Regards

spring-projects-issues avatar Aug 14 '13 02:08 spring-projects-issues

Michael Minella commented

I added a comment about the pull request. In short, moving the copy to where it is won't work because the StepExecution could have already been modified in the step. There are a couple tests that are failing with that update that illustrate this.

With regards to why the copy method restores the version number...why wouldn't it? The intent is that there was a rollback so the state should match what was there before the rollback.

spring-projects-issues avatar Aug 14 '13 07:08 spring-projects-issues

Philippe Mouawad commented

Thanks for feedback. Further testing this morning on real world project showed me fix did not correct all cases, my real world case is still broken although attached test case was fixed by it.

Does it mean you confirm this bug is a real one ? Test project attached reproduces it and my real world also does. Is there some plan to fix it ? within which delay ?

It seems anyway that the stepexecution is modified somewhere before restoring which generates the case.

spring-projects-issues avatar Aug 14 '13 08:08 spring-projects-issues

Kristof Buts commented

@Michael Minella, I'm afraid there is another (hidden) issue with this code: When running a batch multithreaded, one thread can commit stuff from another thread which can be rollbacked at the end. Shouldn't changes to the stepExecution object be recorded and executed against the actual stepExecution object at commit time instead of instantly applying to the real StepExecution object and reverting on rollback afterwards (just wondering) ?

spring-projects-issues avatar Aug 19 '13 06:08 spring-projects-issues

Gérald Quintana commented

We has this OptimisticLockingFailureException, when we noticed this log just before: "Commit failed while step execution data was already updated. Reverting to old version." . Turning org.springframework.batch.core.step.tasklet and org.springframework.transaction loggers level to debug helped us find a bug in our tasklet code.

In our case:

  • The tasklet produces wrong data but the chunk is property processed and no exception is raised here
  • Chunk transaction commit triggers a deferred database constraint which prevents commit (it may also occur with JPA where transaction commit triggers flushing)
  • This exception is not logged (maybe because of the RetryTemplate even if retry is not configured)
  • The TransactionManager decides to call doRollbackOnCommitException
  • The ChunkTransactionCallback transaction synchronization is called with status=ROLLBACKED

spring-projects-issues avatar Mar 18 '14 10:03 spring-projects-issues

zyro commented

i think i am currently facing this (or at least sth. similar) in the following scenario:

  • business logic throws an OptimisticLockingFailureException during commit marking the current transaction rollback-only
  • jdbcTemplate update count is 0 during step execution update (does this happen if current tx is rollback-only?)
    • https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/java/org/springframework/batch/core/repository/dao/JdbcStepExecutionDao.java#L252
  • a little bit misleading exception is thrown: Attempt to update step execution id=123 with wrong version (2), where current version is 1
    • https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/java/org/springframework/batch/core/repository/dao/JdbcStepExecutionDao.java#L255
  • stepExecution is stuck with exit status EXECUTING and jobExecution is stuck with status UNKNOWN and exit status UNKNOWN

spring-projects-issues avatar Sep 24 '14 06:09 spring-projects-issues

member sound commented

Is this issue still active? I'm also getting the OptimisticLockFailureException. The cause for me was an insert on a column whose length was not big enough for the inserted data, thus throwing SQL Exception and thereby the rollback:

WARN org.hibernate.engine.jdbc.spi.SqlExceptionHelper: SQL Error: 0, SQLState: 22001
ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper: FEHLER: Wert zu lang für Typ character varying(2)
INFO org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl: HHH000010: On release of batch it still contained JDBC statements
ERROR org.springframework.batch.core.step.tasklet.TaskletStep: JobRepository failure forcing rollback
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=29 with wrong version (1), where current version is 2
	at org.springframework.batch.core.repository.dao.MapStepExecutionDao.updateStepExecution(MapStepExecutionDao.java:102) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
Caused by: org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=29 with wrong version (1), where current version is 2
	at org.springframework.batch.core.repository.dao.MapStepExecutionDao.updateStepExecution(MapStepExecutionDao.java:102) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
ERROR org.springframework.batch.core.step.AbstractStep: Encountered an error saving batch meta data for step step in job myJobConfig. This job is now in an unknown state and should not be restarted.
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=29 with wrong version (1), where current version is 2
	at org.springframework.batch.core.repository.dao.MapStepExecutionDao.updateStepExecution(MapStepExecutionDao.java:102) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]
	at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.2.RELEASE.jar:3.0.2.RELEASE]

spring-projects-issues avatar Dec 12 '14 00:12 spring-projects-issues

Mohamed commented

Hi, I have the same problem, and I don't know how to resolve it. I debuged my code but no way to find the cause.

12:21:42,400 DEBUG @ [FaultTolerantChunkProcessor] Attempting to write: [items=[StagingItem [475472]], skips=[]]
12:21:42,400 DEBUG @ [RetryTemplate] Retry: count=0
12:21:42,402 DEBUG @ [ChunkOrientedTasklet] Inputs not busy, ended: false
12:21:42,402 DEBUG @ [TaskletStep] Applying contribution: [StepContribution: read=1, written=1, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
12:21:42,402 DEBUG @ [JtaTransactionManager] Participating in existing transaction
12:21:42,404 DEBUG @ [TaskletStep] Saving step execution before commit: StepExecution: id=5026, version=14, name=step1, status=STARTED, exitStatus=EXECUTING, readCount=14, filterCount=0, writeCount=14 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=14, rollbackCount=0, exitDescription=
12:21:42,404 DEBUG @ [JtaTransactionManager] Participating in existing transaction
12:21:42,409 DEBUG @ [JtaTransactionManager] Participating transaction failed - marking existing transaction as rollback-only
12:21:42,409 DEBUG @ [JtaTransactionManager] Setting JTA transaction rollback-only
12:21:42,412 ERROR @ [TaskletStep] JobRepository failure forcing exit with unknown status
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=5026 with wrong version (14), where current version is 13

Is there any way to discover the origin of this error? Thanks in advance

spring-projects-issues avatar Jun 22 '15 06:06 spring-projects-issues

Adam commented

Same thing as for @Mohamed. No way to find out what is the root cause of the rollback attempt:

12:28:49.420 [SimpleAsyncTaskExecutor-10] INFO  p.d.i.i.c.IncrementalImportJobFactory - Starting the incremental documents import.
12:51:01.738 [SimpleAsyncTaskExecutor-5] INFO  p.d.i.i.c.IncrementalImportJobFactory - Finished the incremental documents import.
12:51:01.756 [SimpleAsyncTaskExecutor-5] ERROR o.s.b.core.step.tasklet.TaskletStep - JobRepository failure forcing rollback
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=23580 with wrong version (1), where current version is 2
        at org.springframework.batch.core.repository.dao.JdbcStepExecutionDao.updateStepExecution(JdbcStepExecutionDao.java:255) ~[spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.repository.support.SimpleJobRepository.update(SimpleJobRepository.java:191) ~[spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at sun.reflect.GeneratedMethodAccessor136.invoke(Unknown Source) ~[na:na]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_55]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_55]
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) ~[spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281) ~[spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) ~[spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207) ~[spring-aop-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at com.sun.proxy.$Proxy152.update(Unknown Source) ~[na:na]
        at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:452) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:330) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) [spring-tx-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:271) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:77) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368) [spring-batch-infrastructure-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215) [spring-batch-infrastructure-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144) [spring-batch-infrastructure-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:257) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:198) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:165) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:93) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:90) [spring-batch-core-3.0.3.RELEASE.jar:3.0.3.RELEASE]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
        at org.springframework.core.task.SimpleAsyncTaskExecutor$ConcurrencyThrottlingRunnable.run(SimpleAsyncTaskExecutor.java:251) [spring-core-4.1.4.RELEASE.jar:4.1.4.RELEASE]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]

spring-projects-issues avatar Jul 03 '15 04:07 spring-projects-issues

Gérald Quintana commented

@Spyro @ medhedi_abd The problem occurs, not when the chunk read-process-write fails, but when the commit fails. In the past I've seen 2 reasons for that:

  • Using Hibernate or JPA: when the read-process-write ends, data has not been flushed yet, and the commit triggers the flush and may raise a an exception (SQL constraint not fulfilled...). The solution is to force a flush (call .flush() on entity manager) and the end of the writer.
  • Using Oracle deferred constraints: when the-process-writes ends, SQL constraints have not been checked by Oracle yet, and the commit triggers these SQL constraints verification and may raise an exception (same reason as above). The solution is to replace deferred constraint by immediate constraint. In the end, the commit shouldn't fail, if something goes wrong, it should fail before

spring-projects-issues avatar Jul 03 '15 05:07 spring-projects-issues

Mohamed commented

Hi, @Gerald, I agree with you, Indead, this error occurs also when using Jboss4 EAP. and the treatment of batch works well without any error when using "WebSphere 7". Is there any modification into jboss to fix this problem?, or transformation in spring-batch adapting of jboss transaction?

Thanks in advance

spring-projects-issues avatar Jul 07 '15 01:07 spring-projects-issues

Gérald Quintana commented

@medhedi_Abd In your case it may because of a third reason: "using JTA". If the XA transactions fails to commit a resource for any resource (timeout or whatever).

Having logs of commit failure would help, but it's not the case.

spring-projects-issues avatar Jul 07 '15 07:07 spring-projects-issues

David J. M. Karlsen commented

Any movement on this issue?

spring-projects-issues avatar Oct 13 '15 07:10 spring-projects-issues

Sarvesh Katariya commented

Any updates/workaround on this issue ?

spring-projects-issues avatar Nov 05 '15 21:11 spring-projects-issues

Marcus Mattern commented

We did a simple workaround for us to avoid this problem. I only patched the afterCompletion method of inner class org.springframework.batch.core.step.tasklet.TaskletStep.ChunkTransactionCallback.

{{ @Override public void afterCompletion(int status) { try { if (status != TransactionSynchronization.STATUS_COMMITTED) { if (stepExecutionUpdated) { // Wah! the commit failed. We need to rescue the step // execution data. logger.info("Commit failed while step execution data was already updated. "

  • "Reverting to old version."); //FIX Start // WORKAROUND für https://jira.springsource.org/browse/BATCH-1767 // OptimisticLockingFailureException updating step execution after commit failure // Description: // - save version number of step execution that should be updated // - copy old step details like before // - set new version number instead of old copied number final Integer savedVersion = stepExecution.getVersion(); //FIX End copy(oldVersion, stepExecution); //FIX Start stepExecution.setVersion(savedVersion); //FIX End if (status == TransactionSynchronization.STATUS_ROLLED_BACK) { rollback(stepExecution); } } chunkListener.afterChunkError(chunkContext); }

      		if (status == TransactionSynchronization.STATUS_UNKNOWN) {
      			logger.error("Rolling back with transaction in unknown state");
      			rollback(stepExecution);
      			stepExecution.upgradeStatus(BatchStatus.UNKNOWN);
      			stepExecution.setTerminateOnly();
      		}
      	}
      	finally {
      		// Only release the lock if we acquired it, and release as late
      		// as possible
      		if (locked) {
      			semaphore.release();
      		}
    
      		locked = false;
      	}
      }
    

}}

Since then our problem was gone.

Marcus

spring-projects-issues avatar Nov 06 '15 16:11 spring-projects-issues

member sound commented

Any updates on this fix? I got the same problem...

spring-projects-issues avatar Jan 20 '16 08:01 spring-projects-issues

James Home commented

@Michael

Regarding the comment: "With regards to why the copy method restores the version number...why wouldn't it? The intent is that there was a rollback so the state should match what was there before the rollback."

The thing is, it's not actually a rollback in the database sense. You're really just resetting the counts the their previous values. It's a new update in a db sense, so surely it should use the appropriate version number. The policy of 'rolling back' the version number in a db update seems to be incompatible with the principle of version based optimistic locking that's being used by the dao.

spring-projects-issues avatar Feb 05 '16 09:02 spring-projects-issues

zyro commented

following the suggestion to flush the session in the writer, i think there is still a (slight..) chance for an exception being thrown (e.g. StaleObjectStateException) during the actual commit (which may flush again). when using a skip policy, shouldnt that exception result in a skip and not break the whole job execution leaving it in an UNKNOWN state?

spring-projects-issues avatar Jun 07 '16 12:06 spring-projects-issues