
Composed Task Runner intermittently fails to start pods

Open RaickyDerwent opened this issue 4 years ago • 1 comments

Description: Composed Task Runner intermittently fails to start pods when multiple parallel flows are defined.

Versions:
SCDF: springcloud/spring-cloud-dataflow-server:2.5.3.RELEASE
Composed Task Runner: springcloudtask/composedtaskrunner-task:2.1.4.RELEASE

Issue:

I'm seeing this exception while running tasks in parallel. I have 4 tasks configured to run in parallel; one of them fails to start, and Composed Task Runner throws this exception:

2021-09-22 10:20:05.663  INFO 1 --- [ taskExecutor-2] o.s.batch.core.job.SimpleStepHandler     : Executing step: [fi-prepaid-fi-dp2-gen_0]
2021-09-22 10:20:05.788  INFO 1 --- [ taskExecutor-1] o.s.batch.core.job.SimpleStepHandler     : Executing step: [fi-prepaid-fi-dp1-gen_0]
2021-09-22 10:20:17.076 ERROR 1 --- [ taskExecutor-2] o.s.batch.core.step.AbstractStep         : Encountered an error executing step fi-prepaid-fi-dp2-gen_0 in job fi-prepaid

org.springframework.cloud.dataflow.rest.client.DataFlowClientException: Operation: [create]  for kind: [Pod]  with name: [null]  in namespace: [vee-qa]  failed.
	at org.springframework.cloud.dataflow.rest.client.VndErrorResponseErrorHandler.handleError(VndErrorResponseErrorHandler.java:65) ~[spring-cloud-dataflow-rest-client-2.2.1.RELEASE.jar!/:2.2.1.RELEASE]
	at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63) ~[spring-web-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:776) ~[spring-web-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:734) ~[spring-web-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:668) ~[spring-web-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.web.client.RestTemplate.postForObject(RestTemplate.java:412) ~[spring-web-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.cloud.dataflow.rest.client.TaskTemplate.launch(TaskTemplate.java:169) ~[spring-cloud-dataflow-rest-client-2.2.1.RELEASE.jar!/:2.2.1.RELEASE]
	at org.springframework.cloud.task.app.composedtaskrunner.TaskLauncherTasklet.execute(TaskLauncherTasklet.java:138) ~[spring-cloud-starter-task-composedtaskrunner-2.1.4.RELEASE.jar!/:2.1.4.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:407) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:331) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140) ~[spring-tx-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:273) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:82) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:375) ~[spring-batch-infrastructure-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215) ~[spring-batch-infrastructure-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:145) ~[spring-batch-infrastructure-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:258) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:203) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:68) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.FlowState.handle(FlowState.java:56) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:94) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:91) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_192]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_192]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_192]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]
	
	
2021-09-22 10:20:17.078  INFO 1 --- [ taskExecutor-2] .t.a.c.ComposedTaskStepExecutionListener : AfterStep processing for stepExecution fi-prepaid-fi-dp2-gen_0
2021-09-22 10:20:17.078 ERROR 1 --- [ taskExecutor-2] o.s.batch.core.step.AbstractStep         : Exception in afterStep callback in step fi-prepaid-fi-dp2-gen_0 in job fi-prepaid

java.lang.IllegalArgumentException: TaskLauncherTasklet did not return a task-execution-id.  Check to see if task exists.
	at org.springframework.util.Assert.notNull(Assert.java:198) ~[spring-core-5.1.14.RELEASE.jar!/:5.1.14.RELEASE]
	at org.springframework.cloud.task.app.composedtaskrunner.ComposedTaskStepExecutionListener.afterStep(ComposedTaskStepExecutionListener.java:65) ~[spring-cloud-starter-task-composedtaskrunner-2.1.4.RELEASE.jar!/:2.1.4.RELEASE]
	at org.springframework.batch.core.listener.CompositeStepExecutionListener.afterStep(CompositeStepExecutionListener.java:62) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:242) ~[spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:68) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.FlowState.handle(FlowState.java:56) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:94) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at org.springframework.batch.core.job.flow.support.state.SplitState$1.call(SplitState.java:91) [spring-batch-core-4.1.3.RELEASE.jar!/:4.1.3.RELEASE]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_192]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_192]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_192]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]

Any pointers on what could be going wrong here?

RaickyDerwent, Sep 22 '21 10:09

Have you tried the latest Data Flow release, 2.8.1? We have moved CTR so that it is built with Data Flow itself (to better align with the APIs), and we've fixed some issues related to what you posted.

jvalkeal, Sep 22 '21 12:09
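
A note on reading the two stack traces in the original report: the `IllegalArgumentException` from the `afterStep` callback looks like a follow-on of the failed launch rather than a second, independent problem. The launch call threw before any task-execution-id could be recorded, and the listener then found none. The sketch below is a simplified, hypothetical model of that ordering in plain Java. The class name, the context key, and the helper methods are all assumptions for illustration and are not the actual Composed Task Runner code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical model of a step whose launcher and listener share a context. */
class FollowOnFailureDemo {

    // Illustrative key name; the key used by the real Composed Task Runner may differ.
    static final String EXECUTION_ID_KEY = "task-execution-id";

    /** Stands in for the tasklet: records an execution id only if the launch succeeds. */
    static void launch(Map<String, Object> context, boolean launchFails) {
        if (launchFails) {
            throw new IllegalStateException("Operation: [create] for kind: [Pod] failed.");
        }
        context.put(EXECUTION_ID_KEY, 42L);
    }

    /** Stands in for the afterStep callback: requires the execution id to be present. */
    static void afterStep(Map<String, Object> context) {
        if (context.get(EXECUTION_ID_KEY) == null) {
            throw new IllegalArgumentException(
                    "TaskLauncherTasklet did not return a task-execution-id.");
        }
    }

    /** Runs launch, then afterStep regardless of the outcome, collecting error messages in order. */
    static List<String> runStep(boolean launchFails) {
        Map<String, Object> context = new HashMap<>();
        List<String> errors = new ArrayList<>();
        try {
            launch(context, launchFails);
        } catch (RuntimeException e) {
            errors.add(e.getMessage());
        }
        try {
            // The callback still runs after a failed launch, as the log ordering suggests.
            afterStep(context);
        } catch (RuntimeException e) {
            errors.add(e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(runStep(true));   // failed launch: two errors, in log order
        System.out.println(runStep(false));  // successful launch: no errors
    }
}
```

If this reading is right, the error worth investigating is the first one (the failed `create` operation for the Pod); the second should disappear once the launch succeeds.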

Closing for lack of feedback, thanks for reporting. If there are any additional problems, please open a new issue.

markpollack, Sep 22 '22 12:09