Timeout when querying composed-task executions
I am encountering a timeout issue when querying task executions in a composed-task setup. This seems to be related to the usage of a UNION ALL in a view table and the way the queries are joined.
From reviewing the code, it appears that the following part in the JdbcAggregateJobQueryDao class might be causing the issue:
java
private static final String GET_COUNT_BY_TASK_EXECUTION_ID = "SELECT COUNT(T.TASK_EXECUTION_ID) FROM AGGREGATE_JOB_EXECUTION E" +
" JOIN AGGREGATE_TASK_BATCH B ON E.JOB_EXECUTION_ID = B.JOB_EXECUTION_ID AND E.SCHEMA_TARGET = B.SCHEMA_TARGET" +
" JOIN AGGREGATE_TASK_EXECUTION T ON B.TASK_EXECUTION_ID = T.TASK_EXECUTION_ID AND B.SCHEMA_TARGET = T.SCHEMA_TARGET" +
" WHERE T.TASK_EXECUTION_ID = ? AND T.SCHEMA_TARGET = ?";
It seems like the AGGREGATE_TASK_EXECUTION view is created using UNION ALL, which combines two tables (boot2_task_execution and boot3_task_execution). My concern is that the timeout might be occurring due to this view when joined with other tables during execution queries.
Question: Is this a known issue? Are there any plans to resolve this timeout issue in future releases?
Additionally, I am curious about the reasoning behind separating the boot2 and boot3 tables and querying them together using a view. If there isn't a specific reason for this, I am considering modifying the application to bypass the view and directly query the boot2 and boot3 tables, depending on the target schema. What are your thoughts on this approach?
Hello @seonuk , Could you provide us the following information?
- Spring Cloud Data Flow Version
- Database type
SCDF supports Spring Batch 4 / 5 as well as Spring Cloud Task 2 / 3 schemas. This is done by maintaining the tables for each of the schemas. The goal of the view is to provide a unified search capability for those schemas.
Hello @cppwfs
Here is the requested information:
- Spring Cloud Data Flow Version: 2.11.5
- Database Type: MySQL 8.0.28 (InnoDB)
It seems that slow queries are occurring frequently because the view table does not utilize indexes effectively during unified searches. We are currently using the GET /tasks/executions API, and to address the slow query issue, I am modifying the JdbcAggregateJobQueryDao class to directly query the rows from the relevant tables instead of querying the view. Here’s an example of the changes I’m making:
...
private static final String FROM_CLAUSE_TASK_EXEC_BATCH_BOOT2 = "JOIN TASK_TASK_BATCH B ON E.JOB_EXECUTION_ID = B.JOB_EXECUTION_ID" +
" JOIN TASK_EXECUTION T ON B.TASK_EXECUTION_ID = T.TASK_EXECUTION_ID";
...
byJobExecutionIdAndSchemaPagingQueryProviderBoot2 = getPagingQueryProvider(
FIELDS_WITH_STEP_COUNT_BOOT2,
FROM_CLAUSE_TASK_EXEC_BATCH_BOOT2,
FIND_BY_ID_SCHEMA
);
byJobExecutionIdAndSchemaPagingQueryProviderBoot3 = getPagingQueryProvider(
FIELDS_WITH_STEP_COUNT_BOOT3,
FROM_CLAUSE_TASK_EXEC_BATCH_BOOT3,
FIND_BY_ID_SCHEMA
);
...
private List<TaskJobExecution> getJobExecutionPage(long jobExecutionId, String schemaTarget) {
if ("boot2".equals(schemaTarget)) {
return queryForProvider(
byJobExecutionIdAndSchemaPagingQueryProviderBoot2,
new JobExecutionRowMapper(true),
0,
2,
jobExecutionId,
schemaTarget
);
}
return queryForProvider(
byJobExecutionIdAndSchemaPagingQueryProviderBoot3,
new JobExecutionRowMapper(true),
0,
2,
jobExecutionId,
schemaTarget
);
}
...
This approach queries the rows directly from the corresponding tables (TASK_TASK_BATCH, TASK_EXECUTION) based on the schema target (boot2 or boot3), rather than going through the view.
If you have any suggestions or improvements, feel free to let me know!
@seonuk The tasks/thinexecutions will provide much better performance because it only provides the top-level info with a link to the detail as required. The different is about 30:1 between the 2 APIs on MySQL8.
The response org.springframework.cloud.dataflow.rest.resource.TaskExecutionThinResource with a self link to the same as the self link in tasks/executions
Closing this issue due to inactivity. If this has been closed in error please leave a comment letting us know to reopen it. Thank you.