spring-cloud-dataflow icon indicating copy to clipboard operation
spring-cloud-dataflow copied to clipboard

Timeout when querying composed-task executions

Open seonuk opened this issue 1 year ago • 3 comments

I am encountering a timeout issue when querying task executions in a composed-task setup. This seems to be related to the usage of a UNION ALL in a view table and the way the queries are joined.

From reviewing the code, it appears that the following part in the JdbcAggregateJobQueryDao class might be causing the issue:

java


private static final String GET_COUNT_BY_TASK_EXECUTION_ID = "SELECT COUNT(T.TASK_EXECUTION_ID) FROM AGGREGATE_JOB_EXECUTION E" +
    " JOIN AGGREGATE_TASK_BATCH B ON E.JOB_EXECUTION_ID = B.JOB_EXECUTION_ID AND E.SCHEMA_TARGET = B.SCHEMA_TARGET" +
    " JOIN AGGREGATE_TASK_EXECUTION T ON B.TASK_EXECUTION_ID = T.TASK_EXECUTION_ID AND B.SCHEMA_TARGET = T.SCHEMA_TARGET" +
    " WHERE T.TASK_EXECUTION_ID = ? AND T.SCHEMA_TARGET = ?";

It seems like the AGGREGATE_TASK_EXECUTION view is created using UNION ALL, which combines two tables (boot2_task_execution and boot3_task_execution). My concern is that the timeout might be occurring due to this view when joined with other tables during execution queries.

Question: Is this a known issue? Are there any plans to resolve this timeout issue in future releases?

Additionally, I am curious about the reasoning behind separating the boot2 and boot3 tables and querying them together using a view. If there isn't a specific reason for this, I am considering modifying the application to bypass the view and directly query the boot2 and boot3 tables, depending on the target schema. What are your thoughts on this approach?

seonuk avatar Oct 07 '24 07:10 seonuk

Hello @seonuk , Could you provide us the following information?

  • Spring Cloud Data Flow Version
  • Database type

SCDF supports Spring Batch 4 / 5 as well as Spring Cloud Task 2 / 3 schemas. This is done by maintaining the tables for each of the schemas. The goal of the view is to provide a unified search capability for those schemas.

cppwfs avatar Oct 08 '24 12:10 cppwfs

Hello @cppwfs

Here is the requested information:

  • Spring Cloud Data Flow Version: 2.11.5
  • Database Type: MySQL 8.0.28 (InnoDB)

It seems that slow queries are occurring frequently because the view table does not utilize indexes effectively during unified searches. We are currently using the GET /tasks/executions API, and to address the slow query issue, I am modifying the JdbcAggregateJobQueryDao class to directly query the rows from the relevant tables instead of querying the view. Here’s an example of the changes I’m making:

...

private static final String FROM_CLAUSE_TASK_EXEC_BATCH_BOOT2 = "JOIN TASK_TASK_BATCH B ON E.JOB_EXECUTION_ID = B.JOB_EXECUTION_ID" +
    " JOIN TASK_EXECUTION T ON B.TASK_EXECUTION_ID = T.TASK_EXECUTION_ID";


...

byJobExecutionIdAndSchemaPagingQueryProviderBoot2 = getPagingQueryProvider(
    FIELDS_WITH_STEP_COUNT_BOOT2,
    FROM_CLAUSE_TASK_EXEC_BATCH_BOOT2,
    FIND_BY_ID_SCHEMA
);
byJobExecutionIdAndSchemaPagingQueryProviderBoot3 = getPagingQueryProvider(
    FIELDS_WITH_STEP_COUNT_BOOT3,
    FROM_CLAUSE_TASK_EXEC_BATCH_BOOT3,
    FIND_BY_ID_SCHEMA
);

...
private List<TaskJobExecution> getJobExecutionPage(long jobExecutionId, String schemaTarget) {
    if ("boot2".equals(schemaTarget)) {
        return queryForProvider(
            byJobExecutionIdAndSchemaPagingQueryProviderBoot2,
            new JobExecutionRowMapper(true),
            0,
            2,
            jobExecutionId,
            schemaTarget
        );
    }
    return queryForProvider(
        byJobExecutionIdAndSchemaPagingQueryProviderBoot3,
        new JobExecutionRowMapper(true),
        0,
        2,
        jobExecutionId,
        schemaTarget
    );
}
...

This approach queries the rows directly from the corresponding tables (TASK_TASK_BATCH, TASK_EXECUTION) based on the schema target (boot2 or boot3), rather than going through the view.

If you have any suggestions or improvements, feel free to let me know!

seonuk avatar Oct 10 '24 05:10 seonuk

@seonuk The tasks/thinexecutions will provide much better performance because it only provides the top-level info with a link to the detail as required. The different is about 30:1 between the 2 APIs on MySQL8.

The response org.springframework.cloud.dataflow.rest.resource.TaskExecutionThinResource with a self link to the same as the self link in tasks/executions

corneil avatar Oct 14 '24 10:10 corneil

Closing this issue due to inactivity. If this has been closed in error please leave a comment letting us know to reopen it. Thank you.

cppwfs avatar Nov 12 '24 14:11 cppwfs