
[Enhancement] Add more information to information_schema.task_runs

Open LiShuMing opened this issue 6 months ago • 5 comments

Why I'm doing:

  1. We cannot determine a task run's pending time (the time between when the task run is created and when it starts processing) from information_schema.task_runs or information_schema.materialized_views.
  2. For MV refresh tasks, one MV refresh may trigger multiple task runs, each of which processes only one to-refresh partition, but we cannot tell which task runs belong to the same job (the MV refresh task).

So I added more information to cover these two situations.
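
As a rough illustration of the first point, pending time can be derived once a task run exposes both its creation time and the time it started processing. The sketch below is a minimal Python example over a hand-written row. It assumes that `PROCESS_TIME` holds the moment processing began and that `CREATE_TIME` is the creation timestamp; the helper name `pending_seconds` and the sample values are hypothetical and are not taken from this PR.

```python
from datetime import datetime

# Assumed timestamp layout for the sample values below.
FMT = "%Y-%m-%d %H:%M:%S"

def pending_seconds(create_time: str, process_time: str) -> float:
    """Seconds a task run waited between creation and the start of processing."""
    created = datetime.strptime(create_time, FMT)
    started = datetime.strptime(process_time, FMT)
    return (started - created).total_seconds()

# Hypothetical row shaped like a task_runs record (values are invented).
row = {"CREATE_TIME": "2025-06-19 04:00:00", "PROCESS_TIME": "2025-06-19 04:00:42"}
print(pending_seconds(row["CREATE_TIME"], row["PROCESS_TIME"]))
```

The same subtraction could be done directly in a query against the system table; the Python form is used here only to keep the example self-contained.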

What I'm doing:

This pull request introduces enhancements to the schema scanner and materialized views system, focusing on adding new fields, improving data type consistency, and refining the handling of materialized view refresh statuses. The changes span multiple files and primarily affect the backend schema scanner, frontend catalog system, and materialized view status representation.

Backend Schema Scanner Enhancements:

  • Added new columns LAST_REFRESH_PROCESS_TIME and LAST_REFRESH_JOB_ID to the SchemaMaterializedViewsScanner and updated the fill_chunk method to include these fields. [1] [2]
  • Added new columns JOB_ID, JOB_STATE, and PROCESS_TIME to the SchemaTaskRunsScanner, with corresponding updates to the fill_chunk method to handle these fields. [1] [2]

Frontend Catalog System Updates:

  • Updated MaterializedViewsSystemTable to include new columns (LAST_REFRESH_PROCESS_TIME, LAST_REFRESH_JOB_ID, etc.) and changed data types for several existing columns (e.g., MATERIALIZED_VIEW_ID to BIGINT, LAST_REFRESH_DURATION to DOUBLE). [1] [2]
  • Enhanced TaskRunsSystemTable by adding JOB_ID, JOB_STATE, and PROCESS_TIME columns, along with corresponding data type adjustments. [1] [2]
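
To show how a shared `JOB_ID` could help group the per-partition task runs of one MV refresh, here is a minimal Python sketch. It assumes every task run triggered by the same refresh carries the same `JOB_ID` value and uses `QUERY_ID` as the task run identifier; the helper name `group_by_job` and the sample rows are hypothetical.

```python
from collections import defaultdict

def group_by_job(task_runs):
    """Map each JOB_ID to the QUERY_IDs of the task runs that share it."""
    jobs = defaultdict(list)
    for run in task_runs:
        jobs[run["JOB_ID"]].append(run["QUERY_ID"])
    return dict(jobs)

# Hypothetical rows: two task runs from one refresh job, one from another.
runs = [
    {"QUERY_ID": "run-1", "JOB_ID": "job-a"},
    {"QUERY_ID": "run-2", "JOB_ID": "job-a"},
    {"QUERY_ID": "run-3", "JOB_ID": "job-b"},
]
print(group_by_job(runs))
```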

Materialized View Status Improvements:

  • Added fields jobId and mvRefreshProcessTime to ShowMaterializedViewStatus to track job-specific and process-related information. [1] [2] [3]
  • Updated methods in ShowMaterializedViewStatus to populate and display the new fields (jobId, mvRefreshProcessTime) in both thrift and result set representations. [1] [2] [3]

Data Type Consistency:

  • Standardized data types in ShowMaterializedViewsStmt and MaterializedViewsSystemTable, converting several fields (e.g., id, task_id, rows) to BIGINT and others (e.g., last_refresh_duration) to more appropriate types like DOUBLE or DATETIME. [1] [2]

TaskRun Status Refinement:

  • Enhanced TaskRunStatus to better handle refresh states for materialized views, including distinguishing between running and finished states. [1] [2]

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [ ] Feature
  • [x] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [x] Yes, this PR will result in a change in behavior.
  • [ ] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [x] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [ ] 3.5
    • [ ] 3.4
    • [ ] 3.3

LiShuMing • Jun 19 '25 04:06

@LiShuMing, this will change the user interface, so please don't cherry-pick it to v3.5.

alvin-celerdata • Jun 19 '25 06:06

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] • Jun 21 '25 04:06

[FE Incremental Coverage Report]

:x: fail : 121 / 155 (78.06%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: com/starrocks/qe/ShowExecutor.java | 1 | 3 | 33.33% | [749, 750] |
| :large_blue_circle: com/starrocks/qe/ShowMaterializedViewStatus.java | 75 | 106 | 70.75% | [332, 373, 374, 375, 775, 778, 779, 780, 781, 783, 784, 785, 786, 787, 788, 790, 791, 792, 793, 794, 795, 803, 804, 805, 806, 807, 816, 817, 818, 819, 820] |
| :large_blue_circle: com/starrocks/scheduler/persist/TaskRunStatus.java | 15 | 16 | 93.75% | [162] |
| :large_blue_circle: com/starrocks/sql/ast/ShowMaterializedViewsStmt.java | 8 | 8 | 100.00% | [] |
| :large_blue_circle: com/starrocks/sql/plan/PlanFragmentBuilder.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: com/starrocks/catalog/system/information/MaterializedViewsSystemTable.java | 15 | 15 | 100.00% | [] |
| :large_blue_circle: com/starrocks/catalog/system/information/TaskRunsSystemTable.java | 4 | 4 | 100.00% | [] |
| :large_blue_circle: com/starrocks/scheduler/TaskRun.java | 1 | 1 | 100.00% | [] |

github-actions[bot] • Jun 21 '25 04:06

[BE Incremental Coverage Report]

:x: fail : 1 / 22 (04.55%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: be/src/exec/schema_scanner/schema_materialized_views_scanner.cpp | 0 | 2 | 00.00% | [112, 115] |
| :large_blue_circle: be/src/exec/schema_scanner/schema_task_runs_scanner.cpp | 1 | 20 | 05.00% | [277, 279, 280, 281, 282, 283, 284, 286, 288, 289, 290, 291, 292, 293, 294, 296, 297, 298, 301] |

github-actions[bot] • Jun 21 '25 04:06