[Enhancement] Support table_write_log statistic table

Open meegoo opened this issue 1 month ago • 15 comments

Why I'm doing:

To provide visibility into tablet write operations (including load and compaction) for monitoring and analyzing write amplification in the storage layer. Without this feature, it is difficult to understand the I/O behavior of tablets and diagnose write amplification issues that can impact storage efficiency and performance.

What I'm doing:

  • Add a new be_tablet_write_log system table in information_schema to expose tablet write operations
  • Implement TabletWriteLogManager in BE to capture and buffer write logs from both load and compaction tasks, with configurable buffer size and retention time
  • Create SchemaBeTabletWriteLogScanner to scan write log entries from memory buffer
  • Add TabletWriteLogHistorySyncer in FE to periodically sync write log data from all BEs to a persistent history table (statistics.table_write_log)
  • Extend CompactRequest protobuf with table_id and partition_id fields to enable complete tracking of compaction operations
  • Add new configuration options: enable_tablet_write_log, tablet_write_log_buffer_size (default 100000), and tablet_write_log_retention_time_ms (default 30 minutes)
  • The write log tracks key metrics including: txn_id, tablet_id, table_id, partition_id, log_type (LOAD/COMPACTION), input/output rows and bytes, segment counts, and compaction metadata
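
The buffering behavior described in the list above (a bounded in-memory buffer with a retention time) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual TabletWriteLogManager: the names WriteLogEntry, WriteLogBuffer, append, and snapshot are hypothetical, the entry carries only a subset of the listed fields, and treating a tablet_id of 0 as "no filter" is an arbitrary choice made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical entry layout. Field names follow the metrics listed in the
// PR description, but the real struct may differ.
enum class LogType { LOAD, COMPACTION };

struct WriteLogEntry {
    int64_t finish_time_ms = 0;
    int64_t txn_id = 0;
    int64_t tablet_id = 0;
    LogType log_type = LogType::LOAD;
    int64_t input_bytes = 0;
    int64_t output_bytes = 0;
};

// Bounded in-memory buffer (hypothetical): drops the oldest entry when full
// and drops entries older than the retention window on every access.
class WriteLogBuffer {
public:
    WriteLogBuffer(size_t capacity, int64_t retention_ms)
            : _capacity(capacity), _retention_ms(retention_ms) {
        assert(capacity > 0);
    }

    void append(const WriteLogEntry& entry, int64_t now_ms) {
        std::lock_guard<std::mutex> guard(_mutex);
        evict_expired(now_ms);
        if (_entries.size() >= _capacity) {
            _entries.pop_front(); // buffer full: overwrite the oldest entry
        }
        _entries.push_back(entry);
    }

    // Returns a copy of the retained entries. A tablet_id of 0 means
    // "no filter" in this sketch.
    std::vector<WriteLogEntry> snapshot(int64_t now_ms, int64_t tablet_id = 0) {
        std::lock_guard<std::mutex> guard(_mutex);
        evict_expired(now_ms);
        std::vector<WriteLogEntry> out;
        for (const auto& e : _entries) {
            if (tablet_id == 0 || e.tablet_id == tablet_id) {
                out.push_back(e);
            }
        }
        return out;
    }

private:
    // Caller must hold _mutex. Assumes entries arrive in time order.
    void evict_expired(int64_t now_ms) {
        while (!_entries.empty() &&
               now_ms - _entries.front().finish_time_ms > _retention_ms) {
            _entries.pop_front();
        }
    }

    const size_t _capacity;
    const int64_t _retention_ms;
    std::mutex _mutex;
    std::deque<WriteLogEntry> _entries;
};
```

In this sketch the capacity and retention arguments play the roles of tablet_write_log_buffer_size and tablet_write_log_retention_time_ms. Passing the current time explicitly keeps the eviction logic easy to unit test.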

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [ ] Feature
  • [x] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [ ] 4.0
    • [ ] 3.5
    • [ ] 3.4
    • [ ] 3.3

[!NOTE] Adds in-memory tablet write logs for LOAD/COMPACTION, exposes them via information_schema.be_tablet_write_log, wires logging into lake writer/compaction, and periodically syncs to a persisted history table.

  • Backend (BE):
    • Write Log Manager: Implement lake/TabletWriteLogManager (in-memory ring buffer with filters/cleanup) and expose via new schema scanner SchemaBeTabletWriteLogScanner; register in schema_scanner.cpp and CMake.
    • Lake Integration: Record write logs in DeltaWriter (LOAD) and Horizontal/VerticalCompactionTask (COMPACTION), including begin/finish times, rows/bytes, segments, label/type.
    • Compaction Context: Extend CompactionTaskContext with table_id/partition_id; resolve from StarOS in TabletManager::compact.
    • Metrics: Track input_bytes in DeltaWriterStat and capture writer open time.
    • Config: Add enable_tablet_write_log and tablet_write_log_buffer_size.
  • Frontend (FE):
    • Info Schema: Add information_schema.be_tablet_write_log (BeTabletWriteLogSystemTable) and register in InfoSchemaDb; add ID in SystemId.
    • History Sync: Add TabletWriteLogHistorySyncer and hook into TableKeeper and GlobalStateMgr to periodically insert into _statistics_.tablet_write_log_history.
  • Thrift: Add SCH_BE_TABLET_WRITE_LOG enum.
  • Tests:
    • Add BE unit tests for TabletWriteLogManager and schema scanner; update CMake to build them.
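
As a rough illustration of how the recorded LOAD and COMPACTION entries could feed the write amplification analysis this PR is motivated by, the sketch below divides the bytes written by loads and compactions by the bytes ingested by loads. The ratio definition, the WriteRecord struct, and the write_amplification function are assumptions made for illustration; the PR itself does not define such a metric.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, trimmed-down view of a write log row.
enum class WriteKind { LOAD, COMPACTION };

struct WriteRecord {
    WriteKind kind;
    int64_t input_bytes;
    int64_t output_bytes;
};

// Assumed definition: (bytes written by loads and compactions) divided by
// (bytes ingested by loads). Returns 0.0 when nothing was ingested.
inline double write_amplification(const std::vector<WriteRecord>& records) {
    int64_t ingested = 0;
    int64_t written = 0;
    for (const auto& r : records) {
        written += r.output_bytes;
        if (r.kind == WriteKind::LOAD) {
            ingested += r.input_bytes;
        }
    }
    if (ingested == 0) {
        return 0.0;
    }
    return static_cast<double>(written) / static_cast<double>(ingested);
}
```

Because output bytes may be compressed or encoded, a ratio computed this way can fall below 1 for load-only workloads, so it is better read as a trend over time than as an absolute figure.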

Written by Cursor Bugbot for commit 04e861113e9b921a8307b5250a8e42ee648b63d8. This will update automatically on new commits.

meegoo avatar Nov 25 '25 01:11 meegoo

@cursor review

alvin-celerdata avatar Nov 25 '25 02:11 alvin-celerdata

@cursor review

alvin-celerdata avatar Nov 25 '25 04:11 alvin-celerdata

@cursor review

alvin-celerdata avatar Nov 25 '25 17:11 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 01 '25 19:12 alvin-celerdata

🧪 CI Insights

Here's what we observed from your CI run for c4bf5a7e.

🟢 All jobs passed!

But CI Insights is watching 👀

mergify[bot] avatar Dec 10 '25 09:12 mergify[bot]

@cursor review

alvin-celerdata avatar Dec 10 '25 17:12 alvin-celerdata

🚨 Bugbot couldn't run

Something went wrong. Try again by commenting "Cursor review" or "bugbot run", or contact support (requestId: serverGenReqId_e062f390-5011-4557-9bc7-fe20c0e4d623).

cursor[bot] avatar Dec 10 '25 17:12 cursor[bot]

@cursor review

alvin-celerdata avatar Dec 11 '25 15:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 11 '25 21:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 12 '25 04:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 12 '25 07:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 12 '25 17:12 alvin-celerdata

@cursor review

alvin-celerdata avatar Dec 15 '25 17:12 alvin-celerdata

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] avatar Dec 16 '25 02:12 github-actions[bot]

[FE Incremental Coverage Report]

:white_check_mark: pass : 46 / 52 (88.46%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: com/starrocks/lake/TabletWriteLogHistorySyncer.java | 22 | 27 | 81.48% | [96, 97, 105, 106, 110] |
| :large_blue_circle: com/starrocks/catalog/system/information/BeTabletWriteLogSystemTable.java | 20 | 21 | 95.24% | [27] |
| :large_blue_circle: com/starrocks/catalog/system/information/InfoSchemaDb.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: com/starrocks/server/GlobalStateMgr.java | 2 | 2 | 100.00% | [] |
| :large_blue_circle: com/starrocks/scheduler/history/TableKeeper.java | 1 | 1 | 100.00% | [] |

github-actions[bot] avatar Dec 16 '25 02:12 github-actions[bot]

[BE Incremental Coverage Report]

:white_check_mark: pass : 199 / 235 (84.68%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: be/src/storage/lake/tablet_manager.cpp | 1 | 20 | 05.00% | [75, 76, 78, 79, 80, 81, 84, 85, 86, 87, 90, 1119, 1120, 1121, 1122, 1123, 1125, 1126, 1129] |
| :large_blue_circle: be/src/storage/lake/horizontal_compaction_task.cpp | 3 | 8 | 37.50% | [153, 154, 155, 156, 158] |
| :large_blue_circle: be/src/storage/lake/vertical_compaction_task.cpp | 3 | 8 | 37.50% | [125, 126, 127, 128, 130] |
| :large_blue_circle: be/src/storage/lake/delta_writer.cpp | 3 | 7 | 42.86% | [751, 752, 753, 755] |
| :large_blue_circle: be/src/exec/schema_scanner/schema_be_tablet_write_log_scanner.cpp | 93 | 96 | 96.88% | [64, 67, 126] |
| :large_blue_circle: be/src/exec/schema_scanner.cpp | 2 | 2 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/tablet_write_log_manager.cpp | 85 | 85 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/tablet_write_log_manager.h | 6 | 6 | 100.00% | [] |
| :large_blue_circle: be/src/storage/lake/compaction_task_context.h | 3 | 3 | 100.00% | [] |

github-actions[bot] avatar Dec 16 '25 02:12 github-actions[bot]

@cursor review

alvin-celerdata avatar Dec 16 '25 05:12 alvin-celerdata