[Enhancement] Support CSV header row for EXPORT and INSERT INTO FILES

Open tracymacding opened this issue 2 weeks ago • 5 comments

This PR adds support for including a header row in CSV file exports:

  1. EXPORT statement: Add with_header property to include column names as the first row in exported CSV files. Example: EXPORT TABLE t TO "path" PROPERTIES ("with_header" = "true")

  2. INSERT INTO FILES: Add csv.include_header property for the same functionality when using INSERT INTO FILES with CSV format. Example: INSERT INTO FILES ("format"="csv", "csv.include_header"="true", ...)
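
Both properties are passed as string values such as "true". As a rough illustration of how a writer factory might turn such a string into a flag, here is a minimal C++ sketch. The helper name `parse_bool_property`, the case-insensitive comparison, and the default of `false` for a missing key are assumptions made for illustration; they are not taken from the PR's actual parsing code.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper (not from the PR): returns true only when `key` is
// present and its value equals "true", compared case-insensitively.
// A missing key or any other value yields false, so no header is written
// unless the user asks for one.
inline bool parse_bool_property(const std::map<std::string, std::string>& props,
                                const std::string& key) {
    auto it = props.find(key);
    if (it == props.end()) {
        return false;
    }
    std::string value = it->second;
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value == "true";
}
```

With a property map containing `"csv.include_header" = "true"`, the helper returns `true` for that key and `false` for any key that is absent, which keeps existing statements header-free by default.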

Changes:

  • FE: Add property parsing in ExportStmt, ExportJob, ExportSink
  • FE: Add csv.include_header support in TableFunctionTable
  • BE: Add header writing logic in PlainTextBuilder for EXPORT
  • BE: Add header writing logic in CSVFileWriter for INSERT INTO FILES
  • Thrift: Add with_header field to TExportSink
  • Thrift: Add csv_include_header field to TTableFunctionTable
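
The header writing logic on the BE side amounts to emitting the column names as one extra row before the data. A minimal sketch of that step, assuming the header reuses the same column separator and row delimiter as the data rows and that column names need no quoting or escaping (both are assumptions, and `build_header_line` is a hypothetical name, not a function from the PR):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): joins column names into a single
// header line terminated by the row delimiter. Returns an empty string when
// there are no column names, so callers write nothing in that case.
// No quoting or escaping is applied to the names.
inline std::string build_header_line(const std::vector<std::string>& column_names,
                                     const std::string& column_separator,
                                     const std::string& row_delimiter) {
    if (column_names.empty()) {
        return "";
    }
    std::string line;
    for (std::size_t i = 0; i < column_names.size(); ++i) {
        if (i > 0) {
            line += column_separator;
        }
        line += column_names[i];
    }
    line += row_delimiter;
    return line;
}
```

Passing the separator and delimiter in explicitly keeps the header consistent with whatever the data rows use, for example a tab-separated export.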

Why I'm doing:

What I'm doing:

Fixes #issue

What type of PR is this:

  • [ ] BugFix
  • [x] Feature
  • [ ] Enhancement
  • [ ] Refactor
  • [ ] UT
  • [ ] Doc
  • [ ] Tool

Does this PR entail a change in behavior?

  • [ ] Yes, this PR will result in a change in behavior.
  • [x] No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • [ ] Parameter changes: default values, similar parameters but with different default values
  • [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
  • [ ] Feature removed
  • [ ] Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • [x] I have added test cases for my bug fix or my new feature
  • [ ] This pr needs user documentation (for new or modified features or behaviors)
    • [ ] I have added documentation for my new feature or new function
  • [ ] This is a backport pr

Bugfix cherry-pick branch check:

  • [x] I have checked the version labels which the pr will be auto-backported to the target branch
    • [ ] 4.0
    • [ ] 3.5
    • [ ] 3.4
    • [ ] 3.3

[!NOTE] Adds optional CSV header rows, wiring FE properties through Thrift to BE writers for EXPORT and INSERT INTO FILES.

  • Frontend:
    • ExportStmt/ExportJob/ExportSink: parse with_header, collect column_names, and pass via TExportSink.
    • TableFunctionTable: parse csv.include_header for INSERT INTO FILES and set in TTableFunctionTable.
  • Backend:
    • ExportSinkOperator: build PlainTextBuilder with header options from TExportSink.
    • PlainTextBuilder: add with_header and column_names options; write header on init; ensure header on finish().
    • CSVFileWriter: add include_header option (parsed in factory) and write header row.
    • table_function_table_sink: propagate csv.include_header to CSV writer options.
  • Thrift:
    • TExportSink: add column_names and with_header.
    • TTableFunctionTable: add csv_include_header.
  • Tests:
    • Add exec/plain_text_builder_test.cpp; register in be/test/CMakeLists.txt.
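
The "write header on init; ensure header on finish()" behavior described for PlainTextBuilder can be read as a write-once guard: the header is emitted at most once, and finishing without any rows still produces it. The following is a sketch of that pattern only. `HeaderOptions` and `SketchTextBuilder` are hypothetical stand-ins that write to an in-memory stream; they do not reproduce the real PlainTextBuilder interface.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical options struct mirroring the two options named in the summary.
struct HeaderOptions {
    bool with_header = false;
    std::vector<std::string> column_names;
    std::string column_separator = ",";
    std::string row_delimiter = "\n";
};

// Hypothetical builder: collects output in memory instead of writing a file.
class SketchTextBuilder {
public:
    explicit SketchTextBuilder(HeaderOptions options) : _options(std::move(options)) {}

    // Writes the header up front when it is enabled.
    void init() { write_header_once(); }

    // Guards against a caller that skipped init(), then appends one data row.
    void add_row(const std::vector<std::string>& values) {
        write_header_once();
        write_line(values);
    }

    // Ensures the header exists even if no rows were added, then returns the text.
    std::string finish() {
        write_header_once();
        return _out.str();
    }

private:
    void write_header_once() {
        if (!_options.with_header || _header_written) {
            return;
        }
        write_line(_options.column_names);
        _header_written = true;
    }

    void write_line(const std::vector<std::string>& fields) {
        for (std::size_t i = 0; i < fields.size(); ++i) {
            if (i > 0) {
                _out << _options.column_separator;
            }
            _out << fields[i];
        }
        _out << _options.row_delimiter;
    }

    HeaderOptions _options;
    std::ostringstream _out;
    bool _header_written = false;
};
```

Under these assumptions, an export of an empty table with the header enabled yields a file containing only the header row, while the same export with the header disabled yields an empty file.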

Written by Cursor Bugbot for commit c13a357facfd0915679f203a5c9d9f1ca5a94d0d. This will update automatically on new commits.

tracymacding Dec 12 '25 01:12

🧪 CI Insights

Here's what we observed from your CI run for c13a357f.

🟢 All jobs passed!

But CI Insights is watching 👀

mergify[bot] Dec 12 '25 01:12

[FE Incremental Coverage Report]

:x: fail : 10 / 22 (45.45%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: com/starrocks/load/ExportJob.java | 0 | 6 | 00.00% | [466, 467, 468, 469, 470, 472] |
| :large_blue_circle: com/starrocks/planner/ExportSink.java | 0 | 6 | 00.00% | [60, 61, 73, 76, 80, 136] |
| :large_blue_circle: com/starrocks/catalog/TableFunctionTable.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: com/starrocks/sql/ast/ExportStmt.java | 9 | 9 | 100.00% | [] |

github-actions[bot] Dec 12 '25 02:12

@cursor review

alvin-celerdata Dec 12 '25 04:12

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] Dec 16 '25 08:12

[BE Incremental Coverage Report]

:x: fail : 0 / 20 (00.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: src/exec/plain_text_builder.cpp | 0 | 12 | 00.00% | [27, 28, 48, 49, 50, 51, 52, 54, 55, 56, 57, 58] |
| :large_blue_circle: src/exec/pipeline/sink/export_sink_operator.cpp | 0 | 8 | 00.00% | [112, 113, 114, 116, 117, 118, 119, 123] |

github-actions[bot] Dec 16 '25 08:12

@cursor review

alvin-celerdata Dec 17 '25 02:12