
Tracking: file formats

Open youngsofun opened this issue 3 years ago • 0 comments

Summary

refactors

  • [x] new input pipeline
    • [x] new framework & replace copy https://github.com/datafuselabs/databend/pull/7613
    • [x] replace all
      • [x] streaming load
        • [x] https://github.com/datafuselabs/databend/pull/7756
        • [x] https://github.com/datafuselabs/databend/pull/7769
      • [x] clickhouse https://github.com/datafuselabs/databend/pull/7843
      • [x] rm unused code https://github.com/datafuselabs/databend/pull/7854
  • [ ] refactor2
    • [x] https://github.com/datafuselabs/databend/pull/8566
    • [x] https://github.com/datafuselabs/databend/pull/8700
    • [ ] refactor input https://github.com/datafuselabs/databend/pull/8778
    • [ ] refactor JsonValue encoder/decoder
    • [ ] remove FormatSettings.

speed up

  • [x] Parallel read

    • [x] TSV/ndjson (read beyond split boundary) https://github.com/datafuselabs/databend/pull/8199
    • [x] parquet (make big file loadable) https://github.com/datafuselabs/databend/pull/7903
  • [x] refactor NestedBufferReader https://github.com/datafuselabs/databend/issues/8486

    • [x] https://github.com/datafuselabs/databend/pull/8716
    • [ ] https://github.com/datafuselabs/databend/pull/8733
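
For readers unfamiliar with the "read beyond split boundary" item above, here is a minimal sketch of the general idea for newline-delimited formats such as TSV or ndjson. It is an illustration in Python, not Databend's implementation; `read_split` and `read_all_splits` are hypothetical names, and the ownership rule (a row belongs to the split containing its first byte) is an assumption chosen for this example.

```python
def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the rows owned by the byte range [start, end) of newline-delimited data."""
    pos = start
    if start > 0:
        # Assumed rule: a row belongs to the split that contains its first byte.
        # Searching from start - 1 keeps a row that begins exactly at `start`
        # and skips the tail of a row that began in an earlier split.
        newline = data.find(b"\n", start - 1)
        if newline == -1:
            return []
        pos = newline + 1
    rows = []
    while pos < end and pos < len(data):
        newline = data.find(b"\n", pos)
        # The row started inside this split, so read past `end` to finish it.
        stop = len(data) if newline == -1 else newline
        rows.append(data[pos:stop])
        pos = stop + 1
    return rows


def read_all_splits(data: bytes, split_size: int) -> list[list[bytes]]:
    """Read every fixed-size split independently (each call could run on its own thread)."""
    return [
        read_split(data, start, min(start + split_size, len(data)))
        for start in range(0, len(data), split_size)
    ]
```

Because each split decides row ownership from its own byte range alone, the splits need no coordination, and every row is read exactly once regardless of where the boundaries fall.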

distributed copy

  • [ ] https://github.com/datafuselabs/databend/issues/6395 @zhang2014 @RinChanNOWWW

streaming copy

  • [ ] https://github.com/datafuselabs/databend/issues/7889 @Xuanwo

compact

  • [x] https://github.com/datafuselabs/databend/issues/7760
    • [x] https://github.com/datafuselabs/databend/pull/7927
  • [x] https://github.com/datafuselabs/databend/pull/7927
  • [x] https://github.com/datafuselabs/databend/pull/7948
  • [x] https://github.com/datafuselabs/databend/pull/8644
  • [x] https://github.com/datafuselabs/databend/issues/8488

features

  • [ ] https://github.com/datafuselabs/databend/issues/8541
  • result
    • [ ] https://github.com/datafuselabs/databend/pull/8375
  • [ ] format settings/options
    • [ ] csv quote
    • [ ] skip
    • [ ] null default/bool
    • [ ] limit
    • [ ] error tolerance

copy

  • [ ] copy returns status as SQL results
  • [ ] per-file progress/affected rows
  • [x] https://github.com/datafuselabs/databend/issues/8642

streaming load

  • [x] https://github.com/datafuselabs/databend/issues/8604
  • [ ] https://github.com/datafuselabs/databend/issues/8243
  • [ ] https://github.com/datafuselabs/databend/issues/assigned/youngsofun

error handling

  • [ ] https://github.com/datafuselabs/databend/issues/6936

create format

  • [x] https://github.com/datafuselabs/databend/issues/8100

parquet

  • [ ] https://github.com/datafuselabs/databend/issues/8661

TSV

  • [x] https://github.com/datafuselabs/databend/pull/8606
  • [x] https://github.com/datafuselabs/databend/issues/8579

CSV

  • [x] https://github.com/datafuselabs/databend/pull/8527
  • [x] https://github.com/datafuselabs/databend/pull/8532
  • [x] https://github.com/datafuselabs/databend/pull/8459
  • [x] https://github.com/datafuselabs/databend/pull/8698
  • [x] https://github.com/datafuselabs/databend/issues/8088

other format

  • [ ] orc/arrow
    • [ ] https://github.com/datafuselabs/databend/issues/7655
    • [x] https://github.com/datafuselabs/databend/issues/8016
  • [ ] avro https://github.com/datafuselabs/databend/issues/8017
  • [ ] excel https://github.com/datafuselabs/databend/issues/7654

test

  • [x] https://github.com/datafuselabs/databend/issues/8101
  • [ ] add unit tests for new impls
    • can refer to the deleted cases for the old impls (https://github.com/datafuselabs/databend/pull/7854)

youngsofun avatar Sep 19 '22 12:09 youngsofun