datafusion-comet
datafusion-comet copied to clipboard
feat: Implement to_json for subset of types
Which issue does this PR close?
Closes https://github.com/apache/datafusion-comet/issues/631
Rationale for this change
This is part of our effort to support more operations on complex types.
Performance seems ok.
AMD Ryzen 9 7950X3D 16-Core Processor
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
to_json 147 152 2 1.4 721.3 1.0X
to_json: Comet (Scan) 141 145 3 1.4 692.0 1.0X
to_json: Comet (Scan, Exec) 80 88 5 2.6 391.9 1.8X
What changes are included in this PR?
This PR adds an implementation of to_json that works for a subset of types (structs, primitives, strings). There is no support for date or timestamp yet.
How are these changes tested?
New Rust tests and Spark tests.
Codecov Report
Attention: Patch coverage is 50.00000% with 16 lines in your changes missing coverage. Please review.
Project coverage is 33.82%. Comparing base (
4fe43ad) to head (5c2f551).
| Files | Patch % | Lines |
|---|---|---|
| .../scala/org/apache/comet/serde/QueryPlanSerde.scala | 50.00% | 9 Missing and 7 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #805 +/- ##
============================================
- Coverage 33.94% 33.82% -0.12%
+ Complexity 874 871 -3
============================================
Files 112 112
Lines 42916 42887 -29
Branches 9464 9456 -8
============================================
- Hits 14567 14508 -59
- Misses 25379 25395 +16
- Partials 2970 2984 +14
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@parthchandra could you review?
i would also love to review this. will plan it for tomorrow
@eejbyfeldt @Kimahriman you may also be interested in reviewing this one
LGTM
Thanks for the review @eejbyfeldt! It is really appreciated. Some very good feedback there. I will address the feedback over the next day or two.
Sorry @andygrove for this late review. I don't know if one can improve on @eejbyfeldt's review. To address some of the handling of escape characters, should we look at using something like serde_json ?
@parthchandra @eejbyfeldt This is ready for another review