Jin Chengcheng
Remove fastpfor
## What changes were proposed in this pull request?
It currently uses my Velox and Arrow branches; after my [PR](https://github.com/apache/arrow/pull/13614) is merged, it will use upstream Arrow. And gazelle-cpp will...
oap-project Arrow changed some fields to private and removed the codec from VectorLoader, disregarding the initial design. To fix that, change the decompression procedure to decompress in Java.
When I test reading a TPC-DS Parquet table, I get a core dump because bitWidth is 0.

SQL:
```
select s_store_sk from store where s_state = 'TN';
```
Substrait plan:
```
{"extensions":[{"extensionFunction":{"name":"is_not_null:str"}},{"extensionFunction":{"functionAnchor":1,"name":"equal:str_str"}},{"extensionFunction":{"functionAnchor":2,"name":"and:bool_bool"}}],"relations":[{"root":{"input":{"project":{"common":{"direct":{}},"input":{"read":{"common":{"direct":{}},"baseSchema":{"names":["s_store_sk","s_state"],"struct":{"types":[{"i64":{"nullability":"NULLABILITY_NULLABLE"}},{"string":{"nullability":"NULLABILITY_NULLABLE"}}]},"partitionColumns":{"columnType":["NORMAL_COL","NORMAL_COL"]}},"filter":{"scalarFunction":{"functionReference":2,"outputType":{"bool":{"nullability":"NULLABILITY_NULLABLE"}},"arguments":[{"value":{"scalarFunction":{"outputType":{"bool":{"nullability":"NULLABILITY_REQUIRED"}},"arguments":[{"value":{"selection":{"directReference":{"structField":{"field":1}}}}}]}}},{"value":{"scalarFunction":{"functionReference":1,"outputType":{"bool":{"nullability":"NULLABILITY_NULLABLE"}},"arguments":[{"value":{"selection":{"directReference":{"structField":{"field":1}}}}},{"value":{"literal":{"string":"TN"}}}]}}}]}},"localFiles":{"items":[{"uriFile":"file:///tmp/tpcds-generated/store/part-00000-4de8c37f-0e0f-4016-bd78-421a3af40edd-c000.snappy.parquet","length":"9171","parquet":{}}]}}},"expressions":[{"selection":{"directReference":{"structField":{}}}}]}},"names":["s_store_sk#3631"]}}]}
```
...
### Rationale for this change
### What changes are included in this PR?
Support adding an ArrowSchema to specify the C++ `CsvFragmentScanOptions.convert_options.column_types`, and use a Map to set the config, serialize in...
Spark implementation: https://github.com/apache/spark/blob/branch-3.5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ToPrettyString.scala This is an internal Spark function.
Spark implementation: https://github.com/apache/spark/blob/branch-3.5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala#L167
Add this benchmark to track OrderBy performance and to measure the effect of order-by optimizations. The initial benchmark result is:
```
============================================================================
[...]/exec/benchmarks/OrderByBenchmark.cpp     relative  time/iter   iters/s
============================================================================
OrderBy_no-payload_1_bigint_0.01k                           9.29ms    107.64
OrderBy_no-payload_2_bigint_0.01k                          10.06ms...
```
Each of the decimal operation functions is registered as two functions such as `add_deny_precision_loss` and `add`. When allowing precision loss, establishing the result type of an arithmetic operation happens according...
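To make the two variants concrete, here is a minimal Python sketch of how the result type of a decimal addition could be derived in each mode. It follows my understanding of Spark's decimal rules (a maximum precision of 38 and a minimum adjusted scale of 6) and is an assumption about what the registered functions do, not code from this PR. The names `add_result_type` and `adjust_precision_scale` are hypothetical, and negative scales are ignored for simplicity.

```python
# Illustrative sketch only: constants and rules follow my understanding of
# Spark's decimal arithmetic; they are not taken from this PR's code.
MAX_PRECISION = 38       # assumed maximum decimal precision
MIN_ADJUSTED_SCALE = 6   # assumed minimum scale kept when precision overflows


def adjust_precision_scale(precision, scale):
    """Shrink the scale (never below min(scale, 6)) so precision fits in 38."""
    if precision <= MAX_PRECISION:
        return precision, scale
    int_digits = precision - scale
    min_scale = min(scale, MIN_ADJUSTED_SCALE)
    adjusted_scale = max(MAX_PRECISION - int_digits, min_scale)
    return MAX_PRECISION, adjusted_scale


def add_result_type(p1, s1, p2, s2, allow_precision_loss):
    """Return (precision, scale) of decimal(p1, s1) + decimal(p2, s2)."""
    scale = max(s1, s2)
    precision = max(p1 - s1, p2 - s2) + scale + 1
    if allow_precision_loss:
        # Presumably the plain `add` variant: trade scale for integral digits.
        return adjust_precision_scale(precision, scale)
    # Presumably the `add_deny_precision_loss` variant: cap both at the maximum.
    return min(precision, MAX_PRECISION), min(scale, MAX_PRECISION)


if __name__ == "__main__":
    print(add_result_type(38, 10, 38, 10, allow_precision_loss=True))
    print(add_result_type(38, 10, 38, 10, allow_precision_loss=False))
```

Under these assumptions the two modes differ only when the unbounded precision exceeds 38: allowing precision loss reduces the scale to preserve integral digits, while denying it keeps the scale and caps the precision.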
Remove the Velox-to-row conversion, the subsequent UnsafeProjection, and the row-to-Velox conversion. Introduce ArrowProjection to support writing string and binary values to Arrow directly. MutableProjection writes the result to a buffer and then write...