zhixingheyi-tian
@HongW2019 If you have verified that the Spark TPCDS benchmark can run on GS, you can paste the script and result in this issue, just as a baseline.
cc @zhouyuan This has passed the Jenkins workloads.
http://sr242:18080/history/application_1654056252902_0086/SQL/execution/?id=3 The event log also shows that issue #928 is resolved by this patch.
This issue is caused by https://github.com/oap-project/gazelle_plugin/blob/a44b889a506cba43fa1d16e0c4b6ed3c7149cac4/native-sql-engine/cpp/src/codegen/arrow_compute/ext/array_item_index.h#L32-L33
```
struct ArrayItemIndexS {
  uint16_t id = 0;
  uint16_t array_id = 0;
```
There are 287 record batches with row_number > 64K from CSHJ, and...
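To illustrate why 16-bit fields matter here, a minimal sketch follows. It assumes that `id` holds the row position inside a record batch and `array_id` identifies the batch; the comment above does not state this. `StoredRowId`, `FitsInUint16Index`, and `MakeIndex` are hypothetical helpers written for this example and are not part of gazelle_plugin.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Mirrors only the two fields quoted above; the rest of the real struct is omitted.
struct ArrayItemIndexS {
  uint16_t id = 0;        // assumption: row position inside one record batch
  uint16_t array_id = 0;  // assumption: which record batch the row belongs to
};

// Value that ends up in a 16-bit field when a wider row position is stored.
// Unsigned narrowing wraps modulo 65536.
inline uint16_t StoredRowId(uint32_t row) {
  return static_cast<uint16_t>(row);
}

// True when every row position 0..num_rows-1 is representable in uint16_t.
inline bool FitsInUint16Index(uint64_t num_rows) {
  return num_rows <=
         static_cast<uint64_t>(std::numeric_limits<uint16_t>::max()) + 1;
}

// Hypothetical helper: builds an index entry and refuses values that would wrap.
inline ArrayItemIndexS MakeIndex(uint32_t array_id, uint32_t row) {
  assert(row <= std::numeric_limits<uint16_t>::max());
  assert(array_id <= std::numeric_limits<uint16_t>::max());
  ArrayItemIndexS idx;
  idx.id = static_cast<uint16_t>(row);
  idx.array_id = static_cast<uint16_t>(array_id);
  return idx;
}
```

Under these assumptions, in a batch of 70,000 rows, row 65,536 would be stored as 0 and row 69,999 as 4,463, so both would silently alias earlier rows of the same batch.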
cc @zhouyuan This has passed the Jenkins workloads.
By debugging, I have figured out that the cause is in Arrow's file_orc.cc: ``` Result Execute() override { ... Result Next() { if (i_ == num_stripes_) { return nullptr; } std::shared_ptr batch; // TODO...
cc @zhouyuan @zhztheplayer
I have tracked down the cause, as shown below. From the ORC library source code: https://github.com/apache/orc/blob/22828f79a526069d9629719c9476b7addad91ae6/c%2B%2B/src/Reader.cc#L120-L144 ``` void ColumnSelector::updateSelected(std::vector<bool>& selectedColumns, const RowReaderOptions& options) { selectedColumns.assign(static_cast<size_t>(contents->footer->types_size()), false); if (contents->schema->getKind() == STRUCT && options.getIndexesSet()) { for(std::list<uint64_t>::const_iterator...
@zhouyuan @zhztheplayer I think I can implement the same logic on the Arrow side, as is done for Parquet, to avoid modifying the ORC library code. At the same time, could we open a...
@dmsuehir Thanks very much. For stock TensorFlow 2.5.0, if I set os.environ["TF_ENABLE_ONEDNN_OPTS"] = '1', is it completely equivalent to Intel TensorFlow 2.5.0, and does it include all optimizations from...