BInwei Yang
**Describe the bug** On sr606, Q30's batch size exceeds the configured value: 32k configured, 93k observed. Possible operators: scan+proj, smj in hash join. http://sr606:18080/history/application_1651817828242_0164/SQL/execution/?id=5 **To Reproduce** runq 30 on sr606,...
SQL: select c_last_name, c_first_name, max(ss_customer_sk) ss_customer_sk_max, min(ss_customer_sk) ss_customer_sk_min, max(cast(c_customer_sk as string)) c_customer_sk_max, min(cast(c_customer_sk as string)) c_customer_sk_min, s_store_name, max(c_birth_country) c_birth_country_max, min(c_birth_country) c_birth_country_min, max(ca_country) ca_country_max,min(ca_country) ca_country_min, max(upper(ca_country)) ca_country_u_max,min(upper(ca_country)) ca_country_u_min, s_zip, ca_zip from...
Allocate binary buffer from combine buffer as well
allocate single combine buffer for all partitions
fix bug
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently, during spill, each reducer has its own file, which does not scale. The...
**Describe the bug** Repartitioning changes the null count: dfw=spark.read.format("arrow").load("/ss_customer_sk.parquet") dfw.where("ss_customer_sk is null").count() > 129583501 dfw.repartition(144).where("ss_customer_sk is null").count() > 64804994 **To Reproduce** **Expected behavior** The same count as without repartition...
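The expected behavior in the report above is that a null count does not depend on how rows are partitioned. A minimal Spark-free sketch of that invariant follows; `count_nulls` and `repartition` are hypothetical helpers written for illustration (a round-robin split standing in for a shuffle), not Spark or Gluten APIs, and the data is synthetic.

```python
import random


def count_nulls(rows):
    """Count None values, the analogue of `ss_customer_sk is null`."""
    return sum(1 for value in rows if value is None)


def repartition(rows, num_partitions):
    """Round-robin rows into num_partitions lists (hypothetical stand-in for a shuffle)."""
    parts = [[] for _ in range(num_partitions)]
    for index, value in enumerate(rows):
        parts[index % num_partitions].append(value)
    return parts


# Synthetic column with roughly 5% nulls; assumption for illustration only.
random.seed(0)
rows = [None if random.random() < 0.05 else random.randint(1, 1000)
        for _ in range(10_000)]

before = count_nulls(rows)
after = sum(count_nulls(part) for part in repartition(rows, 144))

# A correct shuffle moves rows between partitions but never drops or alters them.
assert before == after, f"null count changed: {before} -> {after}"
```

The same comparison against the real dataset, with and without `repartition(144)`, is what the reported counts fail.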
The 3 configs are missing in Gluten. ``` val COLUMNAR_VELOX_MAX_SPILL_RUN_ROWS = buildConf("spark.gluten.sql.columnar.backend.velox.MaxSpillRunRows") .internal() .doc("The maximum row size of a single spill run") .bytesConf(ByteUnit.BYTE) .createWithDefaultString("12M") val COLUMNAR_VELOX_MAX_SPILL_BYTES = buildConf("spark.gluten.sql.columnar.backend.velox.MaxSpillBytes") .internal() .doc("The...
### Backend VL (Velox) ### Bug description The returned row count is wrong after hash aggregate; the cause may be the partial or the final aggregation. Spark: Gluten: @zhztheplayer ###...
### Backend VL (Velox) ### Bug description Error message. ``` W20240621 15:14:01.929342 114227 Operator.cpp:641] Can't reclaim from memory pool op.5.0.0.Aggregation which is under non-reclaimable section, memory usage: 231.99MB, reservation: 232.00MB...
### Backend VL (Velox) ### Bug description ``` Error Source: RUNTIME Error Code: INVALID_STATE Reason: the key in unnest Operator only support field Retriable: False Expression: unnestFieldExpr != nullptr Function:...