Robert (Bobby) Evans
Another odd example of this is +INF and -INF. Even if `allowNonNumericNumbers` is disabled, +INF and -INF are valid floats and are normalized to "Infinity" and "-Infinity", respectively. And the...
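For reference, here is a minimal repro sketch, assuming a local SparkSession; the schema and input rows are made up for illustration, and the exact behavior can shift between Spark versions (see the follow-up below):

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Quoted "+INF"/"-INF" values are still accepted as floats even though
// allowNonNumericNumbers is off, and they come back as Infinity/-Infinity.
val schema = StructType(Seq(StructField("f", FloatType)))
val ds = Seq("""{"f": "+INF"}""", """{"f": "-INF"}""").toDS()
spark.read
  .option("allowNonNumericNumbers", "false")
  .schema(schema)
  .json(ds)
  .show()
```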
Technically in Spark 4.0 this was reverted (at least for scan, by default): https://issues.apache.org/jira/browse/SPARK-48148 https://github.com/apache/spark/pull/46408. This functionality was put under a config, `spark.sql.json.enableExactStringParsing`, which is on by default. It appears...
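If you want to experiment with it, something like this should work (a sketch assuming Spark 4.0 defaults; the sample row and schema are made up for illustration):

```
import org.apache.spark.sql.types._
import spark.implicits._

// On by default in 4.0: a nested object read back as STRING keeps its exact
// input text instead of being parsed and re-serialized (which could, for
// example, rewrite 1.00 as 1.0).
spark.conf.set("spark.sql.json.enableExactStringParsing", "true")

val schema = StructType(Seq(StructField("a", StringType)))
val ds = Seq("""{"a": {"b": 1.00}}""").toDS()
spark.read.schema(schema).json(ds).show(truncate = false)
// Flip the config to "false" to get the older normalizing behavior back.
```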
I was able to get the code to fall back, but I don't think that it matters at all.
```
val df1 = spark.read.parquet("/data/tpcds/SF200_parquet_decimal/store_sales/")
  .select("ss_sold_date_sk", "ss_sold_time_sk")
  .filter("ss_sold_date_sk = 0")
val df2 = spark.read.parquet("/data/tpcds/SF200_parquet_decimal/store_sales/")
  .selectExpr("ss_sold_date_sk", ...
```
@liurenjie1024 why do we need to force it to be per-file? Is it because we don't want to merge files, so we can insert a count for the number...
https://github.com/rapidsai/cudf/pull/13373#issuecomment-2168287320
@binmahone sorry for the late reply. No, it is not exactly the same as what happens today with `tryMergeAggregatedBatches`. `tryMergeAggregatedBatches` happens after an initial aggregation pass through all of the...
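To make the distinction concrete, here is a hedged sketch of that existing flow (not the real spark-rapids code; `Batch`, `aggregate`, `merge`, and `fitsInTarget` are hypothetical stand-ins):

```
// Hypothetical stand-ins for illustration only.
case class Batch(rows: Long, bytes: Long)

def aggregate(b: Batch): Batch = b                 // placeholder per-batch agg
def merge(a: Batch, b: Batch): Batch =             // placeholder concatenation
  Batch(a.rows + b.rows, a.bytes + b.bytes)
def fitsInTarget(a: Batch, b: Batch): Boolean =
  a.bytes + b.bytes <= (1L << 30)                  // placeholder size target

def aggregateAll(input: Iterator[Batch]): Seq[Batch] = {
  // Pass 1: run the aggregation over every input batch first.
  val aggregated = input.map(aggregate).toList
  // Pass 2: only after all of the input has been aggregated does the
  // tryMergeAggregatedBatches-style merge kick in, combining neighboring
  // aggregated batches that still fit in the target batch size.
  aggregated.foldLeft(List.empty[Batch]) {
    case (head :: tail, next) if fitsInTarget(head, next) =>
      aggregate(merge(head, next)) :: tail
    case (acc, next) => next :: acc
  }.reverse
}
```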
We probably also want to come up with a set of standardized benchmarks to cover this use case, as NDS does not cover it well. https://github.com/NVIDIA/spark-rapids/pull/11376#issuecomment-2400253511 is a comment I...