spark
Apache Spark - A unified analytics engine for large-scale data processing
### What changes were proposed in this pull request? This PR addresses a type casting issue in Spark SQL where comparing a string column to an integer value results in...
### What changes were proposed in this pull request? Introducing two new error classes instead of _LEGACY_ERROR_TEMP_2000. Classes introduced: - DATETIME_ARGUMENT_OUT_OF_RANGE - ILLEGAL_INTERVAL_ARGUMENT_VALUE ### Why are the changes needed? We...
### What changes were proposed in this pull request? Replace AnyRefMap with HashMap ### Why are the changes needed? HashMap has better performance in Scala 2.13: https://issues.apache.org/jira/browse/SPARK-49491 ### Does this...
### What changes were proposed in this pull request? This PR: - Adds support for DATE_TRUNC in V2 optimization pushdown - Consumes this pushdown for Postgres & H2 Connectors ###...
### What changes were proposed in this pull request? DataFrameReader has three APIs for reading JSON: `json(DataSet[String])`, `json(Rdd)`, and `json(filePath)`. The first two APIs respect the nullability of the user-provided schema when spark flag...
### What changes were proposed in this pull request? In this PR I propose that we add classification of JDBC driver exceptions that represent syntax errors on external databases. Queries...
### What changes were proposed in this pull request? This PR proposes to provide proper error conditions for `_LEGACY_ERROR_TEMP_2000` and improve the error message ### Why are the changes needed?...
### What changes were proposed in this pull request? This PR adds support for defining an event time column in the output dataset of the `TransformWithStateInPandas` operator. The new event time column...
### What changes were proposed in this pull request? This PR fixes the pushdown of the `^` operator (XOR operator) for Postgres. Those two databases use this as the exponent operator, rather than...
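
To illustrate the XOR pushdown issue in the last entry, here is a minimal sketch of why the operator has to be translated per dialect. In PostgreSQL, `^` means exponentiation and `#` is bitwise XOR. The function name `compile_xor` and the dialect strings are hypothetical, chosen for illustration only; this is not Spark's dialect code and does not reproduce the fix in that PR.

```python
def compile_xor(left: str, right: str, dialect: str) -> str:
    """Render a bitwise XOR expression as SQL text for a target dialect.

    Hypothetical helper for illustration; not part of Spark.
    """
    # PostgreSQL treats ^ as exponentiation and uses # for bitwise XOR,
    # so pushing ^ down unchanged would silently change the result.
    op = "#" if dialect == "postgresql" else "^"
    return f"({left} {op} {right})"


print(compile_xor("a", "b", "postgresql"))
print(compile_xor("a", "b", "generic"))
```

Under these assumptions, the first call renders the expression with `#` and the second keeps `^`, so the same logical XOR survives pushdown to either kind of database.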