Apache Spark - A unified analytics engine for large-scale data processing

Results 649 spark issues

### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested?

ML

### What changes were proposed in this pull request? A new case is added to Boolean simplification to convert conditions of the form `(a==b) || (a==null&&b==null)` to `a<=>b`. ### Why are the...

SQL
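
The rewrite described in the summary above can be checked by hand with a small three-valued-logic model. The following is a minimal sketch in plain Python, not Spark's optimizer code: `None` stands in for SQL NULL, and all function names here are hypothetical helpers introduced for illustration. It assumes the condition is used in a filter, where a NULL result is treated the same as false.

```python
from itertools import product

def sql_eq(a, b):
    """SQL `=`: yields NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def sql_and(x, y):
    """Three-valued AND over True / False / None."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def sql_or(x, y):
    """Three-valued OR over True / False / None."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def null_safe_eq(a, b):
    """Null-safe equality (`<=>`): never yields NULL."""
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b

def original_form(a, b):
    """(a = b) OR (a IS NULL AND b IS NULL)."""
    return sql_or(sql_eq(a, b), sql_and(a is None, b is None))

def passes_filter(value):
    """A filter keeps a row only when the predicate is exactly TRUE."""
    return value is True

# Compare both forms over every combination of a small sample domain.
values = [None, 1, 2]
mismatches = [
    (a, b)
    for a, b in product(values, repeat=2)
    if passes_filter(original_form(a, b)) != passes_filter(null_safe_eq(a, b))
]
print(mismatches)
```

In this model the two forms keep the same rows under a filter, but their raw values differ when exactly one side is NULL: the original form evaluates to NULL while null-safe equality evaluates to false.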

### What changes were proposed in this pull request? ViewCatalog API described in [SPIP](https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing). ### Why are the changes needed? First step towards DataSourceV2 view support. ### Does this PR...

SQL

### What changes were proposed in this pull request? The PR implements a mechanism to get credentials for a cloud service like AWS from an external credentials provider, and share...

DOCS
CORE

### What changes were proposed in this pull request? Remove an unnecessary parameter of `PartitionedFileUtil.splitFiles`. ### Why are the changes needed? Make code clearer. ### Does this PR introduce...

SQL

### What changes were proposed in this pull request? Port range specification is supported by specifying the starting port and configuring the number of port retries. Introduction of new parameters:...

CORE
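
To illustrate how a starting port plus a retry count implies a port range, here is a minimal sketch in plain Python. It assumes each retry increments the previously attempted port by 1, so the range runs from the starting port to the starting port plus the retry count. The function name `candidate_ports` is hypothetical, and capping at 65535 is a simplification of this sketch; Spark's actual behavior at the upper bound may differ. The new parameters mentioned in the summary are truncated in the listing and are not modeled here.

```python
def candidate_ports(start_port: int, max_retries: int) -> list[int]:
    """Return the ports that would be tried, in order (hypothetical helper).

    Assumes each retry uses the previous port + 1, and caps the range at
    65535, the highest valid TCP port (a simplification for this sketch).
    """
    if not 0 < start_port <= 65535:
        raise ValueError("start_port must be in 1..65535")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    last_port = min(start_port + max_retries, 65535)
    return list(range(start_port, last_port + 1))

# One initial attempt plus three retries.
ports = candidate_ports(4040, 3)
print(ports)
```

With a starting port of 4040 and 3 retries, the sketch yields four candidate ports: the initial attempt plus one per retry.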

### What changes were proposed in this pull request? Move LIMIT/OFFSET CheckAnalysis error messages to use the new error framework. This will help improve the usability of Apache Spark as...

SQL
CORE

### What changes are proposed in this pull request? This proposes to add SQLMetrics instrumentation for Python UDF execution, including Pandas UDF, and related operations such as MapInPandas and MapInArrow....

SQL
BUILD
DOCS
CORE
PYTHON

DO-NOT-MERGE! May need to split the PR into multiple PRs of reviewable size. Use this only for reference. ### What changes were proposed in this pull request? ### Why are...

SQL
STRUCTURED STREAMING
BUILD
CORE
PYTHON

### What changes were proposed in this pull request? Add two parameters to allow customizing maxBroadcastTableBytes and maxBroadcastRows ### Why are the changes needed? Recently, we encountered some driver OOM...

SQL