Apache Spark - A unified analytics engine for large-scale data processing

Results 649 spark issues

### What changes were proposed in this pull request? This PR aims to upgrade slf4j from 1.7.36 to 2.0.x. The main changes are as follows: 1. Upgrade slf4j version from 1.7.36...

BUILD
CORE

### What changes were proposed in this pull request? Implement `GroupBy.prod`. ### Why are the changes needed? For API coverage. ### Does this PR introduce _any_ user-facing change? Yes, the...

CORE
PYTHON
PANDAS API ON SPARK
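
For readers unfamiliar with the operation named in the entry above, a group-wise product multiplies together all values that share a key. The following is a minimal plain-Python sketch of that semantics only. It is not the pandas API on Spark implementation, and the helper name `group_prod` and the `(key, value)` input shape are assumptions made for illustration.

```python
from collections import defaultdict
from math import prod  # available in Python 3.8+


def group_prod(rows):
    """Multiply together the values that share a key.

    `rows` is an iterable of (key, value) pairs. This helper is hypothetical
    and only illustrates what a group-wise product computes.
    """
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # One product per key, in first-seen key order.
    return {key: prod(values) for key, values in groups.items()}


result = group_prod([("a", 2), ("a", 3), ("b", 4)])
```

Here `result` maps each key to the product of its values, so key `"a"` ends up with 2 * 3 and key `"b"` with 4.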

### What changes were proposed in this pull request? When running a Spark application against Spark 3.3, I see the following: ``` java.lang.IllegalArgumentException: Unsupported data source V2 partitioning type: CustomPartitioning...

SQL

### What changes were proposed in this pull request? This PR changes how columns with mixed dates and timestamps are supported in CSV schema inference and data...

SQL
DOCS

### What changes were proposed in this pull request? In the PR, I propose to migrate all parsing errors onto temporary error classes with the prefix `_LEGACY_ERROR_TEMP_`. The error message...

SQL
DOCS
CORE

### What changes were proposed in this pull request? This PR is a follow-up to #32364, which was closed by github-actions because it hadn't been updated in a...

SQL
ML
MLLIB
STRUCTURED STREAMING
KUBERNETES
WEB UI
BUILD
YARN
DOCS
CORE
INFRA
PYTHON
R
DSTREAM
AVRO
PANDAS API ON SPARK

### What changes were proposed in this pull request? Spark 3.4 supports Python 3.7+, but the Python-related UTs only check that a Python executable exists, not which version it is. So...

SQL
BUILD
YARN
CORE
PYTHON
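
The gap described in the entry above (checking that an interpreter exists without checking its version) can be illustrated with a short sketch. This is not the code from the pull request. The function names `python_version` and `is_supported` are hypothetical, and the 3.7 minimum is taken from the description.

```python
import subprocess
import sys

MIN_VERSION = (3, 7)  # minimum stated in the PR description


def python_version(executable):
    """Return (major, minor) for the given interpreter, or None if it cannot be run."""
    try:
        out = subprocess.run(
            [executable, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        ).stdout.strip()
        major, minor = out.split(".")
        return int(major), int(minor)
    except (OSError, ValueError, subprocess.SubprocessError):
        # Missing executable, non-zero exit, timeout, or unparsable output.
        return None


def is_supported(executable=sys.executable, minimum=MIN_VERSION):
    """True only if the interpreter runs AND reports at least `minimum`."""
    version = python_version(executable)
    return version is not None and version >= minimum
```

A test guard built this way skips when the interpreter is missing and also when it is too old, whereas an existence-only check would let the second case through.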

### What changes were proposed in this pull request? This PR adds the test suites for #37893, applyInPandasWithState. The new test suite mostly ports E2E test cases from existing [flatMapGroupsWithState](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala)....

SQL
STRUCTURED STREAMING
BUILD
CORE
PYTHON

### What changes were proposed in this pull request? This is a draft change of the current state of the Spark Connect prototype implemented as a driver plugin to separate...

SQL
BUILD
DOCS
CORE
INFRA
PYTHON
CONNECT

### What changes were proposed in this pull request? This PR proposes to introduce the new API `applyInPandasWithState` in PySpark, which provides the functionality to perform arbitrary stateful processing in...

SQL
STRUCTURED STREAMING
CORE
PYTHON