spark issues

[SPARK-40086][SQL] Improve AliasAwareOutputPartitioning to take all aliases into account

2

### What changes were proposed in this pull request? Currently `AliasAwareOutputPartitioning` takes only the last alias of each aliased expression into account. We could avoid more shuffles with better alias handling....

peter-toth

SQL

[SPARK-39989][SQL][FollowUp] Improve foldable expression stats estimate for string and binary

1

### What changes were proposed in this pull request? This PR improves the foldable expression statistics estimation by providing more accurate min, max, and data length for string and binary...

linhongliu-db

SQL

[SPARK-40306][SQL] Support more than Integer.MAX_VALUE of the same join key

2

### What changes were proposed in this pull request? Support more than Integer.MAX_VALUE of the same join key. ### Why are the changes needed? For SMJ, the number of the...

wankunde

SQL

CORE

PYTHON

[WIP] investigate the root cause for SPARK-40165

1

### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested?

panbingkun

SQL

BUILD

INFRA

[SPARK-40248][SQL] Use larger number of bits to build Bloom filter

2

### What changes were proposed in this pull request? This PR makes Bloom filter join use a larger number of bits to build the Bloom filter if the row count is available. ###...

wangyum

SQL

[SPARK-40288][SQL] After `RemoveRedundantAggregates`, `PullOutGroupingExpressions` should be applied to avoid missing attributes when using complex expressions

1

### What changes were proposed in this pull request? After the RemoveRedundantAggregates rule, we should pull the complex group by expression out. ### Why are the changes needed? This will fix...

hgs19921112

SQL

Add Spark SQL write MySQL support for update, the design from replace into…

2

The modification points are: Spark SQL writing to MySQL supports update. Background and purpose: In the current big data scenario, when writing to the MySQL relational database, the redundancy of data...

Datawaiter

SQL

fix the question of SparkSQL call iceberg's expire_snapshots procedur…

3

…es blocking in local model ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How...

StephenQQ

CORE

[SPARK-40314][SQL][PYTHON] Add scala and python bindings for inline and inline_outer

1

### What changes were proposed in this pull request? Adds Scala and Python bindings for SQL functions inline and inline_outer ### Why are the changes needed? Currently these functions can...

Kimahriman

SQL

CORE

PYTHON

[SPARK-40259][SQL] Support Parquet DSv2 in subquery plan merge

1

### What changes were proposed in this pull request? After https://github.com/apache/spark/pull/32298 we were able to merge scalar subquery plans, but DSv2 sources couldn't benefit from that improvement. The reason for...

peter-toth

SQL
