gazelle_plugin issues

DPP is not working on DS V2 on Spark 321

1

**Describe the bug** DPP (dynamic partition pruning) is broken; here is an example on TPC-DS Q1: ![image](https://user-images.githubusercontent.com/43917/159878851-577ae07a-1e42-4dd9-8727-2b73cd004041.png) **To Reproduce** Run TPC-DS Q1. **Expected behavior** Fix the DPP feature. **Additional context** Add any other context...

zhouyuan

bug

Wip publish jar v2

1

## What changes were proposed in this pull request? Add developer, license, and plugin for publishing the jar. ## How was this patch tested? Passed locally.

weiting-chen

count null count bug

3

**Describe the bug** Repartitioning changes the null count:

- `dfw = spark.read.format("arrow").load("/ss_customer_sk.parquet")`
- `dfw.where("ss_customer_sk is null").count()` returns 129583501
- `dfw.repartition(144).where("ss_customer_sk is null").count()` returns 64804994

**To Reproduce** **Expected behavior** The same number as without repartition...

FelixYBW

bug
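The comparison in the report above can be wrapped in a small check. This is a minimal sketch and not part of the original report: `null_counts` and `null_count_matches` are hypothetical helper names, and the code assumes only that the object passed in offers the `where`, `repartition`, and `count` methods used in the report.

```python
def null_counts(df, column, num_partitions):
    """Return (count without repartition, count after repartition) of null rows.

    `df` is expected to behave like a Spark DataFrame; only where(),
    repartition() and count() are called, as in the report.
    """
    predicate = f"{column} is null"
    baseline = df.where(predicate).count()
    repartitioned = df.repartition(num_partitions).where(predicate).count()
    return baseline, repartitioned


def null_count_matches(df, column, num_partitions):
    """True when repartitioning leaves the null count unchanged (the expected behavior)."""
    baseline, repartitioned = null_counts(df, column, num_partitions)
    return baseline == repartitioned
```

With the DataFrame from the report, `null_count_matches(dfw, "ss_customer_sk", 144)` should return `True` once the bug is fixed; the two counts quoted in the report differ, so it would currently return `False`.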

Encounter "AWS Error [code 15]: No response body" when running tpcds q1 with gazelle and s3 storage on EC2

1

**Describe the bug** ![image](https://user-images.githubusercontent.com/19186785/167245894-fdbd986d-16c2-4578-9834-2a306fa52b5e.png) We are trying to use gazelle on an AWS EC2 instance with S3 storage. It encounters a java.lang.RuntimeException when getting information for key 'tpcdata_arrow/shared/data/tpcds/tpcds_parquet/1/date_dim/part-00000-3b22a139-e878-4a3f-a1d0-9ef798c34388-c000.snappy.parquet' in bucket 'cloudtik-****':...

yao531441

bug

get_physical_plan in gazelle_analysis raise "error: unbalanced parenthesis at position 1"

**Describe the bug** Calling get_physical_plan in gazelle_analysis raises "error: unbalanced parenthesis at position 1". **To Reproduce** Call the get_physical_plan function from gazelle_analysis. **Expected...

weiting-chen

bug
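The message in this report has the form Python's `re` module uses when a pattern contains a `)` with no matching `(`. Whether that is what happens inside `get_physical_plan` is an assumption; the report shows neither the pattern nor the input. The sketch below only demonstrates how such an error can arise and how `re.escape` avoids it. `find_operator` is a hypothetical helper, not a gazelle_analysis function.

```python
import re


def find_operator(plan_text, operator_name, escape=True):
    """Search plan_text for operator_name used as a regex pattern.

    With escape=False the name is compiled as-is, so metacharacters such
    as ")" are treated as regex syntax rather than literal text.
    """
    pattern = re.escape(operator_name) if escape else operator_name
    return re.search(pattern, plan_text)


plan = "ColumnarSort (55) +- ColumnarCustomShuffleReader (54)"

# Unescaped: the ")" has no matching "(", so compiling the pattern fails.
try:
    find_operator(plan, "5)", escape=False)
    raw_error = None
except re.error as exc:
    raw_error = str(exc)

# Escaped: the same text is matched literally.
match = find_operator(plan, "5)")
```

Escaping any text taken from a plan before using it as a pattern is the usual remedy when the text is meant to be matched literally.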

Add Spark version in artifact id for jar publish

**Describe the bug** The current framework cannot share the following 3 jar files: 1. spark-arrow-datasource-standard, 2. spark-columnar-core, 3. spark-sql-columnar-shims-common. My suggestion is to add the Spark version to the artifact id and create...

weiting-chen

bug

Shims to support multiple Spark versions

2

We need to support multiple Spark versions from a single code base. The ideal solution is to use Shims to separate version-specific code into a Shim layer specific...

jerrychenhf

bug
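The shim idea in this entry can be sketched as one interface with an implementation per Spark version, selected at run time. This is an illustration of the general pattern only, not the project's design: all class and function names are hypothetical, the version list is an assumption (only 3.2.1 appears elsewhere in this listing), and Python is used solely to match the snippet in the report above.

```python
from abc import ABC, abstractmethod


class SparkShims(ABC):
    """Version-specific behavior hidden behind one interface (hypothetical)."""

    @abstractmethod
    def describe(self) -> str:
        ...


class Spark311Shims(SparkShims):
    # Assumed version, chosen only to show a second implementation.
    def describe(self):
        return "shim for Spark 3.1.1"


class Spark321Shims(SparkShims):
    def describe(self):
        return "shim for Spark 3.2.1"


# Registry mapping a Spark version string to the shim class that handles it.
_SHIMS = {"3.1.1": Spark311Shims, "3.2.1": Spark321Shims}


def load_shims(spark_version):
    """Return the shim for spark_version, or raise if none is registered."""
    try:
        return _SHIMS[spark_version]()
    except KeyError:
        raise ValueError(f"no shim registered for Spark {spark_version}") from None
```

Shared code then calls `load_shims(version)` once and uses only the `SparkShims` interface, so version-specific details stay inside the individual shim classes.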

Exchange reuse is not working in Spark 3.2.1

1

**Describe the bug** In TPC-DS Q2, the columnar exchange is not reused.

```
== Physical Plan ==
AdaptiveSparkPlan (95)
+- == Final Plan ==
   ArrowColumnarToRow (56)
   +- ColumnarSort (55)
      +- ColumnarCustomShuffleReader (54)...
```

zhouyuan

bug

Installation failure with 1.3.0 version

**Describe the bug** [ERROR] [Error] : Symbol 'type org.apache.log4j.AppenderSkeleton' is missing from the classpath. This symbol is required by 'class org.apache.spark.SparkFunSuite.LogAppender'. Make sure that type AppenderSkeleton is in your classpath...

KueenLau

bug

Support to run with JDK11

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.** I have tried to use JDK11 to compile and run the project. However...

jackylee-ch

enhancement
