Add documentation on building with Spark 4

Open andygrove opened this issue 1 year ago • 4 comments

What is the problem the feature request solves?

I can run individual test suites against Spark 3.4 with the following command:

./mvnw test -DwildcardSuites="org.apache.comet.CometExpressionSuite" -P"spark-3.4"

If I change the Spark version to 4.0 then I get compilation errors:

[ERROR] /Users/andy/git/apache/datafusion-comet/common/src/main/java/org/apache/comet/parquet/AbstractColumnReader.java:26:33:  error: cannot access DataType
[ERROR] /Users/andy/git/apache/datafusion-comet/common/src/main/java/org/apache/comet/vector/CometVector.java:34:33:  error: cannot access Decimal
[ERROR] /Users/andy/git/apache/datafusion-comet/common/src/main/java/org/apache/comet/vector/CometVector.java:36:38:  error: cannot access ColumnVector
...

I would like to see some documentation in the installation guide and development guide on how to build and test locally with Spark 4.

Describe the potential solution

No response

Additional context

No response

andygrove, Jun 04 '24 14:06

Hi @kazuyukitanimura. Would you mind handling this as part of your ongoing Spark 4 work?

andygrove, Jun 04 '24 14:06

Yes, I am on it. Thank you, @andygrove.

kazuyukitanimura, Jun 04 '24 16:06

Spark 4's JDK 17 requirement could be one of the reasons for the compilation errors.

kazuyukitanimura, Jun 04 '24 19:06

Switching to JDK 17 resolved my issue. I will leave this issue open until we have documentation for building with Spark 4.

andygrove, Jun 04 '24 20:06

This can be closed now.

andygrove, Apr 10 '25 16:04
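
For anyone landing here with the same "cannot access DataType" errors, the fix discussed in this thread (switching to JDK 17) can be checked before starting a build. The following is only a rough sketch, not the project's documented procedure: the helper names `java_major` and `jdk_ok_for_spark4` are made up for illustration, and the profile name `spark-4.0` is an assumption based on the `spark-3.4` profile used in the original command.

```shell
#!/usr/bin/env bash
# Sketch: confirm the active JDK is 17+ before testing against Spark 4.
# java_major and jdk_ok_for_spark4 are hypothetical helpers, not part of Comet.

# Print the major version from a Java version string,
# e.g. "17.0.9" -> 17, "1.8.0_392" -> 8.
java_major() {
  v="${1#1.}"            # drop the legacy "1." prefix used up to Java 8
  echo "${v%%[!0-9]*}"   # keep only the leading run of digits
}

# Succeed only when the given version string is JDK 17 or newer.
jdk_ok_for_spark4() {
  major="$(java_major "$1")"
  [ -n "$major" ] && [ "$major" -ge 17 ]
}

# Version of the `java` on PATH; `java -version` prints to stderr.
current="$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"

if jdk_ok_for_spark4 "$current"; then
  echo "JDK $current detected; run:"
  # Assumption: the Spark 4 profile is named like the spark-3.4 one.
  echo './mvnw test -DwildcardSuites="org.apache.comet.CometExpressionSuite" -P"spark-4.0"'
else
  echo "JDK 17+ required, found '${current:-none}'; point JAVA_HOME at a JDK 17 install first."
fi
```

The script only prints the Maven command instead of running it, so it is safe to try outside a Comet checkout. If `JAVA_HOME` points at an older JDK, the Maven wrapper may still pick that one up even when a newer `java` is on `PATH`, so it is worth checking both.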