Dropping Spark 3.3 support

Open huaxingao opened this issue 1 year ago • 4 comments

Since we have already integrated Spark 3.5.1 and are experimenting with Spark 4.0.0, should we discontinue support for Spark 3.3 to ease the burden of maintaining compatibility across multiple Spark versions?

huaxingao avatar Jul 09 '24 19:07 huaxingao

What is the biggest burden of supporting Spark 3.3 right now?

viirya avatar Jul 09 '24 21:07 viirya

+1. To be accurate, though, only experimental support is provided for Spark 3.5.1 and 4.0.0 as of now.

kazuyukitanimura avatar Jul 10 '24 15:07 kazuyukitanimura

There are a couple of considerations here:

  1. What version of Spark users are likely to be on (and therefore likely to want to use Comet with)?
  2. What are the currently available versions of Spark?

For the first question, users could be on any of the earlier versions of Spark, like 3.2 or 3.3, and we have had support for these versions, but because we have not done a release users will have no way to take advantage of the work we did to support those versions. Even if we do not do a release, if we tag the source to identify the last point at which we had support for Spark 3.2 or Spark 3.3, users could potentially build their own Comet and try it with their current version of Spark. Once we have tagged the source, we can then drop the support.

For the second question, the currently available versions of Spark are 3.4.3 and 3.5.1 (https://spark.apache.org/downloads.html), so it seems we could safely drop support for 3.3 as soon as we are sure that 3.5 support is robust.

parthchandra avatar Jul 10 '24 16:07 parthchandra

> but because we have not done a release users will have no way to take advantage of the work we did to support those versions

This is a good point. I am hoping that we can release 0.1.0 very soon (perhaps as soon as next week), so maybe we can wait until then before removing 3.3 support.

andygrove avatar Jul 11 '24 14:07 andygrove

I would now like to suggest that we drop support for Spark 3.3 once we have released Comet 0.7.0.

The main motivation is that we need to add new CI checks for the new native scans, which will lead to longer CI times on PRs, and removing 3.3 from the matrix will help with that.

andygrove avatar Mar 06 '25 18:03 andygrove