upgrade to spark 3.2.0 and scala 2.13.7

Open · kwark opened this issue 3 years ago

  • upgraded to Spark 3.2.0 and Scala 2.13.7
  • upgraded to sbt 1.5.5 (needed because sbt-scoverage for Scala 2.13.x requires it)
  • introduced build.sbt (a rough sketch of the resulting build follows this list)
  • removed scalariform (no longer supported, EOL)
  • removed the scalamock dependency
  • upgraded other dependencies
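
For context, a minimal sketch of what the upgraded build.sbt could look like under these changes (not the PR's actual file; the Spark, Scala, and sbt versions come from the list above, everything else here is an assumption):

```scala
// build.sbt -- minimal sketch of the upgraded build described in this PR.
// project/build.properties would pin: sbt.version=1.5.5
scalaVersion := "2.13.7"

val sparkVersion = "3.2.0"

libraryDependencies ++= Seq(
  // Spark itself is typically marked Provided for a connector build.
  "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql"  % sparkVersion % Provided
)
```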

kwark avatar Nov 09 '21 08:11 kwark

👍

dohr-michael avatar Feb 02 '22 16:02 dohr-michael

Spark 3.2.0 support is available via the 10.0.x version of the connector.

rozza avatar Aug 18 '22 10:08 rozza

Hello @rozza,

Your library bundles Spark libraries built for Scala 2.12. Because of that, if we want to use it in a Scala 2.13 project we have to exclude all of its Spark dependencies and add the collection migration library provided by Scala (if we don't, we get a MethodNotFound exception on scala Seq); a minimal sbt sketch of this workaround follows the link below. As it stands, your library is not directly compatible with Scala 2.13.

https://docs.scala-lang.org/overviews/core/collections-migration-213.html
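
Roughly, the workaround looks like this (a minimal sketch, assuming sbt; the connector artifact/version and the excluded Spark artifacts are illustrative and may need adjusting for your build):

```scala
// build.sbt -- sketch of consuming the connector from a Scala 2.13 project.
scalaVersion := "2.13.7"

libraryDependencies ++= Seq(
  // Pull in the connector but drop its Scala 2.12 Spark artifacts.
  ("org.mongodb.spark" % "mongo-spark-connector_2.12" % "10.0.3")
    .exclude("org.apache.spark", "spark-sql_2.12")
    .exclude("org.apache.spark", "spark-core_2.12"),
  // Use the Scala 2.13 build of Spark instead.
  "org.apache.spark" %% "spark-sql" % "3.2.0" % Provided,
  // The collection migration shims from the guide linked above.
  "org.scala-lang.modules" %% "scala-collection-compat" % "2.8.1"
)
```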

dohr-michael avatar Aug 18 '22 12:08 dohr-michael

Hi @dohr-michael,

The 10.0.x version of the connector is written purely in Java, so it was intended to work without issue across Scala versions.

However, I see there was an issue using Seq. I've fixed it in SPARK-361, which will be released in 10.0.4. I have also added SPARK-360 to ensure tests are run against a Scala 2.13 build of Spark.

rozza avatar Aug 18 '22 13:08 rozza

You are using some Scala features to bridge with Spark (e.g. https://github.com/mongodb/mongo-spark/blob/662b2990c6d179de8f93365d9107ae9b6fc9015a/src/main/java/com/mongodb/spark/sql/connector/schema/RowToBsonDocumentConverter.java#L60), and that functionality is deprecated in Scala 2.13 (to prepare the migration to Scala 3). So the library is not purely Java: the code is written in Java, but it relies on some Scala functionality, and you will hit issues with the next Scala upgrade (Scala 3). A sketch of the kind of deprecation I mean is below.
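
For illustration only (this is not the connector's actual code, and I'm assuming the deprecation in question is the scala.collection.JavaConverters family), the 2.13 split looks like this on the Scala side:

```scala
// Minimal sketch: the pre-2.13 converters vs. their scala.jdk replacements.
import java.util.{Arrays => JArrays, List => JList}

object ConvertersSketch {
  // Deprecated since Scala 2.13: scala.collection.JavaConverters.
  def viaDeprecatedApi(values: JList[String]): Seq[String] = {
    import scala.collection.JavaConverters._
    values.asScala.toSeq
  }

  // The Scala 2.13+ replacement lives under scala.jdk.
  def viaNewApi(values: JList[String]): Seq[String] = {
    import scala.jdk.CollectionConverters._
    values.asScala.toSeq
  }

  def main(args: Array[String]): Unit = {
    val input = JArrays.asList("a", "b", "c")
    println(viaDeprecatedApi(input)) // List(a, b, c), with a deprecation warning at compile time
    println(viaNewApi(input))        // List(a, b, c)
  }
}
```

Java callers have the scala.jdk.javaapi.CollectionConverters equivalents on 2.13, which is part of why mixing Scala versions from a Java code path gets awkward.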

dohr-michael avatar Aug 18 '22 13:08 dohr-michael

Thanks @dohr-michael,

Unfortunately, the Spark DataSource V2 API being designed around Java interfaces was meant to make writing connectors simpler for libraries such as ours. However, Scala not being binary compatible across minor versions is a pain. I will have to add some bridging code as part of the SPARK-360 work.

I don't know if there is a roadmap for Spark to be compiled with Scala 3.

rozza avatar Aug 18 '22 14:08 rozza

I don't know either, but I've seen some experiments using the Scala 2.13 build of Spark 3.2.0 directly from Scala 3 (thanks to the 2.13 / 3.0 compatibility: https://docs.scala-lang.org/scala3/guides/migration/compatibility-intro.html).

We haven't tested Scala 3 with our Spark projects yet, but we are looking to migrate our applications to Scala 3 (Play and ZIO applications).

dohr-michael avatar Aug 18 '22 14:08 dohr-michael