java.lang.ClassNotFoundException: Failed to find data source: tfrecord. Please find packages at http://spark.apache.org/third-party-projects.html
Hello authors, I get a confusing error when I try to run: df.write.format("tfrecord").save("hdfs://***/a")
java.lang.ClassNotFoundException: Failed to find data source: tfrecord. Please find packages at http://spark.apache.org/third-party-projects.html
What confuses me is that the error goes away when I remove the spark-mllib dependency. Is there a conflict between spark-mllib and spark-tfrecord? I would appreciate any help with this problem. Thanks again!
My dependency settings:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_${scala.binary.version}</artifactId>
<version>2.4.3</version>
</dependency>
My spark-tfrecord dependency:
I am not aware of any conflict with spark-mllib. Your error suggests that Spark was not able to find the spark-tfrecord jar file.
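One way to check whether the jar is actually visible at runtime is to list the data sources registered on the classpath. Spark resolves short names such as "tfrecord" through Java's ServiceLoader mechanism, using the service file META-INF/services/org.apache.spark.sql.sources.DataSourceRegister. The sketch below is only a diagnostic aid, not part of spark-tfrecord: the class name DataSourceRegistryCheck and its helper methods are hypothetical, and it assumes that spark-tfrecord registers itself through that service file under a class name containing "tfrecord".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper; not part of Spark or spark-tfrecord.
public class DataSourceRegistryCheck {
    // Service file that Spark's DataSourceRegister lookup reads via ServiceLoader.
    public static final String SERVICE_FILE =
        "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    // Drops a trailing '#' comment and surrounding whitespace from a service-file line.
    public static String stripComment(String line) {
        int hash = line.indexOf('#');
        return (hash >= 0 ? line.substring(0, hash) : line).trim();
    }

    // Collects every provider class named in any copy of the service file
    // that the given class loader can see, together with the file it came from.
    public static List<String> registeredProviders(ClassLoader loader) {
        List<String> providers = new ArrayList<>();
        try {
            Enumeration<URL> files = loader.getResources(SERVICE_FILE);
            while (files.hasMoreElements()) {
                URL file = files.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(file.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String name = stripComment(line);
                        if (!name.isEmpty()) {
                            providers.add(name + "  (from " + file + ")");
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    public static void main(String[] args) {
        List<String> providers =
            registeredProviders(Thread.currentThread().getContextClassLoader());
        providers.forEach(System.out::println);
        // Assumption: the spark-tfrecord provider class name contains "tfrecord".
        boolean found = providers.stream()
            .anyMatch(p -> p.toLowerCase().contains("tfrecord"));
        System.out.println(found
            ? "A tfrecord provider is registered on this classpath."
            : "No tfrecord provider found; the jar is probably missing from the classpath.");
    }
}
```

Run it with the same classpath as the failing job. If no tfrecord entry is printed, the jar is not reaching the runtime classpath, which would match the ClassNotFoundException above.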
Hi @Ethanhack, please use the Spark version that matches your spark-tfrecord version, as shown in README.md. You may want to use Spark 2.4; we don't support Spark 2.2.
Version 0.1.x targets Spark 2.3 and Scala 2.11
Version 0.2.x targets Spark 2.4 and both Scala 2.11 and 2.12
Version 0.3.x targets Spark 3.0 and Scala 2.12
Version 0.4.x targets Spark 3.2 and Scala 2.12
Version 0.5.x targets Spark 3.2 and Scala 2.13
Version 0.6.x targets Spark 3.4 and both Scala 2.12 and 2.13
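The list above can be written as a small lookup, which makes it easy to check a build's Spark and Scala versions before picking a spark-tfrecord release line. This is a minimal sketch: the class TfRecordCompat and its methods are hypothetical names, and it assumes that only the exact Spark major.minor versions listed above count as supported.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper that encodes the compatibility list from this thread.
public class TfRecordCompat {
    // Each row: {release line, Spark major.minor, comma-separated Scala binary versions}.
    private static final String[][] TABLE = {
        {"0.1.x", "2.3", "2.11"},
        {"0.2.x", "2.4", "2.11,2.12"},
        {"0.3.x", "3.0", "2.12"},
        {"0.4.x", "3.2", "2.12"},
        {"0.5.x", "3.2", "2.13"},
        {"0.6.x", "3.4", "2.12,2.13"},
    };

    // Reduces a full version such as "2.4.3" to its major.minor part, "2.4".
    public static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected major.minor[.patch], got: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    // Returns the release line for a Spark version and Scala binary version,
    // or empty if the combination does not appear in the list.
    public static Optional<String> releaseLine(String sparkVersion, String scalaBinaryVersion) {
        String spark = majorMinor(sparkVersion);
        for (String[] row : TABLE) {
            List<String> scalaVersions = Arrays.asList(row[2].split(","));
            if (row[1].equals(spark) && scalaVersions.contains(scalaBinaryVersion)) {
                return Optional.of(row[0]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(releaseLine("2.4.3", "2.11").orElse("unsupported")); // prints 0.2.x
        System.out.println(releaseLine("2.2.0", "2.11").orElse("unsupported")); // prints unsupported
    }
}
```

For the setup in this issue (Spark 2.4.3), the lookup points to the 0.2.x line, while Spark 2.2 has no matching release.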