
java.lang.ClassNotFoundException: Failed to find data source: tfrecord. Please find packages at http://spark.apache.org/third-party-projects.html

Open Ethanhack opened this issue 2 years ago • 3 comments

Hello authors, I get an error when I try to run: df.write.format("tfrecord").save("hdfs://***/a")

java.lang.ClassNotFoundException: Failed to find data source: tfrecord. Please find packages at http://spark.apache.org/third-party-projects.html

What confuses me is that the error goes away when I remove the spark-mllib dependency. Is there a conflict between mllib and spark-tfrecord? I would appreciate any help with this problem. Thanks again!

My dependency settings:

    <java.version>1.8</java.version>
    <maven.compiler.source>${java.version}</maven.compiler.source>
    <maven.compiler.target>${java.version}</maven.compiler.target>
    UTF-8
    <scala.version>2.11.12</scala.version>
    <scala.binary.version>2.11</scala.binary.version>
    <spark.version>2.2.0</spark.version>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_${scala.binary.version}</artifactId>
        <version>2.4.3</version>
    </dependency>


Ethanhack, Jul 18 '23 08:07

My spark-tfrecord dependency:

    <dependency>
        <groupId>com.linkedin.sparktfrecord</groupId>
        <artifactId>spark-tfrecord_2.11</artifactId>
        <version>0.2.6</version>
    </dependency>

Ethanhack, Jul 18 '23 08:07

I am not aware of any conflict with spark-mllib. Your error seems to suggest spark was not able to find the spark-tfrecord jar file.
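
One way to test that is a small classpath probe run with the same classpath as the failing job. The sketch below is only illustrative: the class name `com.linkedin.spark.datasources.tfrecord.DefaultSource` is my assumption about where spark-tfrecord keeps its data source, and `ClasspathCheck` is a hypothetical helper, not part of either library. Spark resolves short names such as `tfrecord` through `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister` files, so the probe also lists every copy of that file it can see.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; uses only the JDK, so it runs with or without Spark.
final class ClasspathCheck {

    // Returns true if the named class can be loaded (without initializing it).
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Lists every classpath location that ships a service file for the given interface.
    static List<URL> serviceFiles(String serviceInterface) {
        try {
            ClassLoader loader = ClasspathCheck.class.getClassLoader();
            return Collections.list(loader.getResources("META-INF/services/" + serviceInterface));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed class name; adjust if the jar you use places it elsewhere.
        String dataSource = "com.linkedin.spark.datasources.tfrecord.DefaultSource";
        System.out.println("spark-tfrecord class found: " + isOnClasspath(dataSource));

        // Each URL is one jar (or directory) that registers data source short names.
        for (URL url : serviceFiles("org.apache.spark.sql.sources.DataSourceRegister")) {
            System.out.println("data source registration: " + url);
        }
    }
}
```

If the class is found but no registration file from the spark-tfrecord jar shows up, the jar is present and only the short-name lookup is failing. One possible cause, not confirmed in this thread, is a fat-jar build in which the service files of several jars overwrite one another.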

junshi15, Jul 22 '23 13:07

Hi @Ethanhack, please use the Spark version that matches your spark-tfrecord version, as listed in the README.md. You will want Spark 2.4; we don't support Spark 2.2.

Version 0.1.x targets Spark 2.3 and Scala 2.11
Version 0.2.x targets Spark 2.4 and both Scala 2.11 and 2.12
Version 0.3.x targets Spark 3.0 and Scala 2.12
Version 0.4.x targets Spark 3.2 and Scala 2.12
Version 0.5.x targets Spark 3.2 and Scala 2.13
Version 0.6.x targets Spark 3.4 and both Scala 2.12 and 2.13
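
The list above can be turned into a quick pre-flight check. This is a rough sketch, not something shipped with spark-tfrecord: `TfRecordCompat` and its methods are hypothetical names, the table is copied from the list, and treating "targets Spark 2.4" as "Spark minor version must equal 2.4" is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that encodes the version list above as a lookup table.
final class TfRecordCompat {

    // The Spark minor version and Scala binary versions one release series targets.
    static final class Target {
        final String spark;
        final List<String> scala;

        Target(String spark, String... scala) {
            this.spark = spark;
            this.scala = Arrays.asList(scala);
        }
    }

    static final Map<String, Target> TARGETS = new LinkedHashMap<>();
    static {
        TARGETS.put("0.1", new Target("2.3", "2.11"));
        TARGETS.put("0.2", new Target("2.4", "2.11", "2.12"));
        TARGETS.put("0.3", new Target("3.0", "2.12"));
        TARGETS.put("0.4", new Target("3.2", "2.12"));
        TARGETS.put("0.5", new Target("3.2", "2.13"));
        TARGETS.put("0.6", new Target("3.4", "2.12", "2.13"));
    }

    // Reduces a full version to major.minor, e.g. "0.2.6" becomes "0.2".
    static String series(String version) {
        String[] parts = version.split("\\.");
        return parts.length < 2 ? version : parts[0] + "." + parts[1];
    }

    // Assumption: a combination is supported only when the Spark minor version matches exactly.
    static boolean isSupported(String tfrecordVersion, String sparkVersion, String scalaBinary) {
        Target target = TARGETS.get(series(tfrecordVersion));
        return target != null
                && target.spark.equals(series(sparkVersion))
                && target.scala.contains(scalaBinary);
    }

    public static void main(String[] args) {
        // The versions from this issue: spark-tfrecord 0.2.6 on Scala 2.11.
        System.out.println(isSupported("0.2.6", "2.4.3", "2.11")); // prints true
        System.out.println(isSupported("0.2.6", "2.2.0", "2.11")); // prints false
    }
}
```

With the settings posted in this issue, the `spark.version` property (2.2.0) fails the check while the hard-coded 2.4.3 on the Spark dependency passes, so it is worth making those two agree.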

mizhou-in, Jul 24 '23 07:07