spark-dbf
IncompatibleClassChangeError when executing val trips = sqlContext.dbfFile("trips1M.dbf")
Hello
I was able to build spark-dbf with Maven successfully after installing Shapefile, and I downloaded trips1M.dbf.
The Scala API worked fine until I executed the following:
val trips = sqlContext.dbfFile("trips1M.dbf")
Then I got the following error:
java.lang.IncompatibleClassChangeError: class com.esri.spark.dbf.DBFRelation has interface org.apache.spark.sql.sources.PrunedScan as super class
Could you assist me in resolving this issue? I tried recompiling and still get the same error. Alternatively, is there a way to convert a list of DBF files into CSV files?
Thanks
Configuration
I am using Hortonworks HDP 2.3 with Spark 1.5.1
Single-node (VMWare ESXi-based) OS: CentOS 6.5
CPU: 4 vCPU
Mem: 12GB
3 HDs: 40GB, 40GB, 35GB (these hold the HDFS storage)
Sorry for the delay - this is a bigger "issue", as everything has to move from SchemaRDD to DataFrame.
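For context on the error reported above: an IncompatibleClassChangeError of the form "has interface ... as super class" generally means a class was compiled against a version of a library where the parent was a class, but at runtime the parent is an interface. To my understanding, PrunedScan was an abstract class in Spark 1.2 and became a trait in Spark 1.3 and later, which would match a build against an older Spark running on 1.5.1. The following is only a sketch of the shape a relation takes under the newer API, assuming spark-sql 1.3+ is on the classpath. ExampleRelation and its columns parameter are hypothetical names, not the actual spark-dbf source, and the scan body is a placeholder.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, PrunedScan}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical relation for illustration only (not the spark-dbf code).
// BaseRelation is the superclass; PrunedScan is mixed in as a trait.
class ExampleRelation(columns: Seq[String])(@transient val sqlContext: SQLContext)
  extends BaseRelation with PrunedScan {

  // Assumption: every column is exposed as a nullable string.
  override def schema: StructType =
    StructType(columns.map(name => StructField(name, StringType, nullable = true)))

  // Placeholder: a real implementation would read only the requested columns.
  override def buildScan(requiredColumns: Array[String]): RDD[Row] =
    sqlContext.sparkContext.emptyRDD[Row]
}
```

If this reading is right, recompiling only helps when the Spark version declared in the build matches the cluster's version and the source uses the trait-based form.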
Did you manage to solve this issue?
Sorry - not yet - it is on the todo list.
By the way, have you seen http://www.gdal.org/ogr2ogr.html?
Okay. But this does not really help me, as my DBF files are in HDFS. I want to read them from there.
What I'm proposing is that you convert the DBF to CSV and place it in HDFS.
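As an alternative to ogr2ogr for the conversion step, here is a minimal sketch of a DBF-to-CSV converter in plain Scala with no Spark dependency. It assumes a dBASE III style layout (32-byte header, 32-byte field descriptors ended by 0x0D, fixed-width records with a leading deletion flag) and Latin-1 text. The DbfToCsv object and its methods are hypothetical names for illustration. It writes every value as trimmed text and does not handle memo fields or type-specific formatting.

```scala
import java.io.{DataInputStream, InputStream, Writer}
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.charset.StandardCharsets
import scala.collection.mutable.ArrayBuffer

object DbfToCsv {
  // One column as described in the DBF header.
  final case class Field(name: String, length: Int)

  // Quote a CSV value only when it contains a comma, quote, or line break.
  def csvEscape(value: String): String =
    if (value.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + value.replace("\"", "\"\"") + "\""
    else value

  // Stream a DBF file to CSV; returns the number of rows written.
  def convert(in: InputStream, out: Writer): Int = {
    val data = new DataInputStream(in)
    val header = new Array[Byte](32)
    data.readFully(header)
    val le = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN)
    val recordCount = le.getInt(4)             // number of records
    val headerLength = le.getShort(8) & 0xFFFF // bytes before the first record
    val recordLength = le.getShort(10) & 0xFFFF

    // Field descriptors are 32 bytes each; a 0x0D byte ends the list.
    val fields = ArrayBuffer.empty[Field]
    var consumed = 32
    var first = data.readUnsignedByte(); consumed += 1
    while (first != 0x0D) {
      val rest = new Array[Byte](31)
      data.readFully(rest); consumed += 31
      val nameBytes = (first.toByte +: rest.take(10)).takeWhile(_ != 0)
      val name = new String(nameBytes, StandardCharsets.US_ASCII).trim
      fields += Field(name, rest(15) & 0xFF)   // descriptor byte 16 is the field length
      first = data.readUnsignedByte(); consumed += 1
    }
    // Some variants pad the header after the terminator; skip to the records.
    val padding = headerLength - consumed
    if (padding > 0) data.readFully(new Array[Byte](padding))

    out.write(fields.map(f => csvEscape(f.name)).mkString(",") + "\n")
    val record = new Array[Byte](recordLength)
    var written = 0
    var i = 0
    while (i < recordCount) {
      data.readFully(record)                   // throws EOFException on a truncated file
      if (record(0) != '*'.toByte) {           // '*' marks a deleted record
        var offset = 1
        val values = fields.map { f =>
          val v = new String(record, offset, f.length, StandardCharsets.ISO_8859_1).trim
          offset += f.length
          csvEscape(v)
        }
        out.write(values.mkString(",") + "\n")
        written += 1
      }
      i += 1
    }
    out.flush()
    written
  }
}
```

Because convert takes any InputStream, it could in principle read from a stream opened through the Hadoop FileSystem API instead of a local file. The resulting CSV can then be placed in HDFS, for example with hdfs dfs -put.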
Okay. We were doing the same thing before, but I thought of reading it directly and came across your library.
I perfectly understand - just a matter of time - the source code is open - you can fork it and adjust it.
Okay sure.
Hello. When I use your code to read a DBF file, 6 blank lines appear. Do you see the same?
DBF file size: 2 GB
Storage path: HDFS
Spark version: 1.5.2
When I use a third-party JAR to read the file instead, it reports an error: Exception: Unexpected end of file
Can you take a look at it for me? Thank you.