iceberg
ClassCastException with spark.sql.datetime.java8API.enabled=true
- Spark Version: 3.2.0
- Iceberg Version: 0.13.1
When spark.sql.datetime.java8API.enabled=true is set, running rewrite manifests on a date-partitioned table throws the following exception:
Job aborted due to stage failure: Task 0 in stage 36333.0 failed 5 times, most recent failure: Lost task 0.4 in stage 36333.0 (TID 140410) (ip-123.us-west-2.compute.internal executor 77): java.lang.ClassCastException: java.time.LocalDate cannot be cast to java.sql.Date
at org.apache.iceberg.spark.SparkValueConverter.convert(SparkValueConverter.java:77)
at org.apache.iceberg.spark.SparkStructLike.get(SparkStructLike.java:48)
at org.apache.iceberg.PartitionSummary.updateFields(PartitionSummary.java:59)
at org.apache.iceberg.PartitionSummary.update(PartitionSummary.java:51)
at org.apache.iceberg.ManifestWriter.addEntry(ManifestWriter.java:87)
at org.apache.iceberg.ManifestWriter.existing(ManifestWriter.java:135)
at org.apache.iceberg.spark.actions.BaseRewriteManifestsSparkAction.writeManifest(BaseRewriteManifestsSparkAction.java:332)
at org.apache.iceberg.spark.actions.BaseRewriteManifestsSparkAction.lambda$toManifests$afb7bc39$1(BaseRewriteManifestsSparkAction.java:354)
at org.apache.spark.sql.Dataset.$anonfun$mapPartitions$1(Dataset.scala:2867)
at org.apache.spark.sql.execution.MapPartitionsExec.$anonfun$doExecute$3(objects.scala:201)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:133)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1474)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
This is caused by the unconditional cast to java.sql.Date here: https://github.com/apache/iceberg/blob/0.13.x/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkValueConverter.java#L77
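For illustration, here is a minimal sketch of a conversion that tolerates both representations. With the Java 8 API flag enabled, Spark hands back java.time.LocalDate for date columns instead of java.sql.Date, so a converter has to check the runtime type before casting. The class and method names below (DateConvertSketch, toEpochDays) are hypothetical and are not taken from Iceberg or from the eventual fix; the sketch only assumes that the target is a days-since-epoch integer.

```java
import java.sql.Date;
import java.time.LocalDate;

// Hypothetical helper for illustration only; not Iceberg's actual code.
final class DateConvertSketch {

    // Converts a date value to days since 1970-01-01, accepting either
    // java.time.LocalDate (java8API.enabled=true) or java.sql.Date (default).
    static int toEpochDays(Object value) {
        if (value instanceof LocalDate) {
            return Math.toIntExact(((LocalDate) value).toEpochDay());
        }
        if (value instanceof Date) {
            // java.sql.Date.toLocalDate() is available since Java 8.
            return Math.toIntExact(((Date) value).toLocalDate().toEpochDay());
        }
        throw new IllegalArgumentException(
            "Unsupported date type: " + (value == null ? "null" : value.getClass().getName()));
    }

    public static void main(String[] args) {
        // Both calls print the same day count for the same calendar date.
        System.out.println(toEpochDays(LocalDate.of(2022, 3, 1)));
        System.out.println(toEpochDays(Date.valueOf("2022-03-01")));
    }
}
```

Checking the type first means the same code path works regardless of how spark.sql.datetime.java8API.enabled is set, rather than assuming one representation.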
+1, I'm hitting the same thing.
+1, facing a similar issue when trying to rewrite manifests.
Caused by: java.lang.ClassCastException: class java.time.LocalDate cannot be cast to class java.sql.Date (java.time.LocalDate is in module java.base of loader 'bootstrap'; java.sql.Date is in module java.sql of loader 'platform')
Added a PR for the fix: https://github.com/apache/iceberg/pull/5860
This issue has been resolved by #5860.