
Spark-related issue on HP-UX ia64

Open MBaesken opened this issue 5 years ago • 3 comments

When running the renaissance benchmark suite on HP-UX (ia64) with a JDK 8, we run into the following error. It might be related to this Spark fix: https://github.com/apache/spark/commit/e1f6845391078726f60e760f0ea68ccf81f9eca9#diff-c7483c7efce631c783676f014ba2b0ed (the benchmark appears to bundle an older Spark version that does not check whether the platform supports unaligned memory accesses).
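
For context, the linked fix makes Spark probe the platform for unaligned-access support instead of assuming it. Below is a minimal sketch of such a probe on JDK 8 (paraphrased; the class name and scaffolding are illustrative, not the exact upstream code). It reflectively calls the package-private java.nio.Bits.unaligned(), which reports whether the CPU tolerates misaligned loads, and falls back to an architecture whitelist:

    import java.lang.reflect.Method;

    // Sketch of an unaligned-access probe in the spirit of the check that
    // newer Spark versions perform in org.apache.spark.unsafe.Platform
    // (paraphrased, not verbatim upstream code).
    public class UnalignedProbe {
        public static boolean unalignedSupported() {
            try {
                Class<?> bits = Class.forName("java.nio.Bits", false,
                        ClassLoader.getSystemClassLoader());
                Method m = bits.getDeclaredMethod("unaligned");
                m.setAccessible(true);
                return Boolean.TRUE.equals(m.invoke(null));
            } catch (Throwable t) {
                // If reflection fails, fall back to a whitelist of
                // architectures known to tolerate misaligned loads.
                String arch = System.getProperty("os.arch", "");
                return arch.matches("^(i[3-6]86|x86(_64)?|x64|amd64|aarch64)$");
            }
        }

        public static void main(String[] args) {
            System.out.println("unaligned access supported: " + unalignedSupported());
        }
    }

On HP-UX/ia64 such a probe would report false, which is exactly the check the Spark version bundled with the benchmark is missing.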

Error output:

8/bin/java  -jar renaissance-mit-0.9.0.jar all
    ….
 
19/06/11 11:28:08 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.66.65.205, 55276)
====== log-regression (apache-spark), iteration 0 started ======
19/06/11 11:28:16 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at org.apache.spark.unsafe.Platform.getDouble(Platform.java:120)
        at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.getDouble(UnsafeArrayData.java:218)
        at org.apache.spark.sql.catalyst.util.ArrayData.toDoubleArray(ArrayData.scala:103)
        at org.apache.spark.ml.linalg.VectorUDT.deserialize(VectorUDT.scala:74)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:919)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:668)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:836)
19/06/11 11:28:16 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at org.apache.spark.unsafe.Platform.getInt(Platform.java:72)
        at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.getElementOffset(UnsafeArrayData.java:67)
        at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.getDouble(UnsafeArrayData.java:216)
        at org.apache.spark.sql.catalyst.util.ArrayData.toDoubleArray(ArrayData.scala:103)
        at org.apache.spark.ml.linalg.VectorUDT.deserialize(VectorUDT.scala:74)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:919)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:668)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:836)
19/06/11 11:28:16 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at org.apache.spark.unsafe.Platform.getDouble(Platform.java:120)
        at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.getDouble(UnsafeArrayData.java:218)
        at org.apache.spark.sql.catalyst.util.ArrayData.toDoubleArray(ArrayData.scala:103)
        at org.apache.spark.ml.linalg.VectorUDT.deserialize(VectorUDT.scala:74)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:919)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:668)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:836)
19/06/11 11:28:16 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-1,5,main]
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
        at org.apache.spark.unsafe.Platform.getInt(Platform.java:72)
        at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.getElementOffset(UnsafeArrayData.java:67)
        at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.getDouble(UnsafeArrayData.java:216)
        at org.apache.spark.sql.catalyst.util.ArrayData.toDoubleArray(ArrayData.scala:103)
        at org.apache.spark.ml.linalg.VectorUDT.deserialize(VectorUDT.scala:74)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:214)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:919)
        at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
        at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
        at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:668)
        at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
        at org.apache.spark.scheduler.Task.run(Task.scala:85)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:836)
19/06/11 11:28:16 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
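
The java.lang.InternalError above is how HotSpot reports a hardware fault raised inside a compiled Unsafe access: Platform.getDouble issues a raw 8-byte load at an offset that is not guaranteed to be 8-byte aligned, and ia64, unlike x86, traps on misaligned loads. For illustration only (this is a generic alignment-safe technique, not the code path Spark itself takes), a double can be read from an arbitrary byte offset by assembling it from single bytes:

    // Alignment-safe read of a double at an arbitrary offset in a byte
    // array: eight one-byte loads instead of one (possibly misaligned)
    // eight-byte load. Byte order here is little-endian and must match
    // whatever order the value was written in.
    public class AlignedRead {
        static double getDoubleSafe(byte[] buf, int offset) {
            long bits = 0L;
            for (int i = 0; i < 8; i++) {
                bits |= (buf[offset + i] & 0xFFL) << (8 * i);
            }
            return Double.longBitsToDouble(bits);
        }

        public static void main(String[] args) {
            byte[] buf = new byte[16];
            long bits = Double.doubleToLongBits(Math.PI);
            for (int i = 0; i < 8; i++) {      // write at the odd offset 3
                buf[3 + i] = (byte) (bits >>> (8 * i));
            }
            System.out.println(getDoubleSafe(buf, 3)); // 3.141592653589793
        }
    }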

MBaesken · Jun 13 '19

Hi Matthias & SAPMachine team,

Thanks a lot for the bug report! I am afraid there is no simple workaround for this issue. The best fix would be to upgrade Spark to a version that works on ia64. I am opening issue #152 to track our effort to upgrade all Spark benchmarks.

farquet · Jun 13 '19

Thanks for reporting this!

Indeed, we should upgrade the Spark benchmarks.

Our current plan is to create a new subproject that depends on a newer Spark version and to port the existing Spark benchmarks there. The current ones, which run on the old Spark version, would be kept for archival purposes (but would no longer be part of the official release). Since some of the archived benchmarks might not be compatible with certain platforms (as is the case here), I think it might make sense to add platform-compatibility information to benchmarks (analogous to the existing JDK-compatibility information) as part of the API/SPI redesign (https://github.com/renaissance-benchmarks/renaissance/issues/81); one possible shape is sketched below. cc @lbulej
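
As a purely hypothetical illustration of such metadata (the annotation and class names below are not part of the actual Renaissance SPI), platform compatibility could be a declarative annotation that the harness checks against os.arch before running a benchmark:

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;
    import java.util.Arrays;

    // Hypothetical platform-compatibility annotation, mirroring the idea
    // of the existing JDK-compatibility metadata.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface SupportedArchitectures {
        String[] value(); // os.arch values the benchmark is known to run on
    }

    // ia64 omitted: the bundled Spark 2.x faults there.
    @SupportedArchitectures({ "amd64", "aarch64" })
    class LogRegressionBenchmark {
        // ... benchmark body elided ...
    }

    class Harness {
        // A benchmark with no declaration is assumed portable; otherwise it
        // only runs when the current architecture is listed.
        static boolean compatible(Class<?> benchmark) {
            SupportedArchitectures s =
                    benchmark.getAnnotation(SupportedArchitectures.class);
            return s == null
                    || Arrays.asList(s.value())
                             .contains(System.getProperty("os.arch"));
        }
    }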

axel22 · Jun 13 '19

The master branch now uses Spark 3.0.1 and it seems that we should be able to move easily to Spark 3.1.1 (see #247).

lbulej · Apr 30 '21