
NPE on Spark execution, logged as a warning

Open ameet123 opened this issue 6 years ago • 1 comments

Hi, I intermittently get a NullPointerException while running a Spark job. The stack trace is:

18/03/08 11:16:40 WARN scheduler.TaskSetManager: Lost task 446.0 in stage 0.0 (TID 513, dwbdtest1r1w4.wellpoint.com, executor 15): java.lang.RuntimeException: java.lang.NullPointerException
        at com.cloudera.sa.copybook.mapreduce.CopybookRecordReader.initialize(CopybookRecordReader.java:88)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:182)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:179)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:134)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:108)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
        at net.sf.JRecord.External.CobolCopybookLoader.loadCopyBook(CobolCopybookLoader.java:142)
        at com.cloudera.sa.copybook.mapreduce.CopybookRecordReader.initialize(CopybookRecordReader.java:56)
        ... 18 more

The strange thing is that the job completes fine. Also, the line numbers in the stack trace do not seem to match.

Update: another error I see on the executors is:

java.lang.RuntimeException: The file "lexer.dat" is either missing or corrupted.
	at net.sf.cb2xml.sablecc.lexer.Lexer.<init>(Unknown Source)
	at net.sf.cb2xml.Cb2Xml.convert(Unknown Source)
	at net.sf.cb2xml.Cb2Xml.convertToXMLDOM(Unknown Source)
	at net.sf.JRecord.External.CobolCopybookLoader.loadCopyBook(CobolCopybookLoader.java:132)
	at com.cloudera.sa.copybook.mapreduce.CopybookRecordReader.initialize(CopybookRecordReader.java:56)

Any thoughts?

Thanks,

ameet

ameet123, Mar 08 '18 19:03

cb2xml is not fully thread safe. If multiple threads parse a copybook at the same time, you could get this error.

bmTas, Nov 17 '18 10:11
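
If the cause is concurrent parsing, as suggested above, one possible workaround is to make sure only one thread per executor JVM runs the copybook parser at a time and to reuse the parsed result afterwards. The sketch below is only an illustration of that idea, not code from CopybookInputFormat or JRecord: the class `SerializedCopybookLoader`, its `load` method, and the cache key are hypothetical names, and the actual parse call (for example, whatever `CopybookRecordReader.initialize` does around `loadCopyBook`) is assumed to be passed in as a `Callable`. It would also be consistent with the job still completing, since Spark retries lost tasks and a retry may not hit the race.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical helper (not part of JRecord or CopybookInputFormat).
 * Serializes calls to a parser that is not thread safe and caches the
 * result so each copybook is parsed once per JVM.
 */
final class SerializedCopybookLoader {
    // One lock per JVM: Spark tasks on the same executor run as threads in one JVM.
    private static final Object PARSER_LOCK = new Object();
    private static final ConcurrentMap<String, Object> CACHE = new ConcurrentHashMap<>();

    private SerializedCopybookLoader() {}

    /**
     * @param key   identifies the copybook, e.g. its path (an assumption)
     * @param parse the real, non-thread-safe parse call, supplied by the caller
     */
    @SuppressWarnings("unchecked")
    static <T> T load(String key, Callable<T> parse) throws Exception {
        Object cached = CACHE.get(key);
        if (cached != null) {
            return (T) cached; // fast path: already parsed, no locking needed
        }
        synchronized (PARSER_LOCK) {
            // Re-check inside the lock: another thread may have parsed it meanwhile.
            cached = CACHE.get(key);
            if (cached == null) {
                cached = parse.call();
                if (cached == null) {
                    throw new IllegalStateException("parser returned null for " + key);
                }
                CACHE.put(key, cached);
            }
            return (T) cached;
        }
    }
}
```

A caller would wrap its existing parse call in a lambda, along the lines of `SerializedCopybookLoader.load(copybookPath, () -> parseCopybook(copybookPath))`, where `parseCopybook` stands in for the real loader call. Sharing one cached result across threads assumes the parsed object is safe to read concurrently; if it is not, the cache can be dropped and only the lock kept.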