
NullPointerException during connection creation.

Open amadav opened this issue 9 years ago • 16 comments

I am hitting an issue while submitting an example with yarn-cluster deploy mode.

16/07/21 11:08:55 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, cdh52.vm.com): java.lang.NullPointerException
    at org.apache.hadoop.hbase.security.UserProvider.instantiate(UserProvider.java:43)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:214)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.apache.spark.sql.execution.datasources.hbase.TableResource.init(HBaseResources.scala:126)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.liftedTree1$1(HBaseResources.scala:57)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.acquire(HBaseResources.scala:54)
    at org.apache.spark.sql.execution.datasources.hbase.TableResource.acquire(HBaseResources.scala:121)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.releaseOnException(HBaseResources.scala:74)
    at org.apache.spark.sql.execution.datasources.hbase.TableResource.releaseOnException(HBaseResources.scala:121)
    at org.apache.spark.sql.execution.datasources.hbase.TableResource.getScanner(HBaseResources.scala:145)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD$$anonfun$9.apply(HBaseTableScan.scala:277)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD$$anonfun$9.apply(HBaseTableScan.scala:276)
    at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
    at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53)
    at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56)
    at scala.collection.parallel.mutable.ParArray$Map.tryLeaf(ParArray.scala:650)
    at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165)
    at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514)
    at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

I have hbase-site.xml on the classpath, and it is present in the Spark conf directory too.

amadav avatar Jul 21 '16 18:07 amadav

Did you hit it in yarn-client mode? If not, please try with --files hbase-site.xml in your spark-submit script.
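For example, something along these lines (the jar name, main class, and path to hbase-site.xml are placeholders to adjust for your job):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --files /etc/hbase/conf/hbase-site.xml \
      --class com.example.MyHBaseApp \
      my-hbase-app.jar

--files ships hbase-site.xml to the working directory of the driver and executors, so it is visible even when the driver runs inside the cluster.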

zhzhan avatar Jul 24 '16 05:07 zhzhan

yarn-client mode works fine. I ran the test in yarn-cluster mode. I guess calling addCreds(), as done in hbase-spark for the same implementation, should fix it. Any comments?

amadav avatar Jul 26 '16 05:07 amadav

Try putting your hbase-site.xml in the root of your jar (i.e. src/main/resources/hbase-site.xml).

ifilonenko avatar Aug 01 '16 04:08 ifilonenko

I am hitting this issue in yarn-client mode, but only for reading from HBase (write works). I've tried hbase-site.xml in the root of the jar and in the driver classpath.

EfraimFeinstein avatar Aug 08 '16 04:08 EfraimFeinstein

Facing the same exception for HBase reads in yarn-client mode, but writes work. Passing hbase-site.xml in both --files and SPARK_CLASSPATH, and also setting HADOOP_CONF_DIR=/etc/hbase/conf.

@AbhiMadav since you are able to read, could you please share the parameters and any env exports you are using?

Exception:

    java.lang.NullPointerException
        at org.apache.hadoop.hbase.security.UserProvider.instantiate(UserProvider.java:122)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:214)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)

UPDATE: Read and write both work in Spark local mode, but read fails in yarn-client mode.

sudhirpatil avatar Sep 05 '16 04:09 sudhirpatil

Sorry for the late reply; I have been busy lately. I was able to get it working in yarn-client mode (read/write). hbase-site.xml has to be on the classpath if you have a property that overrides hbase-default.xml.

@sudhirpatil HADOOP_CONF_DIR should point to a directory where all the *-site.xml files can be found, not just hbase-site.xml. You could also create a symlink for hbase-site.xml in the /etc/hadoop/conf directory and use that as HADOOP_CONF_DIR. If you are still running into the issue, could you share your spark-submit command?
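For example, assuming the usual /etc/hbase/conf and /etc/hadoop/conf locations on your nodes:

    ln -s /etc/hbase/conf/hbase-site.xml /etc/hadoop/conf/hbase-site.xml
    export HADOOP_CONF_DIR=/etc/hadoop/conf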

amadav avatar Oct 12 '16 22:10 amadav

Same issue: yarn-client mode cannot read. It turns out that "def hbaseConf = wrappedConf.value.value" somehow cannot be transferred to the workers. Any suggestions? Currently I create the conf directly on the workers as a workaround.
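Roughly, the workaround looks like the sketch below (sc is the SparkContext; the table name and the parallelized row keys are just placeholders):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
    import org.apache.hadoop.hbase.util.Bytes

    val rowKeys = sc.parallelize(Seq("row1", "row2"))   // RDD of HBase row keys
    val values = rowKeys.mapPartitions { keys =>
      // Build the configuration on the worker itself, so hbase-site.xml is read
      // from the executor's classpath instead of being serialized from the driver.
      val conf = HBaseConfiguration.create()
      val connection = ConnectionFactory.createConnection(conf)
      val table = connection.getTable(TableName.valueOf("my_table"))
      val results = keys.map { k =>
        Bytes.toString(table.get(new Get(Bytes.toBytes(k))).value())
      }.toList   // materialize before closing the connection
      table.close()
      connection.close()
      results.iterator
    }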

rayxai avatar Oct 26 '16 07:10 rayxai

In my case it was working in yarn-cluster mode at first; then it suddenly started producing this exception. Any idea why?

mkanchwala avatar Feb 10 '17 13:02 mkanchwala

OK, when I removed the Kryo serialization it started working normally.
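For reference, "removing the Kryo serialization" here just means dropping (or never setting) the serializer property, which falls back to Spark's default JavaSerializer:

    // Kryo is enabled explicitly like this; leaving the property unset
    // falls back to org.apache.spark.serializer.JavaSerializer.
    val conf = new org.apache.spark.SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")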

mkanchwala avatar Feb 10 '17 17:02 mkanchwala

Isn't there a way to use it with Kryo? JavaSerializer is so slow and is sometimes unable to serialize some things.

javrasya avatar May 25 '17 14:05 javrasya

I am trying to read and write data using Spark. I have already added all the *-site.xml files to the classpath, and the HBase JARs and conf via --files. Read works fine but write gets this exception:

    Exception in thread "main" java.lang.NullPointerException
        at org.apache.hadoop.hbase.security.UserProvider.instantiate(UserProvider.java:123)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:214)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.checkOutputSpecs(TableOutputFormat.java:177)
        at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.assertConf(SparkHadoopWriter.scala:387)
        at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:71)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
        at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
        at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
        at com.voicebase.etl.s3tohbase.HbaseScan2.main(HbaseScan2.java:148)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
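The write goes through saveAsNewAPIHadoopDataset with TableOutputFormat; stripped down to a sketch (shown in Scala for brevity, with sc as the SparkContext and placeholder table/column names), the path in that trace is roughly:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.mapreduce.Job

    val hConf = HBaseConfiguration.create()
    hConf.set(TableOutputFormat.OUTPUT_TABLE, "my_table")   // placeholder table name
    val job = Job.getInstance(hConf)
    job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

    val puts = sc.parallelize(Seq("row1" -> "value1")).map { case (k, v) =>
      val put = new Put(Bytes.toBytes(k))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(v))
      (new ImmutableBytesWritable(Bytes.toBytes(k)), put)
    }
    puts.saveAsNewAPIHadoopDataset(job.getConfiguration)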

alchemistsrivastava avatar May 24 '18 03:05 alchemistsrivastava

@alchemistsrivastava Hi~ Did you fix it?

webtest444 avatar Jun 01 '18 06:06 webtest444

Oh, yeah~~ I have solved it~~~

webtest444 avatar Jun 03 '18 03:06 webtest444

@webtest444 How did you solve it? I'm facing the same issue as @alchemistsrivastava. Read works fine but write is throwing the exception. I'm using IntelliJ and standalone HBase.

swarup5s avatar Sep 12 '18 12:09 swarup5s

Hi @swarup5s, I know that it's been a while since your comment, but I'm responding just in case the problem still exists.

I had the same issue and I fixed it by following this post:
https://stackoverflow.com/questions/50925942/getting-null-pointer-exception-when-running-saveasnewapihadoopdataset-in-scala-s
Not sure if it's the formal way to fix it, but it worked for me. I'm using Spark 2.4 with HBase in pseudo-distributed mode on a pseudo-distributed Hadoop cluster.

Hope this helps.

moonyouj889 avatar Oct 29 '19 17:10 moonyouj889

I found that this is a bug in hbase-server. You can solve it by upgrading the hbase-server version to 2.0+. Alternatively, you can add the Spark conf spark.hadoop.validateOutputSpecs=false to work around the problem.
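For example, a minimal sketch of the conf-based workaround (this only skips the output-spec check that raises the NPE; it does not change HBase itself):

    // Can also be passed to spark-submit as: --conf spark.hadoop.validateOutputSpecs=false
    val conf = new org.apache.spark.SparkConf()
      .set("spark.hadoop.validateOutputSpecs", "false")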

scxwhite avatar Jun 09 '20 08:06 scxwhite