Exception while doing hbase scan

Open Mahesha999 opened this issue 6 years ago • 3 comments

I was trying out the hbase-spark distributed scan example.

My simple code looks like this:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

import scala.Tuple2;

public class DistributedHBaseScanToRddDemo {

	public static void main(String[] args) {
		JavaSparkContext jsc = getJavaSparkContext("hbasetable1");
		Configuration hbaseConf = getHbaseConf(0, "", "");
		JavaHBaseContext javaHbaseContext = new JavaHBaseContext(jsc, hbaseConf);

		Scan scan = new Scan();
		scan.setCaching(100);

		// Distributed scan: the RDD gets one partition per region of the table.
		JavaRDD<Tuple2<ImmutableBytesWritable, Result>> javaRdd =
				javaHbaseContext.hbaseRDD(TableName.valueOf("hbasetable1"), scan);

		List<String> results = javaRdd.map(new ScanConvertFunction()).collect();
		System.out.println("Result Size: " + results.size());
	}

	// Parameters are placeholders for now; the values below are hard-coded.
	public static Configuration getHbaseConf(int pTimeout, String pQuorumIP, String pClientPort)
	{
		Configuration hbaseConf = HBaseConfiguration.create();
		hbaseConf.setInt("timeout", 120000);
		hbaseConf.set("hbase.zookeeper.quorum", "10.56.36.14");
		hbaseConf.set("hbase.zookeeper.property.clientPort", "2181");
		return hbaseConf;
	}

	public static JavaSparkContext getJavaSparkContext(String pTableName)
	{
		SparkConf sparkConf = new SparkConf().setAppName("JavaHBaseBulkPut" + pTableName);
		sparkConf.setMaster("local");
		sparkConf.set("spark.testing.memory", "471859200");
		JavaSparkContext jsc = new JavaSparkContext(sparkConf);

		return jsc;
	}

	// Converts each (row key, Result) pair to the row key as a String.
	private static class ScanConvertFunction implements Function<Tuple2<ImmutableBytesWritable, Result>, String> {
		public String call(Tuple2<ImmutableBytesWritable, Result> v1) throws Exception {
			return Bytes.toString(v1._1().copyBytes());
		}
	}
}

I am getting the following exception:

Exception in thread "main" org.apache.hadoop.hbase.DoNotRetryIOException: /10.56.48.219:16020 is unable to read call parameter from client 10.56.49.148; java.lang.UnsupportedOperationException: GetRegionLoad
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:93)
	at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:83)
	at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:368)
	at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:345)
	at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.getRegionLoad(ProtobufUtil.java:1746)
	at org.apache.hadoop.hbase.client.HBaseAdmin.getRegionLoad(HBaseAdmin.java:2089)
	at org.apache.hadoop.hbase.mapreduce.RegionSizeCalculator.init(RegionSizeCalculator.java:82)
	at org.apache.hadoop.hbase.mapreduce.RegionSizeCalculator.<init>(RegionSizeCalculator.java:60)
	at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.oneInputSplitPerRegion(TableInputFormatBase.java:293)
	at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:257)
	at org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:254)
	at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:121)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1911)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:893)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:892)
	at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:360)
	at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
	at com.myproj.poc.sparkhbaseneo4j.DistributedHBaseScanToRddDemo.main(DistributedHBaseScanToRddDemo.java:32)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): /10.56.48.219:16020 is unable to read call parameter from client 10.56.49.148; java.lang.UnsupportedOperationException: GetRegionLoad
	at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:387)
	at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
	at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
	at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
	at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
	at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
	at org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.readResponse(NettyRpcDuplexHandler.java:161)
	at org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.channelRead(NettyRpcDuplexHandler.java:191)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at org.apache.hadoop.hbase.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
	at org.apache.hadoop.hbase.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at org.apache.hadoop.hbase.shaded.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
	at org.apache.hadoop.hbase.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at org.apache.hadoop.hbase.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at org.apache.hadoop.hbase.shaded.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:745)

I also tried the bulk get and put examples and they work correctly, so I was wondering what is going wrong with the bulk scan example in particular.
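
From the stack trace, the failure happens while computing input splits: RegionSizeCalculator issues a GetRegionLoad call that my server apparently does not support. One thing I may try (purely a guess from reading the HBase source, not something I have verified) is disabling the region size calculation so that RPC is never made:

    // Assumption: "hbase.regionsizecalculator.enable" is the property RegionSizeCalculator
    // checks before issuing GetRegionLoad; setting it to false should skip that RPC.
    Configuration hbaseConf = getHbaseConf(0, "", "");
    hbaseConf.setBoolean("hbase.regionsizecalculator.enable", false);

    JavaHBaseContext javaHbaseContext = new JavaHBaseContext(jsc, hbaseConf);
    JavaRDD<Tuple2<ImmutableBytesWritable, Result>> javaRdd =
            javaHbaseContext.hbaseRDD(TableName.valueOf("hbasetable1"), scan);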

I know this question might be somewhat unfit here, since it is about the Apache hbase-spark module rather than the Hortonworks one, but I am still posting it in the hope of getting some clue about what might be going wrong.

Mahesha999 avatar May 10 '18 11:05 Mahesha999

Looks like you forgot to connect to HBase:

    Configuration hadoopConf = HBaseConfiguration.create();
    Connection connection = ConnectionFactory.createConnection(hadoopConf);

See testBulkPut in https://github.com/apache/hbase/blob/master/hbase-spark/src/test/java/org/apache/hadoop/hbase/spark/TestJavaHBaseContext.java
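
As a rough sketch (assuming the same table name you used above), opening the connection explicitly and checking that the table is reachable before scanning would look like this:

    // Sketch only: open a connection and verify the table exists,
    // similar to the setup in TestJavaHBaseContext.
    Configuration hadoopConf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(hadoopConf);
         Admin admin = connection.getAdmin()) {
        boolean exists = admin.tableExists(TableName.valueOf("hbasetable1"));
        System.out.println("hbasetable1 exists: " + exists);
    }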

sbarnoud avatar May 14 '18 13:05 sbarnoud

As I said, get and put are working. Without connecting to HBase, how could get and put work?

As suggested on stackoverflow, it seems that the issue was that I was using apache hbase spark api, which is built for hbase 2.x, but we have hbase 1.x deployed. I got suggestion to use api built for 1.x by cloudera, but I did not tried that, neither did I try installing hbase 2.x. Will do that soon.
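
To confirm the mismatch on my side, I am planning to print the client and server versions and compare them (a quick sketch; getClusterStatus may be deprecated depending on the client version):

    // Client-side HBase version, i.e. the hbase jars on the application classpath.
    System.out.println("Client HBase version: "
            + org.apache.hadoop.hbase.util.VersionInfo.getVersion());

    // Server-side HBase version as reported by the cluster.
    try (Connection connection = ConnectionFactory.createConnection(getHbaseConf(0, "", ""));
         Admin admin = connection.getAdmin()) {
        System.out.println("Server HBase version: "
                + admin.getClusterStatus().getHBaseVersion());
    }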

But any further suggestions are welcome...

Mahesha999 avatar May 16 '18 13:05 Mahesha999

@Mahesha999 have you solved this problem?

chenzhuo0618 avatar Nov 13 '18 08:11 chenzhuo0618