
Exception when Spark reads MySQL data and writes it to HBase

[Open] loveZL opened this issue 2 years ago · 2 comments

With the program's original HBase version, 1.3.6, MySQL data could not be written to HBase; the error was a class-not-found. After upgrading HBase to 2.1.0, Spark reads the MySQL data but throws an error when writing to HBase. The error log:

```
2022-02-24 12:08:09:174[INFO]: [data ingestion]:[HBASE]: checking whether table exists: t1
2022-02-24 12:08:09:179[INFO]: [data ingestion]:[HBASE]: table already exists, checking column family: cf1
2022-02-24 12:08:09:186[INFO]: [data ingestion]:[HBASE]: tableDescriptor: 't1', {NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2022-02-24 12:08:09:190[INFO]: Got an error when resolving hostNames. Falling back to /default-rack for all
2022-02-24 12:08:09:189[INFO]: [data ingestion]:[HBASE]:[WRITE]:writeDS:=====start=======
2022-02-24 12:08:10:191[INFO]: Got an error when resolving hostNames. Falling back to /default-rack for all
2022-02-24 12:08:10:201[INFO]: Code generated in 308.11907 ms
2022-02-24 12:08:10:263[INFO]: [data ingestion]:[HBASE]:[WRITE]:DataFrame:=====MapPartitionsRDD[3] at rdd at HbaseDataSources.scala:214
2022-02-24 12:08:10:294[INFO]: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2022-02-24 12:08:10:299[INFO]: Using output committer class org.apache.hadoop.mapred.FileOutputCommitter
2022-02-24 12:08:10:301[INFO]: File Output Committer Algorithm version is 2
2022-02-24 12:08:10:301[INFO]: FileOutputCommitter skip cleanup temporary folders under output directory:false, ignore cleanup failures: false
2022-02-24 12:08:10:301[WARN]: Output Path is null in setupJob()
2022-02-24 12:08:10:325[INFO]: Starting job: runJob at SparkHadoopWriter.scala:78
2022-02-24 12:08:10:341[INFO]: Got job 0 (runJob at SparkHadoopWriter.scala:78) with 1 output partitions
2022-02-24 12:08:10:342[INFO]: Final stage: ResultStage 0 (runJob at SparkHadoopWriter.scala:78)
2022-02-24 12:08:10:342[INFO]: Parents of final stage: List()
2022-02-24 12:08:10:344[INFO]: Missing parents: List()
spark.rdd.scope.noOverride===true
spark.jobGroup.id===946377927967121408
spark.rdd.scope==={"id":"6","name":"saveAsHadoopDataset"}
spark.job.description===mysql2hbase_2022-02-24 12:07:58_946377927967121408
spark.job.interruptOnCancel===false
=====jobStart.properties:{spark.rdd.scope.noOverride=true, spark.jobGroup.id=946377927967121408_, spark.rdd.scope={"id":"6","name":"saveAsHadoopDataset"}, spark.job.description=mysql2hbase_2022-02-24 12:07:58_946377927967121408, spark.job.interruptOnCancel=false}
Process:null
2022-02-24 12:08:10:348[INFO]: Submitting ResultStage 0 (MapPartitionsRDD[4] at map at HbaseDataSources.scala:215), which has no missing parents
2022-02-24 12:08:10:348[ERROR]: Listener ServerSparkListener threw an exception
scala.MatchError: null
    at com.zyc.common.ServerSparkListener.onJobStart(ServerSparkListener.scala:32)
    at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:37)
    at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
    at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
    at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
```
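The decisive lines are at the end: `scala.MatchError: null` thrown from `ServerSparkListener.onJobStart`. In Scala, a `match` whose cases cover neither `null` nor a wildcard throws `MatchError` at runtime, and the `Process:null` debug print just above suggests the listener is matching on a job property that is absent for this job. A minimal reduction of the failure mode (hypothetical code, not the actual zdh_server source):

```scala
// Hypothetical reduction of the failure mode, not the zdh_server code:
// matching on a value that can be null, with no case covering null.
val process: String = null // e.g. a property missing from jobStart.properties

process match {
  case "running" => println("job running")
  case "done"    => println("job done")
  // no `case null` and no `case _` fallback
  // => throws scala.MatchError: null at runtime
}
```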

loveZL commented on Feb 24 '22

Hello, this has been confirmed as a bug. It affects version 4.7.18 and all earlier versions: HBase writes cannot complete. The bug will be fixed in version 5.0.0. Temporary workaround: modify the zdh_server source, specifically the Spark listener, as shown in the screenshot below. [screenshot: 1645845897(1)]
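Since the referenced screenshot is not reproduced here, the following is only a sketch of the kind of null guard such a listener fix typically adds; the property name and the handling inside each case are illustrative assumptions, not the actual 5.0.0 patch:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

class ServerSparkListener extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    // Assumption: the original code matched directly on a property value,
    // which is null when the property is absent. Wrapping it in Option
    // routes the null through a handled case instead of scala.MatchError.
    Option(jobStart.properties.getProperty("spark.jobGroup.id")) match {
      case Some(jobGroupId) =>
        // ... original job-tracking logic, keyed by jobGroupId ...
        println(s"tracking job group: $jobGroupId")
      case None =>
        // property missing: skip quietly instead of crashing the listener
    }
  }
}
```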

If you are running from the packaged release, you can download the 4.7.10 source, modify this file, compile it, and copy the compiled class into the jar, as shown in the screenshot below. [screenshot: 2]

zhaoyachao commented on Feb 26 '22

[image] When writing data to HBase, fetching the row throws a NullPointerException. [image]
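One common cause of such an NPE (an assumption, since the screenshots are not shown) is converting DataFrame rows to HBase `Put`s: `Bytes.toBytes` throws a NullPointerException when a column value is null. A hedged sketch of a null-safe conversion; names like `rowToPut` are illustrative, not zdh_server's API:

```scala
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.Row

// Hypothetical sketch: build an HBase Put from a DataFrame Row,
// skipping null column values instead of passing them to Bytes.toBytes,
// which would throw a NullPointerException.
def rowToPut(row: Row, rowKeyField: String, family: String): Put = {
  // Note: if the row-key column itself can be null, such records must be
  // filtered out beforehand, since Put rejects a null key.
  val put = new Put(Bytes.toBytes(row.getAs[String](rowKeyField)))
  row.schema.fieldNames.filter(_ != rowKeyField).foreach { col =>
    val i = row.fieldIndex(col)
    if (!row.isNullAt(i)) {
      put.addColumn(Bytes.toBytes(family), Bytes.toBytes(col),
        Bytes.toBytes(row.get(i).toString))
    }
    // null values are skipped rather than written, so no NPE
  }
  put
}
```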

loveZL commented on Feb 28 '22