seatunnel [Bug][Connector-V1-Spark-Hbase] Bulk load fails with FileNotFoundException when the written DataFrame is empty, because the staging directory does not exist

Open Carl-Zhou-CN opened this issue 2 years ago • 0 comments

Search before asking

[X] I searched the existing issues and found no similar ones.

What happened

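When the DataFrame written to the HBase sink is empty, the staging directory for the run (for example /tmp/hbase-staging/1666592258580) does not exist, presumably because no HFiles were written to it. The sink still calls LoadIncrementalHFiles.doBulkLoad on that path, and the job fails with java.io.FileNotFoundException when the loader tries to list the directory.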

SeaTunnel Version

dev

SeaTunnel Config

env {
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
  spark.master = local
}

source {
  jdbc {
      driver = com.clickhouse.jdbc.ClickHouseDriver
      url = "jdbc:clickhouse://xxxxxxxxxxxxxxxxxxxxxxxxxxx"
      table = "a"
      result_table_name = "a"
      user = "xxxxx"
      password = "xxxxx"
  }
}

transform {

}


sink {
  Console {}

  hbase {
    source_table_name = "a"
    hbase.zookeeper.quorum = "xxxxxxxxx"
    catalog = "{\"table\":{\"namespace\":\"default\", \"name\":\"test1\"},\"rowkey\":\"col1\",\"columns\":{\"a\":{\"cf\":\"lab\", \"col\":\"a\", \"type\":\"string\"},\"uid\":{\"cf\":\"rowkey\", \"col\":\"key\", \"type\":\"string\"}}}"
    staging_dir = "/tmp/hbase-staging/"
    save_mode = "append"
  }
}

Running Command

./bin/start-seatunnel-spark.sh --master local[*] --deploy-mode client --config test2.conf

Error Exception

at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: File /tmp/hbase-staging/1666592258580 does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:986)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:122)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1046)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1043)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1053)
	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:982)
	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:940)
	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.prepareHFileQueue(LoadIncrementalHFiles.java:224)
	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:331)
	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:256)
	at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:132)
	at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:41)
	at org.apache.seatunnel.spark.SparkEnvironment.sinkProcess(SparkEnvironment.java:179)
	at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:54)
	at org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:67)
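
The trace suggests the bulk load is attempted unconditionally. One possible guard, sketched below, is to skip the load when the staging directory is missing or empty. This is only an illustration and not the connector's actual code: `StagingDirGuard`, `BulkLoader`, and `loadIfPresent` are hypothetical names, and `java.nio.file` stands in for the Hadoop `FileSystem` so the example runs on its own. In the connector, the equivalent check would presumably go through Hadoop's `FileSystem.exists` on the staging path before calling `doBulkLoad`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class StagingDirGuard {

    /** Hypothetical callback standing in for LoadIncrementalHFiles.doBulkLoad. */
    public interface BulkLoader {
        void load(Path stagingDir) throws IOException;
    }

    /**
     * Runs the loader only when the staging directory exists and has at least
     * one entry. Returns true if the load ran, false if it was skipped.
     */
    public static boolean loadIfPresent(Path stagingDir, BulkLoader loader) throws IOException {
        // An empty DataFrame produces no output, so the directory may not exist at all.
        if (!Files.isDirectory(stagingDir)) {
            return false;
        }
        // An existing but empty directory has nothing to load either.
        try (Stream<Path> entries = Files.list(stagingDir)) {
            if (!entries.findAny().isPresent()) {
                return false;
            }
        }
        loader.load(stagingDir);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("hbase-staging");
        // Never created: simulates a run whose DataFrame was empty.
        Path batchDir = base.resolve("1666592258580");
        BulkLoader loader = dir -> System.out.println("bulk loading " + dir.getFileName());

        // Directory is missing, so the load is skipped instead of throwing.
        System.out.println(loadIfPresent(batchDir, loader));

        // Simulate a run that produced output under a column-family directory.
        Files.createDirectories(batchDir.resolve("lab"));
        System.out.println(loadIfPresent(batchDir, loader));
    }
}
```

Skipping the load silently is a design choice; logging a warning that the DataFrame was empty would make the no-op easier to diagnose.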

Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

[X] Yes I am willing to submit a PR!

Code of Conduct

[X] I agree to follow this project's Code of Conduct

Oct 24 '22 06:10 Carl-Zhou-CN
