[Bug] [hive-connector] ERROR commit.FileSinkAggregatedCommitter: commit aggregatedCommitInfo error java.lang.NullPointerException
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
Writing data into Hive with Spark in local mode works. After switching to YARN mode, Spark reads the fake source and writes to Hive, and the job fails with java.lang.NullPointerException.
SeaTunnel Version
2.3.0-beta
SeaTunnel Config
env {
  # You can set flink configuration here
  # job.mode = "STREAMING"
  execution.parallelism = 1
  job.name = "test_hive_source_to_hive"
}

source {
  FakeSource {
    row.num = 1000
    schema = {
      fields {
        c_string = string
        c_boolean = boolean
        c_int = int
        c_bigint = bigint
      }
    }
  }
}

transform {
}

sink {
  # choose stdout output plugin to output data to console
  Hive {
    table_name = "test.seatunnel_orc"
    metastore_uri = "thrift://1.1.1.1:9083"
    partition_by = ["c_int"]
    sink_columns = ["c_string", "c_boolean", "c_bigint", "c_int"]
  }
}
Running Command
bin/start-seatunnel-spark-connector-v2.sh --master yarn --deploy-mode client --config config/fake_hive.conf
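For comparison, the local-mode run that reportedly worked would presumably have been invoked along these lines (the local[*] master value is my assumption; only the --master value differs from the failing command):

bin/start-seatunnel-spark-connector-v2.sh --master local[*] --deploy-mode client --config config/fake_hive.conf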
Error Exception
INFO hive.metastore: Connected to metastore.
22/11/04 15:48:32 ERROR commit.FileSinkAggregatedCommitter: commit aggregatedCommitInfo error
java.lang.NullPointerException
at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:234)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:225)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:460)
at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getFileSystem(FileSystemUtils.java:42)
at org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.renameFile(FileSystemUtils.java:81)
at org.apache.seatunnel.connectors.seatunnel.file.sink.commit.FileSinkAggregatedCommitter.lambda$commit$0(FileSinkAggregatedCommitter.java:42)
at java.util.Collections$SingletonList.forEach(Collections.java:4822)
at org.apache.seatunnel.connectors.seatunnel.file.sink.commit.FileSinkAggregatedCommitter.commit(FileSinkAggregatedCommitter.java:37)
at org.apache.seatunnel.connectors.seatunnel.hive.commit.HiveSinkAggregatedCommitter.commit(HiveSinkAggregatedCommitter.java:49)
at org.apache.seatunnel.translation.spark.sink.SparkDataSourceWriter.commit(SparkDataSourceWriter.java:60)
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:76)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:136)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:160)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:157)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:132)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:280)
at org.apache.seatunnel.core.starter.spark.execution.SinkExecuteProcessor.execute(SinkExecuteProcessor.java:84)
at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.execute(SparkExecution.java:56)
at org.apache.seatunnel.core.starter.spark.command.SparkApiTaskExecuteCommand.execute(SparkApiTaskExecuteCommand.java:52)
at org.apache.seatunnel.core.starter.Seatunnel.run(Seatunnel.java:39)
at org.apache.seatunnel.core.starter.spark.SeatunnelSpark.main(SeatunnelSpark.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:930)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:939)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/11/04 15:48:32 INFO v2.WriteToDataSourceV2Exec: Data source writer org.apache.seatunnel.translation.spark.sink.SparkDataSourceWriter@fbbd90c committed.
22/11/04 15:48:32 INFO execution.SparkExecution: Spark Execution started
22/11/04 15:48:32 INFO spark.SparkContext: Invoking stop() from shutdown hook
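The top of the stack points at a Hadoop Configuration problem rather than at Hive itself: FileSystem.getDefaultUri only dereferences the Configuration it is given, so a NullPointerException there usually means the committer called FileSystem.get(...) with a null conf. A minimal, self-contained sketch of that failure mode (my assumption about the cause, not the actual SeaTunnel code) is:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class NullConfRepro {
    public static void main(String[] args) throws Exception {
        // Assumption: in YARN mode the aggregated committer ends up with an
        // uninitialized (null) Hadoop Configuration.
        Configuration conf = null;
        // FileSystem.get(conf) first calls getDefaultUri(conf), which reads
        // fs.defaultFS from conf and therefore throws NullPointerException here.
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri());
    }
}

This matches the frames FileSystemUtils.getFileSystem -> FileSystem.get -> FileSystem.getDefaultUri in the trace above.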
Flink or Spark Version
Spark 2.4.8
Java or Scala Version
No response
Screenshots
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
I'm hitting the same error.
@TyrantLucifer Hi, PTAL. Thanks!
Could you please offer more details about your task, such as the CREATE TABLE SQL for the Hive table and some example source data? BTW, could you please try again with the newest SeaTunnel version compiled from the dev branch? In PR #3258 I fixed some related bugs.
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.
Fixed by #3258