Hive sink task fails with a NullPointerException when the Hive table storage type is ORC
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
When data is sunk to a Hive table whose storage type is ORC and the FieldMapper transform is configured, the task fails if some fields in the Hive table are not mapped.
SeaTunnel Version
2.3.4
SeaTunnel Config
{
  "env": {
    "parallelism": 3,
    "job.mode": "BATCH",
    "checkpoint.interval": 30000,
    "job.name": "seatunnel_1712823979630"
  },
  "source": [
    {
      "plugin_name": "Jdbc",
      "result_table_name": "table_source",
      "user": "postgres",
      "password": "C3kk4v5_b4f2Jr",
      "driver": "org.postgresql.Driver",
      "url": "jdbc:postgresql://10.188.15.91:5434/gis",
      "query": "select event_id,event_type,event_radius,event_source,start_time,end_time,priority,latitude,longitude,elevation,node_ids,create_time,update_time from ghcloud.gh_traffic_event_info"
    }
  ],
  "transform": [
    {
      "plugin_name": "FieldMapper",
      "source_table_name": "table_source",
      "result_table_name": "table_source_FieldMapper",
      "field_mapper": {
        "event_id": "event_id",
        "event_type": "event_type",
        "event_radius": "event_radius",
        "event_source": "event_source",
        "start_time": "start_time",
        "end_time": "end_time",
        "priority": "priority",
        "latitude": "latitude",
        "longitude": "longitude",
        "elevation": "elevation",
        "node_ids": "node_ids",
        "create_time": "create_time",
        "update_time": "update_time"
      }
    }
  ],
  "sink": [
    {
      "plugin_name": "Hive",
      "source_table_name": "table_source_FieldMapper",
      "table_name": "gh_cloud_data_model.dwd_pub_traffic_event",
      "metastore_uri": "thrift://cloudera-hadoop-61:9083"
    }
  ]
}
Running Command
Executed by DolphinScheduler
Error Exception
SHUTDOWN
2024-04-12 11:31:30,246 INFO [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
2024-04-12 11:31:30,246 INFO [s.c.s.s.c.ClientExecuteCommand] [main] - Closed metrics executor service ......
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] -
===============================================================================
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Fatal Error,
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Reason:SeaTunnel job executed failed
2024-04-12 11:31:30,248 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:257)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:75)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:648)
at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:949)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildSchemaWithRowType(OrcWriteStrategy.java:196)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:116)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:75)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:134)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:46)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:247)
... 16 more
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
... 2 more
2024-04-12 11:31:30,248 ERROR [o.a.s.c.s.SeaTunnel ] [main] -
===============================================================================
Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:257)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:75)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:648)
at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:949)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildSchemaWithRowType(OrcWriteStrategy.java:196)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:116)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:75)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:134)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:46)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:247)
... 16 more
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
... 2 more
2024-04-12 11:31:30,249 INFO [s.c.s.s.c.ClientExecuteCommand] [ForkJoinPool.commonPool-worker-2] - run shutdown hook because get close signal
[INFO] 2024-04-12 11:31:30.453 +0800 - FINALIZE_SESSION
Zeta or Flink or Spark Version
No response
Java or Scala Version
No response
Screenshots
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Please paste the DDL statement of the [gh_cloud_data_model.dwd_pub_traffic_event] table. I suspect that the mapped field names are inconsistent with those of the destination table, which causes the NullPointerException.
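One way to test this suspicion is to compare the FieldMapper output names with the Hive column names. The following is a minimal Python sketch, not part of SeaTunnel. It assumes the column list has been copied by hand from the table's DESCRIBE output. The helper name `compare_fields`, the `hive_columns` list, and the `remark` column are hypothetical and are not taken from the real table.

```python
def compare_fields(field_mapper, hive_columns):
    """Compare FieldMapper output names with Hive column names.

    Returns (missing, unknown):
      missing: Hive columns that no mapped field feeds, in table order.
      unknown: mapped output names that are not Hive columns, sorted.
    """
    mapped = set(field_mapper.values())
    hive = list(hive_columns)
    missing = [col for col in hive if col not in mapped]
    unknown = sorted(mapped - set(hive))
    return missing, unknown


# Shortened version of the field_mapper block from the job config.
field_mapper = {
    "event_id": "event_id",
    "event_type": "event_type",
    "update_time": "update_time",
}

# Hypothetical column list; replace it with the real DESCRIBE output.
hive_columns = ["event_id", "event_type", "update_time", "remark"]

missing, unknown = compare_fields(field_mapper, hive_columns)
print("Hive columns with no upstream field:", missing)
print("Mapped names not present in Hive:", unknown)
```

A non-empty result in either list would be consistent with the mismatch suspected here.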
Pay attention to two things. First, every field in the [hive] table must have a corresponding upstream field. If upstream has no field for a Hive column, you can pass an empty string, that is [''], as a placeholder, but you cannot use [null]. Second, each mapped field name must be the same as the field name in the table.
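The empty-string placeholder can be supplied in the source query, so that every Hive column has an upstream field. Below is a rough Python sketch that generates such a query. The helper name `build_query` and the `remark` column are hypothetical. It assumes the unmapped columns are string-typed in Hive, since `''` would not suit numeric or timestamp columns.

```python
def build_query(source_columns, hive_columns, table):
    """Build a SELECT that yields one field per Hive column.

    Columns that do not exist upstream are filled with '' (never NULL).
    Assumes those columns are string-typed in Hive.
    """
    upstream = set(source_columns)
    parts = [
        col if col in upstream else f"'' as {col}"
        for col in hive_columns
    ]
    return f"select {','.join(parts)} from {table}"


# Shortened list of the columns selected in the original query.
source_columns = ["event_id", "event_type", "update_time"]

# Hypothetical Hive column list; 'remark' stands in for an unmapped column.
hive_columns = ["event_id", "event_type", "update_time", "remark"]

query = build_query(source_columns, hive_columns, "ghcloud.gh_traffic_event_info")
print(query)
```

Any alias added this way would presumably also need its own entry in `field_mapper`, with the same name as the Hive column.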
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.