[Bug] [spark batch] can not save to hdfs or hive
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
I'm using apache-seatunnel-incubating-2.1.0 on HDP 3.1.4, which ships Spark 2.3. When I try to save source data to HDFS I get an error. The config file I used is in the SeaTunnel Config section below, and the full error log is in the Error Exception section.
SeaTunnel Version
2.1.0
SeaTunnel Config
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
######
###### This config file is a demonstration of batch processing in SeaTunnel config
######
env {
  # You can set spark configuration here
  # see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
}

source {
  # This is an example input plugin, **only for testing and demonstrating the input plugin feature**
  Fake {
    result_table_name = "my_dataset"
  }

  # You can also use other input plugins, such as file
  # file {
  #   result_table_name = "accesslog"
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog"
  #   format = "json"
  # }

  # If you would like to get more information about how to configure seatunnel and see the full list of input plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/source-plugins/Fake
}

transform {
  # split data by specific delimiter
  # you can also use other filter plugins, such as sql
  # sql {
  #   sql = "select * from accesslog where request_time > 1000"
  # }

  # If you would like to get more information about how to configure seatunnel and see the full list of filter plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/transform-plugins/Sql
}

sink {
  # choose stdout output plugin to output data to console
  # Console {}

  file {
    path = "hdfs:///tmp/datax/tmp/seatunnel"
    serializer = "orc"
  }

  # you can also use other output plugins, such as hdfs
  # hdfs {
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog_processed"
  #   save_mode = "append"
  # }

  # If you would like to get more information about how to configure seatunnel and see the full list of output plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/sink-plugins/Console
}
Running Command
sh /data/soft/seatunnel/seatunnel/bin/start-seatunnel-spark.sh --master yarn --deploy-mode client --config /data/scripts/datax/hdp-doc/scripts/sk/spark.batch.conf
Error Exception
22/04/26 19:02:33 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
22/04/26 19:02:34 INFO BlockManagerMasterEndpoint: Registering block manager worker-10-0-161-23:42231 with 408.9 MB RAM, BlockManagerId(1, worker-10-0-161-23, 42231, None)
22/04/26 19:02:34 INFO AsciiArtUtils: ********* ############## ##
22/04/26 19:02:34 INFO AsciiArtUtils: *######### ############## ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#*** **** ## ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* ## ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* ## ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* ******** ******** ## ## ## ## ******* ## ******* ******** ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* **#######* ########* ## ## ## ##*######** ##*######** **#######* ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#**** **#*** ***** **** ***#* ## ## ## ###******#* ###******#* **#*** ***** ##
22/04/26 19:02:34 INFO AsciiArtUtils: *###***** *#* *#* *#* ## ## ## ##* *#* ##* *#* *#* *#* ##
22/04/26 19:02:34 INFO AsciiArtUtils: ***####** *#* *#* *#* ## ## ## ##* *#* ##* *#* *#* *#* ##
22/04/26 19:02:34 INFO AsciiArtUtils: ****##* *#******###* *****####* ## ## ## ## ## ## ## *#******###* ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* *########### **######### ## ## ## ## ## ## ## *########### ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#* *#**** ## ## ## ## ## ## ## ## *#* ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#* *#* *## ## *#* *## ## ## ## ## *#* ##
22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#** *#* *## ## *#* *## ## ## ## ## *#** ##
22/04/26 19:02:34 INFO AsciiArtUtils: ***** ****#* **#*** **** *#** ***### ## *#*** **### ## ## ## ## **#*** **** ##
22/04/26 19:02:34 INFO AsciiArtUtils: ##########* **######## **######*## ## **######*## ## ## ## ## **######## ##
22/04/26 19:02:34 INFO AsciiArtUtils: ********** ****#**** ******* ## ## ******* ## ## ## ## ## ****#**** ##
[INFO] 2022-04-26 19:02:35.298 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - -> 22/04/26 19:02:34 INFO CodeGenerator: Code generated in 172.449391 ms
22/04/26 19:02:34 INFO SharedState: loading hive config file: file:/etc/spark2/3.1.4.0-315/0/hive-site.xml
22/04/26 19:02:34 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/apps/spark/warehouse').
22/04/26 19:02:34 INFO SharedState: Warehouse path is '/apps/spark/warehouse'.
22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL.
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4571cebe{/SQL,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json.
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@11cadb32{/SQL/json,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution.
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c82d925{/SQL/execution,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@14df5253{/SQL/execution/json,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@667a467f{/static/sql,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
22/04/26 19:02:34 INFO StreamingQueryManager: Registered listener com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker
[INFO] 2022-04-26 19:02:36.196 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[374] - find app id: application_1644825367082_36847
[INFO] 2022-04-26 19:02:36.197 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[202] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/3755309710720/5316427070080_4/55019/224404, processId:150031 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1
[INFO] 2022-04-26 19:02:36.299 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - -> 22/04/26 19:02:35 WARN SparkExecutionPlanProcessor: Caught exception during parsing event
java.lang.ClassCastException: org.apache.spark.sql.catalyst.plans.logical.AnalysisBarrier cannot be cast to org.apache.spark.sql.catalyst.plans.logical.Project
at com.hortonworks.spark.atlas.sql.CommandsHarvester$CreateViewHarvester$.harvest(CommandsHarvester.scala:195)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:65)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:54)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:54)
at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:41)
at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:67)
at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:66)
at scala.Option.foreach(Option.scala:257)
at com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:66)
at com.hortonworks.spark.atlas.AbstractEventProcessor$$anon$1.run(AbstractEventProcessor.scala:39)
22/04/26 19:02:35 ERROR Seatunnel:
===============================================================================
22/04/26 19:02:35 ERROR Seatunnel: Fatal Error,
22/04/26 19:02:35 ERROR Seatunnel: Please submit bug report in https://github.com/apache/incubator-seatunnel/issues
22/04/26 19:02:35 ERROR Seatunnel: Reason:Illegal pattern character 'p'
22/04/26 19:02:35 ERROR Seatunnel: Exception StackTrace:java.lang.IllegalArgumentException: Illegal pattern character 'p'
at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826)
at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634)
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605)
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580)
at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40)
at org.apache.seatunnel.spark.sink.File.output(File.scala:64)
at org.apache.seatunnel.spark.sink.File.output(File.scala:34)
at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90)
at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105)
at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107)
at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65)
at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/04/26 19:02:35 ERROR Seatunnel:
===============================================================================
Exception in thread "main" java.lang.IllegalArgumentException: Illegal pattern character 'p'
at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826)
at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634)
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605)
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580)
at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40)
at org.apache.seatunnel.spark.sink.File.output(File.scala:64)
at org.apache.seatunnel.spark.sink.File.output(File.scala:34)
at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90)
at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105)
at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107)
at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65)
at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/04/26 19:02:35 INFO SparkContext: Invoking stop() from shutdown hook
22/04/26 19:02:35 INFO AbstractConnector: Stopped Spark@a0a9fa5{HTTP/1.1,[http/1.1]}{0.0.0.0:4043}
22/04/26 19:02:35 INFO SparkUI: Stopped Spark web UI at http://client-10-0-161-28:4043
22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Interrupting monitor thread
22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Shutting down all executors
22/04/26 19:02:35 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
22/04/26 19:02:35 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Stopped
22/04/26 19:02:35 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/04/26 19:02:35 INFO MemoryStore: MemoryStore cleared
22/04/26 19:02:35 INFO BlockManager: BlockManager stopped
22/04/26 19:02:35 INFO BlockManagerMaster: BlockManagerMaster stopped
22/04/26 19:02:35 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/04/26 19:02:35 INFO SparkContext: Successfully stopped SparkContext
22/04/26 19:02:35 INFO ShutdownHookManager: Shutdown hook called
22/04/26 19:02:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-a105076b-2107-4c1d-85aa-df3b26674baf
22/04/26 19:02:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-0f34e42e-88f2-48df-bb2b-fb3a2028214e
22/04/26 19:02:35 INFO AtlasHook: ==> Shutdown of Atlas Hook
22/04/26 19:02:35 INFO AtlasHook: <== Shutdown of Atlas Hook
[INFO] 2022-04-26 19:02:36.302 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[60] - FINALIZE_SESSION
Flink or Spark Version
spark 2.3.0
Java or Scala Version
No response
Screenshots
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
22/04/26 19:02:34 INFO SharedState: loading hive config file: file:/etc/spark2/3.1.4.0-315/0/hive-site.xml
22/04/26 19:02:34 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/apps/spark/warehouse').
Also, hive.metastore.warehouse.dir is not null in my hive-site.xml, yet SeaTunnel sets it to Spark's default warehouse dir. As a result, when I save data to Hive I get a "database not found" exception.
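A possible workaround until that is resolved (a sketch only, assuming the env block forwards these keys verbatim to SparkConf; both properties are standard Spark SQL settings, and the warehouse path below is a placeholder for whatever your hive-site.xml declares):

env {
  spark.app.name = "SeaTunnel"
  # Use the Hive catalog instead of Spark's in-memory catalog, and pin the
  # warehouse dir so Spark does not fall back to /apps/spark/warehouse.
  spark.sql.catalogImplementation = "hive"
  spark.sql.warehouse.dir = "/warehouse/tablespace/managed/hive"  # placeholder: copy hive.metastore.warehouse.dir from your hive-site.xml
}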
The Spark version SeaTunnel supports is 2.4.0. Can you try again with Spark 2.4.0?
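For reference, a sketch of switching the Spark used for submission (this assumes start-seatunnel-spark.sh resolves spark-submit through SPARK_HOME, which is how the 2.x launcher scripts typically work; the install path is only an example):

export SPARK_HOME=/opt/spark-2.4.0-bin-hadoop2.7   # example location of a Spark 2.4.0 install
sh /data/soft/seatunnel/seatunnel/bin/start-seatunnel-spark.sh --master yarn --deploy-mode client --config /data/scripts/datax/hdp-doc/scripts/sk/spark.batch.conf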
I got the same issue. Spark: spark-2.4.8-bin-hadoop2.6, Hadoop 3.
22/04/27 20:23:23 INFO config.ConfigBuilder: parsed config file: {
"env" : {
"spark.app.name" : "SeaTunnel",
"spark.executor.cores" : 3,
"spark.executor.instances" : 3,
"spark.executor.memory" : "8g",
"spark.sql.catalogImplementation" : "hive"
},
"sink" : [
{
"path" : "file:///tmp/",
"plugin_name" : "file",
"serializer" : "csv"
}
],
"source" : [
{
"plugin_name" : "hive",
"pre_sql" : "select * from tbl where day='2022-04-24' limit 10000",
"result_table_name" : "my_dataset"
}
],
"transform" : []
}
stack:
22/04/27 20:24:26 ERROR seatunnel.Seatunnel: Exception StackTrace:java.lang.IllegalArgumentException: Illegal pattern character 'p'
at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826)
at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634)
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605)
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580)
at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40)
at org.apache.seatunnel.spark.sink.File.output(File.scala:64)
at org.apache.seatunnel.spark.sink.File.output(File.scala:34)
at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90)
at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105)
at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107)
at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65)
at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:930)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:939)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
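Both traces fail at the same frame: StringTemplate.substitute (StringTemplate.java:40) hands a string to the java.text.SimpleDateFormat constructor, which rejects any ASCII letter that is not one of its reserved pattern letters. Below is a minimal sketch of just that JDK behavior, not SeaTunnel's actual call; the literal "apps" is a hypothetical stand-in for whatever non-pattern string reaches the constructor, chosen so the first invalid letter is 'p' as in the report:

import java.text.SimpleDateFormat;

public class IllegalPatternDemo {
    public static void main(String[] args) {
        // 'a' is a valid pattern letter (am/pm marker), but 'p' is not,
        // so pattern compilation throws:
        // java.lang.IllegalArgumentException: Illegal pattern character 'p'
        new SimpleDateFormat("apps");
    }
}

In other words, a path-like value rather than a real date pattern (e.g. "yyyyMMddHHmmss") is ending up as the format argument inside the file sink.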
Search before asking
- [x] I had searched in the issues and found no similar issues.
What happened
I'm using apache-seatunnel-incubating-2.1.0 under hdp 3.1.4 which spark version is 2.3 when I try to save source data to hdfs I got error, my config file is :
env { # You can set spark configuration here # see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties spark.app.name = "SeaTunnel" spark.executor.instances = 2 spark.executor.cores = 1 spark.executor.memory = "1g" } source { # This is a example input plugin **only for test and demonstrate the feature input plugin** Fake { result_table_name = "my_dataset" } } transform { # split data by specific delimiter # you can also use other filter plugins, such as sql # sql { # sql = "select * from accesslog where request_time > 1000" # } # If you would like to get more information about how to configure seatunnel and see full list of filter plugins, # please go to https://seatunnel.apache.org/docs/spark/configuration/transform-plugins/Sql } sink { # choose stdout output plugin to output data to console # Console {} file { path = "hdfs:///tmp/datax/tmp/seatunnel" serializer = "orc" } # you can also use other output plugins, such as hdfs # hdfs { # path = "hdfs://hadoop-cluster-01/nginx/accesslog_processed" # save_mode = "append" # } # If you would like to get more information about how to configure seatunnel and see full list of output plugins, # please go to https://seatunnel.apache.org/docs/spark/configuration/sink-plugins/Console }
error log is :
22/04/26 19:02:33 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 22/04/26 19:02:34 INFO BlockManagerMasterEndpoint: Registering block manager worker-10-0-161-23:42231 with 408.9 MB RAM, BlockManagerId(1, worker-10-0-161-23, 42231, None) 22/04/26 19:02:34 INFO AsciiArtUtils: ********* ############## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *######### ############## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#*** **** ## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* ## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* ## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* ******** ******** ## ## ## ## ******* ## ******* ******** ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* **#######* ########* ## ## ## ##*######** ##*######** **#######* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#**** **#*** ***** **** ***#* ## ## ## ###******#* ###******#* **#*** ***** ## 22/04/26 19:02:34 INFO AsciiArtUtils: *###***** *#* *#* *#* ## ## ## ##* *#* ##* *#* *#* *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: ***####** *#* *#* *#* ## ## ## ##* *#* ##* *#* *#* *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: ****##* *#******###* *****####* ## ## ## ## ## ## ## *#******###* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *########### **######### ## ## ## ## ## ## ## *########### ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#* *#**** ## ## ## ## ## ## ## ## *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#* *#* *## ## *#* *## ## ## ## ## *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#** *#* *## ## *#* *## ## ## ## ## *#** ## 22/04/26 19:02:34 INFO AsciiArtUtils: ***** ****#* **#*** **** *#** ***### ## *#*** **### ## ## ## ## **#*** **** ## 22/04/26 19:02:34 INFO AsciiArtUtils: ##########* **######## **######*## ## **######*## ## ## ## ## **######## ## 22/04/26 19:02:34 INFO AsciiArtUtils: ********** ****#**** ******* ## ## ******* ## ## ## ## ## ****#**** ## [INFO] 2022-04-26 19:02:35.298 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - -> 22/04/26 19:02:34 INFO CodeGenerator: Code generated in 172.449391 ms 22/04/26 19:02:34 INFO SharedState: loading hive config file: file:/etc/spark2/3.1.4.0-315/0/hive-site.xml 22/04/26 19:02:34 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/apps/spark/warehouse'). 22/04/26 19:02:34 INFO SharedState: Warehouse path is '/apps/spark/warehouse'. 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4571cebe{/SQL,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@11cadb32{/SQL/json,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c82d925{/SQL/execution,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json. 
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@14df5253{/SQL/execution/json,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@667a467f{/static/sql,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 22/04/26 19:02:34 INFO StreamingQueryManager: Registered listener com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker [INFO] 2022-04-26 19:02:36.196 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[374] - find app id: application_1644825367082_36847 [INFO] 2022-04-26 19:02:36.197 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[202] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/3755309710720/5316427070080_4/55019/224404, processId:150031 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1 [INFO] 2022-04-26 19:02:36.299 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - -> 22/04/26 19:02:35 WARN SparkExecutionPlanProcessor: Caught exception during parsing event java.lang.ClassCastException: org.apache.spark.sql.catalyst.plans.logical.AnalysisBarrier cannot be cast to org.apache.spark.sql.catalyst.plans.logical.Project at com.hortonworks.spark.atlas.sql.CommandsHarvester$CreateViewHarvester$.harvest(CommandsHarvester.scala:195) at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:65) at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:54) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104) at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:54) at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:41) at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:67) at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:66) at scala.Option.foreach(Option.scala:257) at com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:66) at com.hortonworks.spark.atlas.AbstractEventProcessor$$anon$1.run(AbstractEventProcessor.scala:39) 22/04/26 19:02:35 ERROR Seatunnel: =============================================================================== 22/04/26 19:02:35 ERROR Seatunnel: Fatal Error, 22/04/26 19:02:35 ERROR Seatunnel: Please submit bug report in https://github.com/apache/incubator-seatunnel/issues 22/04/26 19:02:35 ERROR Seatunnel: Reason:Illegal pattern character 'p' 22/04/26 19:02:35 ERROR Seatunnel: Exception StackTrace:java.lang.IllegalArgumentException: Illegal pattern character 'p' at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826) at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634) 
at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605) at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580) at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40) at org.apache.seatunnel.spark.sink.File.output(File.scala:64) at org.apache.seatunnel.spark.sink.File.output(File.scala:34) at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90) at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105) at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107) at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65) at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 22/04/26 19:02:35 ERROR Seatunnel: =============================================================================== Exception in thread "main" java.lang.IllegalArgumentException: Illegal pattern character 'p' at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826) at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634) at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605) at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580) at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40) at org.apache.seatunnel.spark.sink.File.output(File.scala:64) at org.apache.seatunnel.spark.sink.File.output(File.scala:34) at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90) at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105) at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107) at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65) at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 22/04/26 19:02:35 INFO SparkContext: Invoking stop() from shutdown hook 22/04/26 19:02:35 INFO AbstractConnector: Stopped Spark@a0a9fa5{HTTP/1.1,[http/1.1]}{0.0.0.0:4043} 22/04/26 19:02:35 INFO SparkUI: Stopped Spark web UI at http://client-10-0-161-28:4043 22/04/26 19:02:35 
INFO YarnClientSchedulerBackend: Interrupting monitor thread 22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Shutting down all executors 22/04/26 19:02:35 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down 22/04/26 19:02:35 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false) 22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Stopped 22/04/26 19:02:35 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 22/04/26 19:02:35 INFO MemoryStore: MemoryStore cleared 22/04/26 19:02:35 INFO BlockManager: BlockManager stopped 22/04/26 19:02:35 INFO BlockManagerMaster: BlockManagerMaster stopped 22/04/26 19:02:35 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 22/04/26 19:02:35 INFO SparkContext: Successfully stopped SparkContext 22/04/26 19:02:35 INFO ShutdownHookManager: Shutdown hook called 22/04/26 19:02:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-a105076b-2107-4c1d-85aa-df3b26674baf 22/04/26 19:02:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-0f34e42e-88f2-48df-bb2b-fb3a2028214e 22/04/26 19:02:35 INFO AtlasHook: ==> Shutdown of Atlas Hook 22/04/26 19:02:35 INFO AtlasHook: <== Shutdown of Atlas Hook [INFO] 2022-04-26 19:02:36.302 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[60] - FINALIZE_SESSION
SeaTunnel Version
2.1.0
SeaTunnel Config
# # Licensed to the Apache Software Foundation (ASF) under one or more # contributor license agreements. See the NOTICE file distributed with # this work for additional information regarding copyright ownership. # The ASF licenses this file to You under the Apache License, Version 2.0 # (the "License"); you may not use this file except in compliance with # the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # ###### ###### This config file is a demonstration of batch processing in SeaTunnel config ###### env { # You can set spark configuration here # see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties spark.app.name = "SeaTunnel" spark.executor.instances = 2 spark.executor.cores = 1 spark.executor.memory = "1g" } source { # This is a example input plugin **only for test and demonstrate the feature input plugin** Fake { result_table_name = "my_dataset" } # You can also use other input plugins, such as file # file { # result_table_name = "accesslog" # path = "hdfs://hadoop-cluster-01/nginx/accesslog" # format = "json" # } # If you would like to get more information about how to configure seatunnel and see full list of input plugins, # please go to https://seatunnel.apache.org/docs/spark/configuration/source-plugins/Fake } transform { # split data by specific delimiter # you can also use other filter plugins, such as sql # sql { # sql = "select * from accesslog where request_time > 1000" # } # If you would like to get more information about how to configure seatunnel and see full list of filter plugins, # please go to https://seatunnel.apache.org/docs/spark/configuration/transform-plugins/Sql } sink { # choose stdout output plugin to output data to console # Console {} file { path = "hdfs:///tmp/datax/tmp/seatunnel" serializer = "orc" } # you can also use other output plugins, such as hdfs # hdfs { # path = "hdfs://hadoop-cluster-01/nginx/accesslog_processed" # save_mode = "append" # } # If you would like to get more information about how to configure seatunnel and see full list of output plugins, # please go to https://seatunnel.apache.org/docs/spark/configuration/sink-plugins/Console }
Running Command
sh /data/soft/seatunnel/seatunnel/bin/start-seatunnel-spark.sh --master yarn --deploy-mode client --config /data/scripts/datax/hdp-doc/scripts/sk/spark.batch.conf
Error Exception
22/04/26 19:02:33 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 22/04/26 19:02:34 INFO BlockManagerMasterEndpoint: Registering block manager worker-10-0-161-23:42231 with 408.9 MB RAM, BlockManagerId(1, worker-10-0-161-23, 42231, None) 22/04/26 19:02:34 INFO AsciiArtUtils: ********* ############## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *######### ############## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#*** **** ## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* ## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* ## ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* ******** ******** ## ## ## ## ******* ## ******* ******** ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* **#######* ########* ## ## ## ##*######** ##*######** **#######* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#**** **#*** ***** **** ***#* ## ## ## ###******#* ###******#* **#*** ***** ## 22/04/26 19:02:34 INFO AsciiArtUtils: *###***** *#* *#* *#* ## ## ## ##* *#* ##* *#* *#* *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: ***####** *#* *#* *#* ## ## ## ##* *#* ##* *#* *#* *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: ****##* *#******###* *****####* ## ## ## ## ## ## ## *#******###* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *########### **######### ## ## ## ## ## ## ## *########### ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#* *#**** ## ## ## ## ## ## ## ## *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#* *#* *## ## *#* *## ## ## ## ## *#* ## 22/04/26 19:02:34 INFO AsciiArtUtils: *#* *#** *#* *## ## *#* *## ## ## ## ## *#** ## 22/04/26 19:02:34 INFO AsciiArtUtils: ***** ****#* **#*** **** *#** ***### ## *#*** **### ## ## ## ## **#*** **** ## 22/04/26 19:02:34 INFO AsciiArtUtils: ##########* **######## **######*## ## **######*## ## ## ## ## **######## ## 22/04/26 19:02:34 INFO AsciiArtUtils: ********** ****#**** ******* ## ## ******* ## ## ## ## ## ****#**** ## [INFO] 2022-04-26 19:02:35.298 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - -> 22/04/26 19:02:34 INFO CodeGenerator: Code generated in 172.449391 ms 22/04/26 19:02:34 INFO SharedState: loading hive config file: file:/etc/spark2/3.1.4.0-315/0/hive-site.xml 22/04/26 19:02:34 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/apps/spark/warehouse'). 22/04/26 19:02:34 INFO SharedState: Warehouse path is '/apps/spark/warehouse'. 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4571cebe{/SQL,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@11cadb32{/SQL/json,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution. 22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c82d925{/SQL/execution,null,AVAILABLE,@Spark} 22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json. 
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@14df5253{/SQL/execution/json,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
22/04/26 19:02:34 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@667a467f{/static/sql,null,AVAILABLE,@Spark}
22/04/26 19:02:34 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
22/04/26 19:02:34 INFO StreamingQueryManager: Registered listener com.hortonworks.spark.atlas.SparkAtlasStreamingQueryEventTracker
[INFO] 2022-04-26 19:02:36.196 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[374] - find app id: application_1644825367082_36847
[INFO] 2022-04-26 19:02:36.197 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[202] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/3755309710720/5316427070080_4/55019/224404, processId:150031 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1
[INFO] 2022-04-26 19:02:36.299 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - ->
22/04/26 19:02:35 WARN SparkExecutionPlanProcessor: Caught exception during parsing event
java.lang.ClassCastException: org.apache.spark.sql.catalyst.plans.logical.AnalysisBarrier cannot be cast to org.apache.spark.sql.catalyst.plans.logical.Project
    at com.hortonworks.spark.atlas.sql.CommandsHarvester$CreateViewHarvester$.harvest(CommandsHarvester.scala:195)
    at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:65)
    at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor$$anonfun$2.apply(SparkExecutionPlanProcessor.scala:54)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
    at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:54)
    at com.hortonworks.spark.atlas.sql.SparkExecutionPlanProcessor.process(SparkExecutionPlanProcessor.scala:41)
    at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:67)
    at com.hortonworks.spark.atlas.AbstractEventProcessor$$anonfun$eventProcess$1.apply(AbstractEventProcessor.scala:66)
    at scala.Option.foreach(Option.scala:257)
    at com.hortonworks.spark.atlas.AbstractEventProcessor.eventProcess(AbstractEventProcessor.scala:66)
    at com.hortonworks.spark.atlas.AbstractEventProcessor$$anon$1.run(AbstractEventProcessor.scala:39)
22/04/26 19:02:35 ERROR Seatunnel: ===============================================================================
22/04/26 19:02:35 ERROR Seatunnel: Fatal Error,
22/04/26 19:02:35 ERROR Seatunnel: Please submit bug report in https://github.com/apache/incubator-seatunnel/issues
22/04/26 19:02:35 ERROR Seatunnel: Reason:Illegal pattern character 'p'
22/04/26 19:02:35 ERROR Seatunnel: Exception StackTrace:java.lang.IllegalArgumentException: Illegal pattern character 'p'
    at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826)
    at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580)
    at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40)
    at org.apache.seatunnel.spark.sink.File.output(File.scala:64)
    at org.apache.seatunnel.spark.sink.File.output(File.scala:34)
    at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90)
    at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105)
    at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107)
    at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65)
    at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/04/26 19:02:35 ERROR Seatunnel: ===============================================================================
Exception in thread "main" java.lang.IllegalArgumentException: Illegal pattern character 'p'
    at java.text.SimpleDateFormat.compile(SimpleDateFormat.java:826)
    at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:634)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:605)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:580)
    at org.apache.seatunnel.common.utils.StringTemplate.substitute(StringTemplate.java:40)
    at org.apache.seatunnel.spark.sink.File.output(File.scala:64)
    at org.apache.seatunnel.spark.sink.File.output(File.scala:34)
    at org.apache.seatunnel.spark.batch.SparkBatchExecution.sinkProcess(SparkBatchExecution.java:90)
    at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:105)
    at org.apache.seatunnel.Seatunnel.entryPoint(Seatunnel.java:107)
    at org.apache.seatunnel.Seatunnel.run(Seatunnel.java:65)
    at org.apache.seatunnel.SeatunnelSpark.main(SeatunnelSpark.java:29)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/04/26 19:02:35 INFO SparkContext: Invoking stop() from shutdown hook
22/04/26 19:02:35 INFO AbstractConnector: Stopped Spark@a0a9fa5{HTTP/1.1,[http/1.1]}{0.0.0.0:4043}
22/04/26 19:02:35 INFO SparkUI: Stopped Spark web UI at http://client-10-0-161-28:4043
22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Interrupting monitor thread
22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Shutting down all executors
22/04/26 19:02:35 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
22/04/26 19:02:35 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
22/04/26 19:02:35 INFO YarnClientSchedulerBackend: Stopped
22/04/26 19:02:35 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/04/26 19:02:35 INFO MemoryStore: MemoryStore cleared
22/04/26 19:02:35 INFO BlockManager: BlockManager stopped
22/04/26 19:02:35 INFO BlockManagerMaster: BlockManagerMaster stopped
22/04/26 19:02:35 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/04/26 19:02:35 INFO SparkContext: Successfully stopped SparkContext
22/04/26 19:02:35 INFO ShutdownHookManager: Shutdown hook called
22/04/26 19:02:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-a105076b-2107-4c1d-85aa-df3b26674baf
22/04/26 19:02:35 INFO ShutdownHookManager: Deleting directory /tmp/spark-0f34e42e-88f2-48df-bb2b-fb3a2028214e
22/04/26 19:02:35 INFO AtlasHook: ==> Shutdown of Atlas Hook
22/04/26 19:02:35 INFO AtlasHook: <== Shutdown of Atlas Hook
[INFO] 2022-04-26 19:02:36.302 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[60] - FINALIZE_SESSION
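For context on the trace above: java.text.SimpleDateFormat accepts only its reserved pattern letters, and any other ASCII letter in the pattern string (here a 'p') makes the constructor throw before anything is written to HDFS. A minimal Scala sketch that reproduces the same exception in isolation (the object name is made up for illustration and is not part of SeaTunnel):

import java.text.SimpleDateFormat

// Hypothetical repro, not SeaTunnel source code.
object IllegalPatternRepro extends App {
  // 'p' is not a reserved SimpleDateFormat pattern letter, so this line throws
  // java.lang.IllegalArgumentException: Illegal pattern character 'p' --
  // the same failure StringTemplate.substitute hits at File.scala:64 above.
  new SimpleDateFormat("p")
}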
Flink or Spark Version
spark 2.3.0
Java or Scala Version
No response
Screenshots
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
SeaTunnel Version
2.1.0
SeaTunnel Config
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

######
###### This config file is a demonstration of batch processing in SeaTunnel config
######

env {
  # You can set spark configuration here
  # see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
}

source {
  # This is a example input plugin **only for test and demonstrate the feature input plugin**
  Fake {
    result_table_name = "my_dataset"
  }

  # You can also use other input plugins, such as file
  # file {
  #   result_table_name = "accesslog"
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog"
  #   format = "json"
  # }

  # If you would like to get more information about how to configure seatunnel and see full list of input plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/source-plugins/Fake
}

transform {
  # split data by specific delimiter

  # you can also use other filter plugins, such as sql
  # sql {
  #   sql = "select * from accesslog where request_time > 1000"
  # }

  # If you would like to get more information about how to configure seatunnel and see full list of filter plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/transform-plugins/Sql
}

sink {
  # choose stdout output plugin to output data to console
  # Console {}

  file {
    path = "hdfs:///tmp/datax/tmp/seatunnel"
    serializer = "orc"
  }

  # you can also use other output plugins, such as hdfs
  # hdfs {
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog_processed"
  #   save_mode = "append"
  # }

  # If you would like to get more information about how to configure seatunnel and see full list of output plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/sink-plugins/Console
}
Running Command
sh /data/soft/seatunnel/seatunnel/bin/start-seatunnel-spark.sh --master yarn --deploy-mode client --config /data/scripts/datax/hdp-doc/scripts/sk/spark.batch.conf
This error has been fixed in newer versions. The workaround for version 2.1.0 is as follows:
Find the class org.apache.seatunnel.spark.file.Config and change the code to the following:
final val DEFAULT_TIME_FORMAT = "yyyyMMddHHmmss"
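With that constant, every character is a reserved SimpleDateFormat pattern letter, so the time template compiles and can be substituted into the sink path. A minimal sketch of the effect, assuming nothing about the rest of the SeaTunnel source (the object name and path usage are illustrative only):

import java.text.SimpleDateFormat
import java.util.Date

// Hypothetical check, not SeaTunnel source code.
object TimeFormatCheck extends App {
  final val DEFAULT_TIME_FORMAT = "yyyyMMddHHmmss" // the fixed default

  // Formats the current time, e.g. "20220426190235" -- a valid pattern,
  // so construction no longer throws IllegalArgumentException.
  val stamp = new SimpleDateFormat(DEFAULT_TIME_FORMAT).format(new Date())
  println(s"hdfs:///tmp/datax/tmp/seatunnel/$stamp")
}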
The problem I have is this: Plugin class not found by name: [Hive]. Environment: Hive 3.1.2, Spark 3.0, SeaTunnel 2.1.2.
Any news on this?
Already fixed.