dolphinscheduler icon indicating copy to clipboard operation
dolphinscheduler copied to clipboard

[Bug][DataQuality] task of data quality run failed

Open zuzuviewer opened this issue 2 years ago • 1 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

[LOG-PATH]: /opt/dolphinscheduler/logs/20220921/6885727510752_21-95-340.log, [HOST]: Host{address='172.21.0.5:1234', ip='172.21.0.5', port=1234} [INFO] 2022-09-21 09:45:07.696 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[83] - data quality task params {"localParams":[],"resourceList":[],"ruleId":1,"ruleInputParameter":{"check_type":"0","comparison_type":7,"failure_strategy":"0","operator":"5","src_connector_type":0,"src_datasource_id":1,"src_field":"name","src_table":"user","threshold":"0"},"sparkParameters":{"deployMode":"cluster","driverCores":1,"driverMemory":"512M","executorCores":2,"executorMemory":"2G","numExecutors":2,"others":"--conf spark.yarn.maxAppAttempts=1"}} [INFO] 2022-09-21 09:45:07.706 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - data quality task command: ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /opt/dolphinscheduler/libs/dolphinscheduler-data-quality-dev-SNAPSHOT.jar "{"name":"$t(null_check)","env":{"type":"batch","config":null},"readers":[{"type":"JDBC","config":{"database":"test","password":"","driver":"com.mysql.cj.jdbc.Driver","user":"root","output_table":"test_user","table":"user","url":"jdbc:mysql://192.168.0.152:3306/test?useUnicode=true&allowPublicKeyRetrival=true&characterEncoding=utf-8&&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false"} }],"transformers":[{"type":"sql","config":{"index":1,"output_table":"total_count","sql":"SELECT COUNT() AS total FROM test_user "} },{"type":"sql","config":{"index":2,"output_table":"null_items","sql":"SELECT * FROM test_user WHERE (name is null or name = '') "} },{"type":"sql","config":{"index":3,"output_table":"null_count","sql":"SELECT COUNT() AS nulls FROM null_items"} }],"writers":[{"type":"JDBC","config":{"database":"dolphinscheduler","password":"","driver":"org.postgresql.Driver","user":"root","table":"t_ds_dq_execute_result","url":"jdbc:postgresql://dolphinscheduler-postgresql:5432/dolphinscheduler","sql":"select 0 as rule_type,'$t(null_check)' as rule_name,0 as process_definition_id,95 as process_instance_id,340 as task_instance_id,null_count.nulls AS statistics_value,total_count.total AS comparison_value,7 AS comparison_type,0 as check_type,0 as threshold,5 as operator,0 as failure_strategy,'hdfs://mycluster:8020/user/root/data_quality_error_data/0_95_check-sql' as error_output_path,'2022-09-21 09:45:07' as create_time,'2022-09-21 09:45:07' as update_time from null_count full join total_count"} },{"type":"JDBC","config":{"database":"dolphinscheduler","password":"","driver":"org.postgresql.Driver","user":"root","table":"t_ds_dq_task_statistics_value","url":"jdbc:postgresql://dolphinscheduler-postgresql:5432/dolphinscheduler","sql":"select 0 as process_definition_id,340 as task_instance_id,1 as rule_id,'ZDHVZ+SCMY2SSYPLT/1PITREQ8XN7ECZXCUIEEI3/BG=' as unique_code,'null_count.nulls'AS statistics_name,null_count.nulls AS statistics_value,'2022-09-21 09:45:07' as data_time,'2022-09-21 09:45:07' as create_time,'2022-09-21 09:45:07' as update_time from null_count"} },{"type":"hdfs_file","config":{"path":"hdfs://mycluster:8020/user/root/data_quality_error_data/0_95_check-sql","input_table":"null_items"} }]}" [INFO] 2022-09-21 09:45:07.708 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[85] - tenantCode user:root, task dir:95_340 [INFO] 2022-09-21 09:45:07.709 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[90] - create command file:/tmp/dolphinscheduler/exec/process/6885697322976/6885727510752_21/95/340/95_340.command [INFO] 2022-09-21 09:45:07.709 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[116] - command : #!/bin/sh BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR export HADOOP_HOME=/opt/soft/hadoop-3.3.4 export HADOOP_CONF_DIR=/etc/hadoop/conf export SPARK_HOME2=/opt/soft/spark export SPARK_HOME=/opt/soft/spark export PYTHON_HOME=/opt/soft/python export JAVA_HOME=/usr/local/openjdk-8 export HIVE_HOME=/opt/soft/hive export FLINK_HOME=/opt/soft/flink export DATAX_HOME=/opt/soft/datax export YARN_CONF_DIR=/etc/hadoop/conf export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$PATH export HADOOP_CLASSPATH=hadoop classpath ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /opt/dolphinscheduler/libs/dolphinscheduler-data-quality-dev-SNAPSHOT.jar "{"name":"$t(null_check)","env":{"type":"batch","config":null},"readers":[{"type":"JDBC","config":{"database":"test","password":"","driver":"com.mysql.cj.jdbc.Driver","user":"root","output_table":"test_user","table":"user","url":"jdbc:mysql://192.168.0.152:3306/test?useUnicode=true&allowPublicKeyRetrival=true&characterEncoding=utf-8&&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false"} }],"transformers":[{"type":"sql","config":{"index":1,"output_table":"total_count","sql":"SELECT COUNT() AS total FROM test_user "} },{"type":"sql","config":{"index":2,"output_table":"null_items","sql":"SELECT * FROM test_user WHERE (name is null or name = '') "} },{"type":"sql","config":{"index":3,"output_table":"null_count","sql":"SELECT COUNT() AS nulls FROM null_items"} }],"writers":[{"type":"JDBC","config":{"database":"dolphinscheduler","password":"","driver":"org.postgresql.Driver","user":"root","table":"t_ds_dq_execute_result","url":"jdbc:postgresql://dolphinscheduler-postgresql:5432/dolphinscheduler","sql":"select 0 as rule_type,'$t(null_check)' as rule_name,0 as process_definition_id,95 as process_instance_id,340 as task_instance_id,null_count.nulls AS statistics_value,total_count.total AS comparison_value,7 AS comparison_type,0 as check_type,0 as threshold,5 as operator,0 as failure_strategy,'hdfs://mycluster:8020/user/root/data_quality_error_data/0_95_check-sql' as error_output_path,'2022-09-21 09:45:07' as create_time,'2022-09-21 09:45:07' as update_time from null_count full join total_count"} },{"type":"JDBC","config":{"database":"dolphinscheduler","password":"","driver":"org.postgresql.Driver","user":"root","table":"t_ds_dq_task_statistics_value","url":"jdbc:postgresql://dolphinscheduler-postgresql:5432/dolphinscheduler","sql":"select 0 as process_definition_id,340 as task_instance_id,1 as rule_id,'ZDHVZ+SCMY2SSYPLT/1PITREQ8XN7ECZXCUIEEI3/BG=' as unique_code,'null_count.nulls'AS statistics_name,null_count.nulls AS statistics_value,'2022-09-21 09:45:07' as data_time,'2022-09-21 09:45:07' as create_time,'2022-09-21 09:45:07' as update_time from null_count"} },{"type":"hdfs_file","config":{"path":"hdfs://mycluster:8020/user/root/data_quality_error_data/0_95_check-sql","input_table":"null_items"} }]}" [INFO] 2022-09-21 09:45:07.720 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[290] - task run command: sudo -u root sh /tmp/dolphinscheduler/exec/process/6885697322976/6885727510752_21/95/340/95_340.command [INFO] 2022-09-21 09:45:07.727 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - process start, process id is: 9391 [INFO] 2022-09-21 09:45:08.729 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete. [INFO] 2022-09-21 09:45:09.731 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 2022-09-21 17:45:09 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable [INFO] 2022-09-21 09:45:10.732 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 2022-09-21 17:45:09 INFO RMProxy:98 - Connecting to ResourceManager at /192.168.0.153:8032 2022-09-21 17:45:10 INFO Client:54 - Requesting a new application from cluster with 1 NodeManagers 2022-09-21 17:45:10 INFO Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container) 2022-09-21 17:45:10 INFO Client:54 - Will allocate AM container, with 896 MB memory including 384 MB overhead 2022-09-21 17:45:10 INFO Client:54 - Setting up container launch context for our AM 2022-09-21 17:45:10 INFO Client:54 - Setting up the launch environment for our AM container 2022-09-21 17:45:10 INFO Client:54 - Preparing resources for our AM container 2022-09-21 17:45:10 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. [INFO] 2022-09-21 09:45:12.732 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 2022-09-21 17:45:12 INFO Client:54 - Uploading resource file:/tmp/spark-bef9cb3f-fdfb-4198-9d2c-1a1096fbcf40/__spark_libs__2241309768306226262.zip -> file:/root/.sparkStaging/application_1663722602659_0008/__spark_libs__2241309768306226262.zip [INFO] 2022-09-21 09:45:13.733 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 2022-09-21 17:45:13 INFO Client:54 - Uploading resource file:/opt/dolphinscheduler/libs/dolphinscheduler-data-quality-dev-SNAPSHOT.jar -> file:/root/.sparkStaging/application_1663722602659_0008/dolphinscheduler-data-quality-dev-SNAPSHOT.jar 2022-09-21 17:45:13 INFO Client:54 - Uploading resource file:/tmp/spark-bef9cb3f-fdfb-4198-9d2c-1a1096fbcf40/__spark_conf__3800795750991638219.zip -> file:/root/.sparkStaging/application_1663722602659_0008/spark_conf.zip 2022-09-21 17:45:13 INFO SecurityManager:54 - Changing view acls to: root 2022-09-21 17:45:13 INFO SecurityManager:54 - Changing modify acls to: root 2022-09-21 17:45:13 INFO SecurityManager:54 - Changing view acls groups to: 2022-09-21 17:45:13 INFO SecurityManager:54 - Changing modify acls groups to: 2022-09-21 17:45:13 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set() [INFO] 2022-09-21 09:45:15.735 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 2022-09-21 17:45:15 INFO Client:54 - Submitting application application_1663722602659_0008 to ResourceManager 2022-09-21 17:45:15 INFO YarnClientImpl:273 - Submitted application application_1663722602659_0008 [INFO] 2022-09-21 09:45:16.530 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1663722602659_0008 [INFO] 2022-09-21 09:45:16.531 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/6885697322976/6885727510752_21/95/340, processId:9391 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1 [INFO] 2022-09-21 09:45:16.737 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 2022-09-21 17:45:16 INFO Client:54 - Application report for application_1663722602659_0008 (state: FAILED) 2022-09-21 17:45:16 INFO Client:54 - client token: N/A diagnostics: Application application_1663722602659_0008 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1663722602659_0008_000001 exited with exitCode: -1000 Failing this attempt.Diagnostics: [2022-09-21 17:45:16.207]File file:/root/.sparkStaging/application_1663722602659_0008/__spark_libs__2241309768306226262.zip does not exist java.io.FileNotFoundException: File file:/root/.sparkStaging/application_1663722602659_0008/__spark_libs__2241309768306226262.zip does not exist at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462) at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:271) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:68) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:415) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:412) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:412) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:247) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:240) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:228) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750)

For more detailed output, check the application tracking page: http://bogon:8088/cluster/app/application_1663722602659_0008 Then click on links to logs of each attempt.
. Failing the application.
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1663753515261
	 final status: FAILED
	 tracking URL: http://bogon:8088/cluster/app/application_1663722602659_0008
	 user: root
2022-09-21 17:45:16 INFO  Client:54 - Deleted staging directory file:/root/.sparkStaging/application_1663722602659_0008
2022-09-21 17:45:16 ERROR Client:70 - Application diagnostics message: Application application_1663722602659_0008 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1663722602659_0008_000001 exited with  exitCode: -1000
Failing this attempt.Diagnostics: [2022-09-21 17:45:16.207]File file:/root/.sparkStaging/application_1663722602659_0008/__spark_libs__2241309768306226262.zip does not exist
java.io.FileNotFoundException: File file:/root/.sparkStaging/application_1663722602659_0008/__spark_libs__2241309768306226262.zip does not exist
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462)
	at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:271)
	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:68)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:415)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:412)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:412)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:247)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:240)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:228)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

For more detailed output, check the application tracking page: http://bogon:8088/cluster/app/application_1663722602659_0008 Then click on links to logs of each attempt.
. Failing the application.
Exception in thread "main" org.apache.spark.SparkException: Application application_1663722602659_0008 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1149)
	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1526)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2022-09-21 17:45:16 INFO  ShutdownHookManager:54 - Shutdown hook called
2022-09-21 17:45:16 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-bff797a5-84f1-448f-8960-4c5c69920c22
2022-09-21 17:45:16 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-bef9cb3f-fdfb-4198-9d2c-1a1096fbcf40

[INFO] 2022-09-21 09:45:16.741 +0000 [taskAppId=TASK-20220921-6885727510752_21-95-340] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION

What you expected to happen

success

How to reproduce

data quality task check null value with mysql data source

Anything else

spark 2.4.0 hadoop 3.3.4

Version

3.0.0

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

zuzuviewer avatar Sep 21 '22 09:09 zuzuviewer

Thank you for your feedback, we have received your issue, Please wait patiently for a reply.

  • In order for us to understand your request as soon as possible, please provide detailed information、version or pictures.
  • If you haven't received a reply for a long time, you can join our slack and send your question to channel #troubleshooting

github-actions[bot] avatar Sep 21 '22 10:09 github-actions[bot]

There are many cases of this kind of error, here you can provide the following detailed reproduction steps.

hstdream avatar Sep 30 '22 06:09 hstdream

There are many cases of this kind of error, here you can provide the following detailed reproduction steps.

just run data quality task, use MySql Source check NULL

zuzuviewer avatar Sep 30 '22 06:09 zuzuviewer

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Nov 06 '22 00:11 github-actions[bot]

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

github-actions[bot] avatar Nov 14 '22 00:11 github-actions[bot]