
[Bug] [Data Quality] `Error Output Path` is not created on HDFS

Open · Chris-Arith opened this issue 2 years ago • 7 comments

Search before asking

  • [X] I have searched the issues and found no similar issues.

What happened

After the data quality task executed successfully, we checked the task result and found that the Error Output Path field has a value, but that path cannot be found in HDFS. So, how can I find the output path in HDFS?
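A quick way to confirm the reported path is really missing is to list it directly; the concrete path below is taken from the task log later in this thread, so adjust it to your own run:

```shell
# List the error output directory the task reported (path from the task log below)
hdfs dfs -ls hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test
# Also list the configured base directory
hdfs dfs -ls /tmp/data-quality-error-data
```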

[screenshot: task result showing Error Output Path] common.properties is configured as below [screenshot: common.properties]

  • hdfs is the root user of our HDFS system
  • dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar is packaged from the 3.0.0-beta-2 source code and put into each server's libs directory
  • data-quality.error.output.path is set to /tmp/data-quality-error-data (see the quick check below)
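As a quick check, the relevant keys can be printed from each worker's common.properties. This is a hedged sketch: the file path is assumed from this deployment's worker-server layout, and fs.defaultFS applies only when the storage type is HDFS:

```shell
# Print the storage and data-quality settings from common.properties
# (file path assumed from the worker-server layout shown in the logs below)
grep -E '^(resource\.storage\.type|fs\.defaultFS|data-quality\.error\.output\.path)' \
  /home/dolphinscheduler/dolphinscheduler/worker-server/conf/common.properties
```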

What you expected to happen

The error output path should actually be created, so it can be found in HDFS.

How to reproduce

As described above.

Anything else

None.

Version

3.0.0-beta-2

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

Chris-Arith avatar Aug 08 '22 08:08 Chris-Arith

Thank you for your feedback; we have received your issue. Please wait patiently for a reply.

  • In order for us to understand your request as soon as possible, please provide detailed information, the version, or pictures.
  • If you haven't received a reply for a long time, you can join our Slack and send your question to the #troubleshooting channel.

github-actions[bot] avatar Aug 08 '22 09:08 github-actions[bot]

Please provide the task execution log, thanks.

hyjunhyj avatar Aug 10 '22 02:08 hyjunhyj

Please provide the task execution log, thanks.

The task execution log is below:

[INFO] 2022-08-08 14:51:08.638 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[83] - data quality task params {"localParams":[],"resourceList":[],"ruleId":10,"ruleInputParameter":{"check_type":"1","comparison_type":1,"comparison_name":"0","failure_strategy":"0","operator":"3","src_connector_type":5,"src_datasource_id":11,"src_field":null,"src_table":"BW_BI0_TSTOR_LOC","threshold":"0"},"sparkParameters":{"deployMode":"cluster","driverCores":1,"driverMemory":"512M","executorCores":2,"executorMemory":"2G","numExecutors":2,"others":"--conf spark.yarn.maxAppAttempts=1"}}
[INFO] 2022-08-08 14:51:08.694 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - data quality task command: ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,775 as process_instance_id,1896 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test' as error_output_path,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1896 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-08 14:51:08' as data_time,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count\"} }]}"
[INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[85] - tenantCode user:dolphinscheduler, task dir:775_1896
[INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[90] - create command file:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896/775_1896.command
[INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[116] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,775 as process_instance_id,1896 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test' as error_output_path,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1896 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-08 14:51:08' as data_time,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count\"} }]}"
[INFO] 2022-08-08 14:51:08.704 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[290] - task run command: sudo -u dolphinscheduler sh /tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896/775_1896.command
[INFO] 2022-08-08 14:51:08.706 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - process start, process id is: 18551
[INFO] 2022-08-08 14:51:09.706 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
	WARNING: Running spark-class from user-defined location.
[INFO] 2022-08-08 14:51:10.707 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:10 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm124
	22/08/08 14:51:10 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
	22/08/08 14:51:10 INFO conf.Configuration: resource-types.xml not found
	22/08/08 14:51:10 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
	22/08/08 14:51:10 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (61440 MB per container)
	22/08/08 14:51:10 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
	22/08/08 14:51:10 INFO yarn.Client: Setting up container launch context for our AM
	22/08/08 14:51:10 INFO yarn.Client: Setting up the launch environment for our AM container
	22/08/08 14:51:10 INFO yarn.Client: Preparing resources for our AM container
	22/08/08 14:51:10 INFO yarn.Client: Uploading resource file:/home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0915/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar
[INFO] 2022-08-08 14:51:11.708 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:11 INFO yarn.Client: Uploading resource file:/tmp/spark-53fcb4bc-b0b4-4495-93ed-ff43dbbf670a/__spark_conf__1110115918348202718.zip -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0915/__spark_conf__.zip
	22/08/08 14:51:11 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
	22/08/08 14:51:11 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
	22/08/08 14:51:11 INFO spark.SecurityManager: Changing view acls groups to: 
	22/08/08 14:51:11 INFO spark.SecurityManager: Changing modify acls groups to: 
	22/08/08 14:51:11 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users  with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
	22/08/08 14:51:11 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
	22/08/08 14:51:11 INFO security.YARNHadoopDelegationTokenManager: Attempting to load user's ticket cache.
	22/08/08 14:51:11 INFO yarn.Client: Submitting application application_1657523889744_0915 to ResourceManager
[INFO] 2022-08-08 14:51:12.709 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:11 INFO impl.YarnClientImpl: Submitted application application_1657523889744_0915
[INFO] 2022-08-08 14:51:13.710 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:12 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
	22/08/08 14:51:12 INFO yarn.Client: 
		 client token: N/A
		 diagnostics: AM container is launched, waiting for AM container to Register with RM
		 ApplicationMaster host: N/A
		 ApplicationMaster RPC port: -1
		 queue: root.users.dolphinscheduler
		 start time: 1659941471620
		 final status: UNDEFINED
		 tracking URL: http://host:8088/proxy/application_1657523889744_0915/
		 user: dolphinscheduler
[INFO] 2022-08-08 14:51:14.711 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:13 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
[INFO] 2022-08-08 14:51:15.712 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:14 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
[INFO] 2022-08-08 14:51:16.713 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:15 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
	22/08/08 14:51:15 INFO yarn.Client: 
		 client token: N/A
		 diagnostics: N/A
		 ApplicationMaster host: slbdcompute2
		 ApplicationMaster RPC port: 38184
		 queue: root.users.dolphinscheduler
		 start time: 1659941471620
		 final status: UNDEFINED
		 tracking URL: http://host:8088/proxy/application_1657523889744_0915/
		 user: dolphinscheduler
[INFO] 2022-08-08 14:51:17.714 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:16 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:18.715 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:17 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:19.716 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:18 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:20.717 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:19 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:21.718 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:20 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:22.719 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:21 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:23.720 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:22 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:24.721 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:23 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:25.722 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:24 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:26.723 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:25 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:27.724 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:26 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:28.725 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:27 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:29.726 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:28 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:30.727 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:29 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:31.728 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:30 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:32.302 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1657523889744_0915
[INFO] 2022-08-08 14:51:32.302 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896, processId:18551 ,exitStatusCode:0 ,processWaitForStatus:true ,processExitValue:0
[INFO] 2022-08-08 14:51:32.729 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/08 14:51:31 INFO yarn.Client: Application report for application_1657523889744_0915 (state: FINISHED)
	22/08/08 14:51:31 INFO yarn.Client: 
		 client token: N/A
		 diagnostics: N/A
		 ApplicationMaster host: slbdcompute2
		 ApplicationMaster RPC port: 38184
		 queue: root.users.dolphinscheduler
		 start time: 1659941471620
		 final status: SUCCEEDED
		 tracking URL: http://host:8088/proxy/application_1657523889744_0915/
		 user: dolphinscheduler
	22/08/08 14:51:31 INFO util.ShutdownHookManager: Shutdown hook called
	22/08/08 14:51:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-53fcb4bc-b0b4-4495-93ed-ff43dbbf670a
	22/08/08 14:51:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-79b34451-e157-41e4-9a2a-b9fad415244c
[INFO] 2022-08-08 14:51:32.730 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION

Chris-Arith avatar Aug 10 '22 02:08 Chris-Arith

You are running in yarn-cluster mode, so you need to go to the Spark task tracking URL to see the log; alternatively, you can switch to yarn-client mode and run again.
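For example, assuming the application id from the log above, the aggregated YARN logs for the driver and executors can be pulled like this:

```shell
# Fetch the YARN logs for the yarn-cluster run
# (application id taken from the task log above)
yarn logs -applicationId application_1657523889744_0915 | less
```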

hyjunhyj avatar Aug 10 '22 03:08 hyjunhyj

You are running in yarn-cluster mode, so you need to go to the Spark task tracking URL to see the log; alternatively, you can switch to yarn-client mode and run again.

Regarding yarn-client mode, do you mean changing the settings here? [screenshot: Spark deploy mode setting] I tried both client and local mode; neither creates the error output file in HDFS either. A permissions sanity check and the logs follow below.
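This is a hedged sanity check, assuming the dolphinscheduler tenant user, to rule out a permissions problem on the configured base directory:

```shell
# Can the tenant user create and list the error-output base directory at all?
sudo -u dolphinscheduler hdfs dfs -mkdir -p /tmp/data-quality-error-data
sudo -u dolphinscheduler hdfs dfs -ls /tmp/data-quality-error-data
```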

[INFO] 2022-08-10 13:47:19.317 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[83] - data quality task params {"localParams":[],"resourceList":[],"ruleId":10,"ruleInputParameter":{"check_type":"1","comparison_type":1,"comparison_name":"0","failure_strategy":"0","operator":"3","src_connector_type":5,"src_datasource_id":11,"src_field":null,"src_table":"BW_BI0_TSTOR_LOC","threshold":"0"},"sparkParameters":{"deployMode":"client","driverCores":1,"driverMemory":"512M","executorCores":2,"executorMemory":"2G","numExecutors":2,"others":"--conf spark.yarn.maxAppAttempts=1"}}
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - data quality task command: ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode client --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,781 as process_instance_id,1944 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_781_chris_data_quality_test' as error_output_path,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1944 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-10 13:47:19' as data_time,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count\"} }]}"
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[85] - tenantCode user:dolphinscheduler, task dir:781_1944
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[90] - create command file:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_7/781/1944/781_1944.command
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[116] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode client --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,781 as process_instance_id,1944 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_781_chris_data_quality_test' as error_output_path,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1944 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-10 13:47:19' as data_time,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count\"} }]}"
[INFO] 2022-08-10 13:47:19.325 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[290] - task run command: sudo -u dolphinscheduler sh /tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_7/781/1944/781_1944.command
[INFO] 2022-08-10 13:47:19.326 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - process start, process id is: 19801
[INFO] 2022-08-10 13:47:20.327 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
	WARNING: Running spark-class from user-defined location.
[INFO] 2022-08-10 13:47:21.328 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:20 INFO spark.SparkContext: Running Spark version 2.4.0-cdh6.3.2
	22/08/10 13:47:20 INFO logging.DriverLogger: Added a local log appender at: /tmp/spark-8a540c14-7e08-499f-9502-cd9c66145346/__driver_logs__/driver.log
	22/08/10 13:47:20 INFO spark.SparkContext: Submitted application: (table_count_check)
	22/08/10 13:47:20 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
	22/08/10 13:47:20 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
	22/08/10 13:47:20 INFO spark.SecurityManager: Changing view acls groups to: 
	22/08/10 13:47:20 INFO spark.SecurityManager: Changing modify acls groups to: 
	22/08/10 13:47:20 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users  with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
	22/08/10 13:47:21 INFO util.Utils: Successfully started service 'sparkDriver' on port 42815.
	22/08/10 13:47:21 INFO spark.SparkEnv: Registering MapOutputTracker
	22/08/10 13:47:21 INFO spark.SparkEnv: Registering BlockManagerMaster
	22/08/10 13:47:21 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
	22/08/10 13:47:21 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
	22/08/10 13:47:21 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-9ca3469c-4251-4540-981a-3a13ef674c63
	22/08/10 13:47:21 INFO memory.MemoryStore: MemoryStore started with capacity 93.3 MB
	22/08/10 13:47:21 INFO spark.SparkEnv: Registering OutputCommitCoordinator
	22/08/10 13:47:21 INFO util.log: Logging initialized @1647ms
	22/08/10 13:47:21 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: 2018-09-05T05:11:46+08:00, git hash: 3ce520221d0240229c862b122d2b06c12a625732
	22/08/10 13:47:21 INFO server.Server: Started @1714ms
	22/08/10 13:47:21 INFO server.AbstractConnector: Started ServerConnector@352e612e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
	22/08/10 13:47:21 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f6bcf87{/jobs,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e92c6ad{/jobs/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2fb5fe30{/jobs/job,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5baaae4c{/jobs/job/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5b6e8f77{/stages,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@41a6d121{/stages/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4f449e8f{/stages/stage,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@19f040ba{/stages/stage/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@72ab05ed{/stages/pool,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@27e32fe4{/stages/pool/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@c3c4c1c{/storage,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@17d238b1{/storage/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d7cc3cb{/storage/rdd,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@35e478f{/storage/rdd/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d6cb754{/environment,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6b7d1df8{/environment/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3044e9c7{/executors,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@41d7b27f{/executors/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@49096b06{/executors/threadDump,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a183d02{/executors/threadDump/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5d05ef57{/static,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@34237b90{/,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1d01dfa5{/api,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@31ff1390{/jobs/job/kill,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@759d81f3{/stages/stage/kill,null,AVAILABLE,@Spark}
	22/08/10 13:47:21 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://slbdcompute3:4040
[INFO] 2022-08-10 13:47:22.329 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:21 INFO spark.SparkContext: Added JAR file:/home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar at spark://slbdcompute3:42815/jars/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar with timestamp 1660110441358
	22/08/10 13:47:21 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:21 INFO util.Utils: Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
	22/08/10 13:47:22 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm124
	22/08/10 13:47:22 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
	22/08/10 13:47:22 INFO conf.Configuration: resource-types.xml not found
	22/08/10 13:47:22 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
	22/08/10 13:47:22 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (61440 MB per container)
	22/08/10 13:47:22 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
	22/08/10 13:47:22 INFO yarn.Client: Setting up container launch context for our AM
	22/08/10 13:47:22 INFO yarn.Client: Setting up the launch environment for our AM container
	22/08/10 13:47:22 INFO yarn.Client: Preparing resources for our AM container
[INFO] 2022-08-10 13:47:23.330 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:22 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:22 INFO yarn.Client: Uploading resource file:/tmp/spark-8a540c14-7e08-499f-9502-cd9c66145346/__spark_conf__1935499196749650246.zip -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0961/__spark_conf__.zip
	22/08/10 13:47:22 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
	22/08/10 13:47:22 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
	22/08/10 13:47:22 INFO spark.SecurityManager: Changing view acls groups to: 
	22/08/10 13:47:22 INFO spark.SecurityManager: Changing modify acls groups to: 
	22/08/10 13:47:22 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users  with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
	22/08/10 13:47:22 INFO yarn.Client: Submitting application application_1657523889744_0961 to ResourceManager
	22/08/10 13:47:23 INFO impl.YarnClientImpl: Submitted application application_1657523889744_0961
[INFO] 2022-08-10 13:47:24.331 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:23 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:24 INFO yarn.Client: Application report for application_1657523889744_0961 (state: ACCEPTED)
	22/08/10 13:47:24 INFO yarn.Client: 
		 client token: N/A
		 diagnostics: AM container is launched, waiting for AM container to Register with RM
		 ApplicationMaster host: N/A
		 ApplicationMaster RPC port: -1
		 queue: root.users.dolphinscheduler
		 start time: 1660110442880
		 final status: UNDEFINED
		 tracking URL: http://slbdprimary2:8088/proxy/application_1657523889744_0961/
		 user: dolphinscheduler
[INFO] 2022-08-10 13:47:25.332 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:24 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:25 INFO yarn.Client: Application report for application_1657523889744_0961 (state: ACCEPTED)
[INFO] 2022-08-10 13:47:26.333 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:25 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:25 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> slbdprimary1,slbdprimary2, PROXY_URI_BASES -> http://slbdprimary1:8088/proxy/application_1657523889744_0961,http://slbdprimary2:8088/proxy/application_1657523889744_0961, RM_HA_URLS -> slbdprimary1:8088,slbdprimary2:8088), /proxy/application_1657523889744_0961
	22/08/10 13:47:25 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
	22/08/10 13:47:26 INFO yarn.Client: Application report for application_1657523889744_0961 (state: RUNNING)
	22/08/10 13:47:26 INFO yarn.Client: 
		 client token: N/A
		 diagnostics: N/A
		 ApplicationMaster host: 10.97.1.228
		 ApplicationMaster RPC port: -1
		 queue: root.users.dolphinscheduler
		 start time: 1660110442880
		 final status: UNDEFINED
		 tracking URL: http://slbdprimary2:8088/proxy/application_1657523889744_0961/
		 user: dolphinscheduler
	22/08/10 13:47:26 INFO cluster.YarnClientSchedulerBackend: Application application_1657523889744_0961 has started running.
	22/08/10 13:47:26 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1657523889744_0961 and attemptId None
	22/08/10 13:47:26 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45789.
	22/08/10 13:47:26 INFO netty.NettyBlockTransferService: Server created on slbdcompute3:45789
	22/08/10 13:47:26 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
	22/08/10 13:47:26 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
	22/08/10 13:47:26 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, slbdcompute3, 45789, None)
	22/08/10 13:47:26 INFO storage.BlockManagerMasterEndpoint: Registering block manager slbdcompute3:45789 with 93.3 MB RAM, BlockManagerId(driver, slbdcompute3, 45789, None)
	22/08/10 13:47:26 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, slbdcompute3, 45789, None)
	22/08/10 13:47:26 INFO storage.BlockManager: external shuffle service port = 7337
	22/08/10 13:47:26 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, slbdcompute3, 45789, None)
	22/08/10 13:47:26 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
	22/08/10 13:47:26 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6824b913{/metrics/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:26 INFO scheduler.EventLoggingListener: Logging events to hdfs://haNameservice/user/spark/applicationHistory/application_1657523889744_0961
	22/08/10 13:47:26 INFO util.Utils: Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
[INFO] 2022-08-10 13:47:27.334 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:26 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
	22/08/10 13:47:26 INFO util.Utils: Extension com.cloudera.spark.lineage.NavigatorAppListener not being initialized.
	22/08/10 13:47:26 INFO logging.DriverLogger$DfsAsyncWriter: Started driver log file sync to: /user/spark/driverLogs/application_1657523889744_0961_driver.log
	22/08/10 13:47:26 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
[INFO] 2022-08-10 13:47:28.335 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:27 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
[INFO] 2022-08-10 13:47:29.336 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:28 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:29 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.97.1.229:59972) with ID 1
	22/08/10 13:47:29 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
	22/08/10 13:47:29 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.97.1.229:59970) with ID 2
	22/08/10 13:47:29 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
	22/08/10 13:47:29 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
	22/08/10 13:47:29 INFO storage.BlockManagerMasterEndpoint: Registering block manager slbdcompute3:45518 with 912.3 MB RAM, BlockManagerId(1, slbdcompute3, 45518, None)
	22/08/10 13:47:29 INFO storage.BlockManagerMasterEndpoint: Registering block manager slbdcompute3:37888 with 912.3 MB RAM, BlockManagerId(2, slbdcompute3, 37888, None)
[INFO] 2022-08-10 13:47:30.337 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:29 INFO internal.SharedState: loading hive config file: file:/etc/hive/conf.cloudera.hive/hive-site.xml
	22/08/10 13:47:29 INFO internal.SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir ('/user/hive/warehouse').
	22/08/10 13:47:29 INFO internal.SharedState: Warehouse path is '/user/hive/warehouse'.
	22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL.
	22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@180b3819{/SQL,null,AVAILABLE,@Spark}
	22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json.
	22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@47272cd3{/SQL/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution.
	22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@707ca986{/SQL/execution,null,AVAILABLE,@Spark}
	22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
	22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@183ade54{/SQL/execution/json,null,AVAILABLE,@Spark}
	22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
	22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5e26f1ed{/static/sql,null,AVAILABLE,@Spark}
	22/08/10 13:47:29 INFO state.StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
	22/08/10 13:47:29 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
	22/08/10 13:47:29 INFO util.Utils: Extension com.cloudera.spark.lineage.NavigatorQueryListener not being initialized.
[INFO] 2022-08-10 13:47:34.339 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:33 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
	22/08/10 13:47:33 INFO hive.HiveUtils: Initializing HiveMetastoreConnection version 2.1 using Spark classes.
	22/08/10 13:47:33 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
	22/08/10 13:47:33 INFO session.SessionState: Created HDFS directory: /tmp/hive/dolphinscheduler/8f5ab21b-92bf-47e1-9a19-3e956c8d437e
	22/08/10 13:47:33 INFO session.SessionState: Created local directory: /tmp/dolphinscheduler/8f5ab21b-92bf-47e1-9a19-3e956c8d437e
	22/08/10 13:47:33 INFO session.SessionState: Created HDFS directory: /tmp/hive/dolphinscheduler/8f5ab21b-92bf-47e1-9a19-3e956c8d437e/_tmp_space.db
	22/08/10 13:47:33 INFO client.HiveClientImpl: Warehouse location for Hive client (version 2.1.1) is /user/hive/warehouse
	22/08/10 13:47:34 INFO hive.metastore: HMS client filtering is enabled.
	22/08/10 13:47:34 INFO hive.metastore: Trying to connect to metastore with URI thrift://slbdprimary1:9083
	22/08/10 13:47:34 INFO hive.metastore: Opened a connection to metastore, current connections: 1
	22/08/10 13:47:34 INFO hive.metastore: Connected to metastore.
[INFO] 2022-08-10 13:47:35.340 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:35 INFO codegen.CodeGenerator: Code generated in 170.213585 ms
	22/08/10 13:47:35 INFO codegen.CodeGenerator: Code generated in 9.4946 ms
	22/08/10 13:47:35 INFO spark.SparkContext: Starting job: save at JdbcWriter.java:85
[INFO] 2022-08-10 13:47:36.341 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:35 INFO scheduler.DAGScheduler: Registering RDD 2 (save at JdbcWriter.java:85)
	22/08/10 13:47:35 INFO spark.ContextCleaner: Cleaned accumulator 0
	22/08/10 13:47:35 INFO scheduler.DAGScheduler: Got job 0 (save at JdbcWriter.java:85) with 1 output partitions
	22/08/10 13:47:35 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (save at JdbcWriter.java:85)
	22/08/10 13:47:35 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
	22/08/10 13:47:35 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)
	22/08/10 13:47:35 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at save at JdbcWriter.java:85), which has no missing parents
	22/08/10 13:47:35 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 9.4 KB, free 93.3 MB)
	22/08/10 13:47:35 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.1 KB, free 93.3 MB)
	22/08/10 13:47:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on slbdcompute3:45789 (size: 5.1 KB, free: 93.3 MB)
	22/08/10 13:47:35 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1164
	22/08/10 13:47:35 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
	22/08/10 13:47:35 INFO cluster.YarnScheduler: Adding task set 0.0 with 1 tasks
	22/08/10 13:47:35 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:35 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, slbdcompute3, executor 1, partition 0, PROCESS_LOCAL, 7690 bytes)
	22/08/10 13:47:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on slbdcompute3:45518 (size: 5.1 KB, free: 912.3 MB)
[INFO] 2022-08-10 13:47:38.342 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:37 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2315 ms on slbdcompute3 (executor 1) (1/1)
	22/08/10 13:47:37 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
	22/08/10 13:47:37 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (save at JdbcWriter.java:85) finished in 2.582 s
	22/08/10 13:47:37 INFO scheduler.DAGScheduler: looking for newly runnable stages
	22/08/10 13:47:37 INFO scheduler.DAGScheduler: running: Set()
	22/08/10 13:47:37 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)
	22/08/10 13:47:37 INFO scheduler.DAGScheduler: failed: Set()
	22/08/10 13:47:37 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[6] at save at JdbcWriter.java:85), which has no missing parents
	22/08/10 13:47:38 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 22.2 KB, free 93.3 MB)
	22/08/10 13:47:38 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 10.6 KB, free 93.3 MB)
	22/08/10 13:47:38 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on slbdcompute3:45789 (size: 10.6 KB, free: 93.3 MB)
	22/08/10 13:47:38 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1164
	22/08/10 13:47:38 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[6] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
	22/08/10 13:47:38 INFO cluster.YarnScheduler: Adding task set 1.0 with 1 tasks
	22/08/10 13:47:38 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, slbdcompute3, executor 2, partition 0, NODE_LOCAL, 7778 bytes)
	22/08/10 13:47:38 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on slbdcompute3:37888 (size: 10.6 KB, free: 912.3 MB)
[INFO] 2022-08-10 13:47:39.343 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:38 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.97.1.229:59970
[INFO] 2022-08-10 13:47:40.344 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:40 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 2222 ms on slbdcompute3 (executor 2) (1/1)
	22/08/10 13:47:40 INFO cluster.YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: ResultStage 1 (save at JdbcWriter.java:85) finished in 2.271 s
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 27
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 35
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 36
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 40
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 31
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 34
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 16
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 26
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 20
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 17
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 25
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 21
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 19
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 15
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 37
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 30
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 24
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 28
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Job 0 finished: save at JdbcWriter.java:85, took 4.922278 s
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 33
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 29
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 32
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 18
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 22
	22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on slbdcompute3:45789 in memory (size: 5.1 KB, free: 93.3 MB)
	22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on slbdcompute3:45518 in memory (size: 5.1 KB, free: 912.3 MB)
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 23
	22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on slbdcompute3:45789 in memory (size: 10.6 KB, free: 93.3 MB)
	22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on slbdcompute3:37888 in memory (size: 10.6 KB, free: 912.3 MB)
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 38
	22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 39
	22/08/10 13:47:40 INFO codegen.CodeGenerator: Code generated in 13.288355 ms
[INFO] 2022-08-10 13:47:41.345 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:40 INFO spark.SparkContext: Starting job: save at JdbcWriter.java:85
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Registering RDD 10 (save at JdbcWriter.java:85)
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Got job 1 (save at JdbcWriter.java:85) with 1 output partitions
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (save at JdbcWriter.java:85)
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 2)
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 2)
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[10] at save at JdbcWriter.java:85), which has no missing parents
	22/08/10 13:47:40 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 9.4 KB, free 93.3 MB)
	22/08/10 13:47:40 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 5.1 KB, free 93.3 MB)
	22/08/10 13:47:40 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on slbdcompute3:45789 (size: 5.1 KB, free: 93.3 MB)
	22/08/10 13:47:40 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1164
	22/08/10 13:47:40 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[10] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
	22/08/10 13:47:40 INFO cluster.YarnScheduler: Adding task set 2.0 with 1 tasks
	22/08/10 13:47:40 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
	22/08/10 13:47:40 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, slbdcompute3, executor 1, partition 0, PROCESS_LOCAL, 7690 bytes)
	22/08/10 13:47:40 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on slbdcompute3:45518 (size: 5.1 KB, free: 912.3 MB)
[INFO] 2022-08-10 13:47:42.346 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] -  -> 22/08/10 13:47:41 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 1512 ms on slbdcompute3 (executor 1) (1/1)
	22/08/10 13:47:41 INFO cluster.YarnScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: ShuffleMapStage 2 (save at JdbcWriter.java:85) finished in 1.523 s
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: looking for newly runnable stages
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: running: Set()
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 3)
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: failed: Set()
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[14] at save at JdbcWriter.java:85), which has no missing parents
	22/08/10 13:47:41 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 21.1 KB, free 93.3 MB)
	22/08/10 13:47:41 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 10.2 KB, free 93.3 MB)
	22/08/10 13:47:41 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on slbdcompute3:45789 (size: 10.2 KB, free: 93.3 MB)
	22/08/10 13:47:41 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1164
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[14] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
	22/08/10 13:47:41 INFO cluster.YarnScheduler: Adding task set 3.0 with 1 tasks
	22/08/10 13:47:41 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, slbdcompute3, executor 2, partition 0, NODE_LOCAL, 7778 bytes)
	22/08/10 13:47:41 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on slbdcompute3:37888 (size: 10.2 KB, free: 912.3 MB)
	22/08/10 13:47:41 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.97.1.229:59970
	22/08/10 13:47:41 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 3) in 90 ms on slbdcompute3 (executor 2) (1/1)
	22/08/10 13:47:41 INFO cluster.YarnScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool 
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: ResultStage 3 (save at JdbcWriter.java:85) finished in 0.105 s
	22/08/10 13:47:41 INFO scheduler.DAGScheduler: Job 1 finished: save at JdbcWriter.java:85, took 1.631793 s
	22/08/10 13:47:42 INFO server.AbstractConnector: Stopped Spark@352e612e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
	22/08/10 13:47:42 INFO ui.SparkUI: Stopped Spark web UI at http://slbdcompute3:4040
	22/08/10 13:47:42 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
	22/08/10 13:47:42 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
	22/08/10 13:47:42 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
	22/08/10 13:47:42 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
	(serviceOption=None,
	 services=List(),
	 started=false)
	22/08/10 13:47:42 INFO cluster.YarnClientSchedulerBackend: Stopped
	22/08/10 13:47:42 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
	22/08/10 13:47:42 INFO memory.MemoryStore: MemoryStore cleared
	22/08/10 13:47:42 INFO storage.BlockManager: BlockManager stopped
	22/08/10 13:47:42 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
	22/08/10 13:47:42 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
	22/08/10 13:47:42 INFO spark.SparkContext: Successfully stopped SparkContext
	22/08/10 13:47:42 INFO util.ShutdownHookManager: Shutdown hook called
	22/08/10 13:47:42 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-8a540c14-7e08-499f-9502-cd9c66145346
	22/08/10 13:47:42 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7b6d5178-43bb-4647-88d9-fc67b100d784
[INFO] 2022-08-10 13:47:42.469 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1657523889744_0961
[INFO] 2022-08-10 13:47:42.469 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_7/781/1944, processId:19801 ,exitStatusCode:0 ,processWaitForStatus:true ,processExitValue:0
[INFO] 2022-08-10 13:47:43.346 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION

Task Result snapshot

image

HDFS snapshot

image

Chris-Arith avatar Aug 10 '22 06:08 Chris-Arith

dsdq2

It seems this is by design: "table_count_check" doesn't output error data. Comparing it with "null_check", I found the difference comes from the value of "errorOutputSql": the "errorOutputSql" variable is "false" when the "table_count_check" rule is used and "true" when the "null_check" rule is used.

dsdq
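
For reference, here is a minimal runnable sketch of that gating behavior. The class and field names below are illustrative only, not the actual DolphinScheduler source; it just demonstrates the effect described above:

```java
// Illustrative sketch only -- names are hypothetical, not the real
// DolphinScheduler classes. A rule only gets an error-data writer
// when its errorOutputSql flag is true.
public class ErrorOutputSketch {

    static class Rule {
        final String name;
        final boolean errorOutputSql; // false for table_count_check, true for null_check

        Rule(String name, boolean errorOutputSql) {
            this.name = name;
            this.errorOutputSql = errorOutputSql;
        }
    }

    /** Decide whether the task should append an error-data writer. */
    static boolean shouldWriteErrorData(Rule rule) {
        return rule.errorOutputSql;
    }

    public static void main(String[] args) {
        Rule tableCountCheck = new Rule("table_count_check", false);
        Rule nullCheck = new Rule("null_check", true);

        // table_count_check -> no error output file is ever written,
        // even though the result row still records an error_output_path
        System.out.println(tableCountCheck.name + " writes error data: "
                + shouldWriteErrorData(tableCountCheck)); // false
        System.out.println(nullCheck.name + " writes error data: "
                + shouldWriteErrorData(nullCheck));       // true
    }
}
```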

You can try to use the "null_check" rule, and then compare the execution commands in the log, where the "writers" list will have an output type of "hdfs_file".

hyjunhyj avatar Aug 10 '22 11:08 hyjunhyj

Thanks, it works. I think it would be more user-friendly to leave the Error Output Path blank when there is no output file.

Chris-Arith avatar Aug 11 '22 03:08 Chris-Arith

You're welcome. I agree with you; maybe you can create a new issue to discuss this.

hyjunhyj avatar Aug 12 '22 10:08 hyjunhyj

@SbloodyS Should we keep the Error Output Path empty when the task uses rules like table_count_check, to eliminate the misunderstanding?

Chris-Arith avatar Aug 15 '22 01:08 Chris-Arith

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] avatar Oct 06 '22 00:10 github-actions[bot]