[Bug] [Data Quality] `Error Output Path` isn't created on HDFS
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
When we checked the task result after successfully executing the data quality task, we found that the `Error Output Path`
has a value, but this path can't be found on HDFS.
So, how can I find the output path on HDFS?
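For reference, this is roughly how we checked for the directory (standard HDFS CLI; the path is the one from our configuration):

```shell
# Look for the configured error output directory on HDFS
hdfs dfs -ls hdfs://haNameservice:8020/tmp/data-quality-error-data
# On our cluster this reports that the directory does not exist
```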
common.properties is configured as below:
- `hdfs` is the root user of our HDFS system
- `dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar` is packaged from the 3.0.0-beta-2 source code and put into each server's `libs` directory
- `data-quality.error.output.path` is set to `/tmp/data-quality-error-data`
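For completeness, a minimal sketch of the relevant common.properties entries (only `data-quality.error.output.path` is quoted from our setup; the other keys and values are illustrative and may differ per deployment):

```properties
# common.properties (relevant entries, values illustrative)
resource.storage.type=HDFS
fs.defaultFS=hdfs://haNameservice:8020
hdfs.root.user=hdfs
data-quality.error.output.path=/tmp/data-quality-error-data
```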
What you expected to happen
The error output path can be found on HDFS.
How to reproduce
As described above.
Anything else
nil
Version
3.0.0-beta-2
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Thank you for your feedback; we have received your issue. Please wait patiently for a reply.
- In order for us to understand your request as soon as possible, please provide detailed information, version, or pictures.
- If you haven't received a reply for a long time, you can join our Slack and send your question to the channel #troubleshooting
Please provide the following task execution log, thanks.
Task execution log as below:
[INFO] 2022-08-08 14:51:08.638 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[83] - data quality task params {"localParams":[],"resourceList":[],"ruleId":10,"ruleInputParameter":{"check_type":"1","comparison_type":1,"comparison_name":"0","failure_strategy":"0","operator":"3","src_connector_type":5,"src_datasource_id":11,"src_field":null,"src_table":"BW_BI0_TSTOR_LOC","threshold":"0"},"sparkParameters":{"deployMode":"cluster","driverCores":1,"driverMemory":"512M","executorCores":2,"executorMemory":"2G","numExecutors":2,"others":"--conf spark.yarn.maxAppAttempts=1"}}
[INFO] 2022-08-08 14:51:08.694 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - data quality task command: ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,775 as process_instance_id,1896 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test' as error_output_path,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1896 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-08 14:51:08' as data_time,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count\"} }]}"
[INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[85] - tenantCode user:dolphinscheduler, task dir:775_1896
[INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[90] - create command file:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896/775_1896.command
[INFO] 2022-08-08 14:51:08.696 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[116] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode cluster --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,775 as process_instance_id,1896 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test' as error_output_path,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1896 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-08 14:51:08' as data_time,'2022-08-08 14:51:08' as create_time,'2022-08-08 14:51:08' as update_time from table_count\"} }]}"
[INFO] 2022-08-08 14:51:08.704 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[290] - task run command: sudo -u dolphinscheduler sh /tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896/775_1896.command
[INFO] 2022-08-08 14:51:08.706 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - process start, process id is: 18551
[INFO] 2022-08-08 14:51:09.706 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
WARNING: Running spark-class from user-defined location.
[INFO] 2022-08-08 14:51:10.707 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:10 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm124
22/08/08 14:51:10 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
22/08/08 14:51:10 INFO conf.Configuration: resource-types.xml not found
22/08/08 14:51:10 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
22/08/08 14:51:10 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (61440 MB per container)
22/08/08 14:51:10 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
22/08/08 14:51:10 INFO yarn.Client: Setting up container launch context for our AM
22/08/08 14:51:10 INFO yarn.Client: Setting up the launch environment for our AM container
22/08/08 14:51:10 INFO yarn.Client: Preparing resources for our AM container
22/08/08 14:51:10 INFO yarn.Client: Uploading resource file:/home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0915/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar
[INFO] 2022-08-08 14:51:11.708 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:11 INFO yarn.Client: Uploading resource file:/tmp/spark-53fcb4bc-b0b4-4495-93ed-ff43dbbf670a/__spark_conf__1110115918348202718.zip -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0915/__spark_conf__.zip
22/08/08 14:51:11 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
22/08/08 14:51:11 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
22/08/08 14:51:11 INFO spark.SecurityManager: Changing view acls groups to:
22/08/08 14:51:11 INFO spark.SecurityManager: Changing modify acls groups to:
22/08/08 14:51:11 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
22/08/08 14:51:11 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
22/08/08 14:51:11 INFO security.YARNHadoopDelegationTokenManager: Attempting to load user's ticket cache.
22/08/08 14:51:11 INFO yarn.Client: Submitting application application_1657523889744_0915 to ResourceManager
[INFO] 2022-08-08 14:51:12.709 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:11 INFO impl.YarnClientImpl: Submitted application application_1657523889744_0915
[INFO] 2022-08-08 14:51:13.710 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:12 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
22/08/08 14:51:12 INFO yarn.Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.dolphinscheduler
start time: 1659941471620
final status: UNDEFINED
tracking URL: http://host:8088/proxy/application_1657523889744_0915/
user: dolphinscheduler
[INFO] 2022-08-08 14:51:14.711 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:13 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
[INFO] 2022-08-08 14:51:15.712 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:14 INFO yarn.Client: Application report for application_1657523889744_0915 (state: ACCEPTED)
[INFO] 2022-08-08 14:51:16.713 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:15 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
22/08/08 14:51:15 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: slbdcompute2
ApplicationMaster RPC port: 38184
queue: root.users.dolphinscheduler
start time: 1659941471620
final status: UNDEFINED
tracking URL: http://host:8088/proxy/application_1657523889744_0915/
user: dolphinscheduler
[INFO] 2022-08-08 14:51:17.714 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:16 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:18.715 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:17 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:19.716 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:18 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:20.717 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:19 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:21.718 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:20 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:22.719 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:21 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:23.720 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:22 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:24.721 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:23 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:25.722 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:24 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:26.723 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:25 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:27.724 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:26 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:28.725 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:27 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:29.726 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:28 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:30.727 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:29 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:31.728 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:30 INFO yarn.Client: Application report for application_1657523889744_0915 (state: RUNNING)
[INFO] 2022-08-08 14:51:32.302 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1657523889744_0915
[INFO] 2022-08-08 14:51:32.302 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_4/775/1896, processId:18551 ,exitStatusCode:0 ,processWaitForStatus:true ,processExitValue:0
[INFO] 2022-08-08 14:51:32.729 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/08 14:51:31 INFO yarn.Client: Application report for application_1657523889744_0915 (state: FINISHED)
22/08/08 14:51:31 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: slbdcompute2
ApplicationMaster RPC port: 38184
queue: root.users.dolphinscheduler
start time: 1659941471620
final status: SUCCEEDED
tracking URL: http://host:8088/proxy/application_1657523889744_0915/
user: dolphinscheduler
22/08/08 14:51:31 INFO util.ShutdownHookManager: Shutdown hook called
22/08/08 14:51:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-53fcb4bc-b0b4-4495-93ed-ff43dbbf670a
22/08/08 14:51:31 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-79b34451-e157-41e4-9a2a-b9fad415244c
[INFO] 2022-08-08 14:51:32.730 +0800 [taskAppId=TASK-20220808-6277368089120_4-775-1896] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION
You are running in yarn-cluster mode; you need to go to the Spark task tracking URL to see the log, or you can change to yarn-client mode to run.
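For anyone debugging this, a minimal way to pull the driver log of the finished job in yarn-cluster mode (standard YARN CLI, assuming log aggregation is enabled; the application id is taken from the task log above):

```shell
# Fetch the aggregated logs of the data quality Spark application
yarn logs -applicationId application_1657523889744_0915
```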
Regarding yarn-client mode, do you mean changing the settings here?
I tried client and local mode; neither creates the error output file in HDFS either.
Please check the logs below.
[INFO] 2022-08-10 13:47:19.317 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[83] - data quality task params {"localParams":[],"resourceList":[],"ruleId":10,"ruleInputParameter":{"check_type":"1","comparison_type":1,"comparison_name":"0","failure_strategy":"0","operator":"3","src_connector_type":5,"src_datasource_id":11,"src_field":null,"src_table":"BW_BI0_TSTOR_LOC","threshold":"0"},"sparkParameters":{"deployMode":"client","driverCores":1,"driverMemory":"512M","executorCores":2,"executorMemory":"2G","numExecutors":2,"others":"--conf spark.yarn.maxAppAttempts=1"}}
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - data quality task command: ${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode client --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,781 as process_instance_id,1944 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_781_chris_data_quality_test' as error_output_path,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1944 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-10 13:47:19' as data_time,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count\"} }]}"
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[85] - tenantCode user:dolphinscheduler, task dir:781_1944
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[90] - create command file:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_7/781/1944/781_1944.command
[INFO] 2022-08-10 13:47:19.323 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[116] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source /home/dolphinscheduler/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh
${SPARK_HOME2}/bin/spark-submit --master yarn --deploy-mode client --driver-cores 1 --driver-memory 512M --num-executors 2 --executor-cores 2 --executor-memory 2G --queue default --conf spark.yarn.maxAppAttempts=1 /home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar "{\"name\":\"$t(table_count_check)\",\"env\":{\"type\":\"batch\",\"config\":null},\"readers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"BDDB\",\"password\":\"*************\",\"driver\":\"oracle.jdbc.OracleDriver\",\"user\":\"BDDB\",\"output_table\":\"BDDB_BW_BI0_TSTOR_LOC\",\"table\":\"BW_BI0_TSTOR_LOC\",\"url\":\"jdbc:oracle:thin:@//10.97.1.230:1521/BDDB\"} }],\"transformers\":[{\"type\":\"sql\",\"config\":{\"index\":1,\"output_table\":\"table_count\",\"sql\":\"SELECT COUNT(*) AS total FROM BDDB_BW_BI0_TSTOR_LOC \"} }],\"writers\":[{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_execute_result\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as rule_type,'$t(table_count_check)' as rule_name,0 as process_definition_id,781 as process_instance_id,1944 as task_instance_id,table_count.total AS statistics_value,0 AS comparison_value,1 AS comparison_type,1 as check_type,0 as threshold,3 as operator,0 as failure_strategy,'hdfs://haNameservice:8020/tmp/data-quality-error-data/0_781_chris_data_quality_test' as error_output_path,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count \"} },{\"type\":\"JDBC\",\"config\":{\"database\":\"sl_ds\",\"password\":\"*************\",\"driver\":\"com.mysql.cj.jdbc.Driver\",\"user\":\"root\",\"table\":\"t_ds_dq_task_statistics_value\",\"url\":\"jdbc:mysql://10.97.1.225:3306/sl_ds?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowLoadLocalInfile=false&autoDeserialize=false&allowLocalInfile=false&allowUrlInLocalInfile=false\",\"sql\":\"select 0 as process_definition_id,1944 as task_instance_id,10 as rule_id,'I+PSCKKFG0Y7KVBI3J8DFQ1CVEDLPYJBINDXQERK7AU=' as unique_code,'table_count.total'AS statistics_name,table_count.total AS statistics_value,'2022-08-10 13:47:19' as data_time,'2022-08-10 13:47:19' as create_time,'2022-08-10 13:47:19' as update_time from table_count\"} }]}"
[INFO] 2022-08-10 13:47:19.325 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[290] - task run command: sudo -u dolphinscheduler sh /tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_7/781/1944/781_1944.command
[INFO] 2022-08-10 13:47:19.326 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[181] - process start, process id is: 19801
[INFO] 2022-08-10 13:47:20.327 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
WARNING: Running spark-class from user-defined location.
[INFO] 2022-08-10 13:47:21.328 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:20 INFO spark.SparkContext: Running Spark version 2.4.0-cdh6.3.2
22/08/10 13:47:20 INFO logging.DriverLogger: Added a local log appender at: /tmp/spark-8a540c14-7e08-499f-9502-cd9c66145346/__driver_logs__/driver.log
22/08/10 13:47:20 INFO spark.SparkContext: Submitted application: (table_count_check)
22/08/10 13:47:20 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
22/08/10 13:47:20 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
22/08/10 13:47:20 INFO spark.SecurityManager: Changing view acls groups to:
22/08/10 13:47:20 INFO spark.SecurityManager: Changing modify acls groups to:
22/08/10 13:47:20 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
22/08/10 13:47:21 INFO util.Utils: Successfully started service 'sparkDriver' on port 42815.
22/08/10 13:47:21 INFO spark.SparkEnv: Registering MapOutputTracker
22/08/10 13:47:21 INFO spark.SparkEnv: Registering BlockManagerMaster
22/08/10 13:47:21 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/08/10 13:47:21 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/08/10 13:47:21 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-9ca3469c-4251-4540-981a-3a13ef674c63
22/08/10 13:47:21 INFO memory.MemoryStore: MemoryStore started with capacity 93.3 MB
22/08/10 13:47:21 INFO spark.SparkEnv: Registering OutputCommitCoordinator
22/08/10 13:47:21 INFO util.log: Logging initialized @1647ms
22/08/10 13:47:21 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: 2018-09-05T05:11:46+08:00, git hash: 3ce520221d0240229c862b122d2b06c12a625732
22/08/10 13:47:21 INFO server.Server: Started @1714ms
22/08/10 13:47:21 INFO server.AbstractConnector: Started ServerConnector@352e612e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
22/08/10 13:47:21 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f6bcf87{/jobs,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e92c6ad{/jobs/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2fb5fe30{/jobs/job,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5baaae4c{/jobs/job/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5b6e8f77{/stages,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@41a6d121{/stages/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4f449e8f{/stages/stage,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@19f040ba{/stages/stage/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@72ab05ed{/stages/pool,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@27e32fe4{/stages/pool/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@c3c4c1c{/storage,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@17d238b1{/storage/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d7cc3cb{/storage/rdd,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@35e478f{/storage/rdd/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d6cb754{/environment,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6b7d1df8{/environment/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3044e9c7{/executors,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@41d7b27f{/executors/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@49096b06{/executors/threadDump,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a183d02{/executors/threadDump/json,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5d05ef57{/static,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@34237b90{/,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1d01dfa5{/api,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@31ff1390{/jobs/job/kill,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@759d81f3{/stages/stage/kill,null,AVAILABLE,@Spark}
22/08/10 13:47:21 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://slbdcompute3:4040
[INFO] 2022-08-10 13:47:22.329 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:21 INFO spark.SparkContext: Added JAR file:/home/dolphinscheduler/dolphinscheduler/worker-server/libs/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar at spark://slbdcompute3:42815/jars/dolphinscheduler-data-quality-3.0.0-beta-3-SNAPSHOT.jar with timestamp 1660110441358
22/08/10 13:47:21 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:21 INFO util.Utils: Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
22/08/10 13:47:22 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm124
22/08/10 13:47:22 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
22/08/10 13:47:22 INFO conf.Configuration: resource-types.xml not found
22/08/10 13:47:22 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
22/08/10 13:47:22 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (61440 MB per container)
22/08/10 13:47:22 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
22/08/10 13:47:22 INFO yarn.Client: Setting up container launch context for our AM
22/08/10 13:47:22 INFO yarn.Client: Setting up the launch environment for our AM container
22/08/10 13:47:22 INFO yarn.Client: Preparing resources for our AM container
[INFO] 2022-08-10 13:47:23.330 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:22 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:22 INFO yarn.Client: Uploading resource file:/tmp/spark-8a540c14-7e08-499f-9502-cd9c66145346/__spark_conf__1935499196749650246.zip -> hdfs://haNameservice/user/dolphinscheduler/.sparkStaging/application_1657523889744_0961/__spark_conf__.zip
22/08/10 13:47:22 INFO spark.SecurityManager: Changing view acls to: dolphinscheduler
22/08/10 13:47:22 INFO spark.SecurityManager: Changing modify acls to: dolphinscheduler
22/08/10 13:47:22 INFO spark.SecurityManager: Changing view acls groups to:
22/08/10 13:47:22 INFO spark.SecurityManager: Changing modify acls groups to:
22/08/10 13:47:22 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dolphinscheduler); groups with view permissions: Set(); users with modify permissions: Set(dolphinscheduler); groups with modify permissions: Set()
22/08/10 13:47:22 INFO yarn.Client: Submitting application application_1657523889744_0961 to ResourceManager
22/08/10 13:47:23 INFO impl.YarnClientImpl: Submitted application application_1657523889744_0961
[INFO] 2022-08-10 13:47:24.331 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:23 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:24 INFO yarn.Client: Application report for application_1657523889744_0961 (state: ACCEPTED)
22/08/10 13:47:24 INFO yarn.Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.dolphinscheduler
start time: 1660110442880
final status: UNDEFINED
tracking URL: http://slbdprimary2:8088/proxy/application_1657523889744_0961/
user: dolphinscheduler
[INFO] 2022-08-10 13:47:25.332 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:24 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:25 INFO yarn.Client: Application report for application_1657523889744_0961 (state: ACCEPTED)
[INFO] 2022-08-10 13:47:26.333 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:25 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:25 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> slbdprimary1,slbdprimary2, PROXY_URI_BASES -> http://slbdprimary1:8088/proxy/application_1657523889744_0961,http://slbdprimary2:8088/proxy/application_1657523889744_0961, RM_HA_URLS -> slbdprimary1:8088,slbdprimary2:8088), /proxy/application_1657523889744_0961
22/08/10 13:47:25 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
22/08/10 13:47:26 INFO yarn.Client: Application report for application_1657523889744_0961 (state: RUNNING)
22/08/10 13:47:26 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.97.1.228
ApplicationMaster RPC port: -1
queue: root.users.dolphinscheduler
start time: 1660110442880
final status: UNDEFINED
tracking URL: http://slbdprimary2:8088/proxy/application_1657523889744_0961/
user: dolphinscheduler
22/08/10 13:47:26 INFO cluster.YarnClientSchedulerBackend: Application application_1657523889744_0961 has started running.
22/08/10 13:47:26 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1657523889744_0961 and attemptId None
22/08/10 13:47:26 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45789.
22/08/10 13:47:26 INFO netty.NettyBlockTransferService: Server created on slbdcompute3:45789
22/08/10 13:47:26 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/08/10 13:47:26 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
22/08/10 13:47:26 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, slbdcompute3, 45789, None)
22/08/10 13:47:26 INFO storage.BlockManagerMasterEndpoint: Registering block manager slbdcompute3:45789 with 93.3 MB RAM, BlockManagerId(driver, slbdcompute3, 45789, None)
22/08/10 13:47:26 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, slbdcompute3, 45789, None)
22/08/10 13:47:26 INFO storage.BlockManager: external shuffle service port = 7337
22/08/10 13:47:26 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, slbdcompute3, 45789, None)
22/08/10 13:47:26 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
22/08/10 13:47:26 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6824b913{/metrics/json,null,AVAILABLE,@Spark}
22/08/10 13:47:26 INFO scheduler.EventLoggingListener: Logging events to hdfs://haNameservice/user/spark/applicationHistory/application_1657523889744_0961
22/08/10 13:47:26 INFO util.Utils: Using initial executors = 2, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
[INFO] 2022-08-10 13:47:27.334 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:26 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
22/08/10 13:47:26 INFO util.Utils: Extension com.cloudera.spark.lineage.NavigatorAppListener not being initialized.
22/08/10 13:47:26 INFO logging.DriverLogger$DfsAsyncWriter: Started driver log file sync to: /user/spark/driverLogs/application_1657523889744_0961_driver.log
22/08/10 13:47:26 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
[INFO] 2022-08-10 13:47:28.335 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:27 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
[INFO] 2022-08-10 13:47:29.336 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:28 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:29 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.97.1.229:59972) with ID 1
22/08/10 13:47:29 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
22/08/10 13:47:29 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.97.1.229:59970) with ID 2
22/08/10 13:47:29 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
22/08/10 13:47:29 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
22/08/10 13:47:29 INFO storage.BlockManagerMasterEndpoint: Registering block manager slbdcompute3:45518 with 912.3 MB RAM, BlockManagerId(1, slbdcompute3, 45518, None)
22/08/10 13:47:29 INFO storage.BlockManagerMasterEndpoint: Registering block manager slbdcompute3:37888 with 912.3 MB RAM, BlockManagerId(2, slbdcompute3, 37888, None)
[INFO] 2022-08-10 13:47:30.337 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:29 INFO internal.SharedState: loading hive config file: file:/etc/hive/conf.cloudera.hive/hive-site.xml
22/08/10 13:47:29 INFO internal.SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir ('/user/hive/warehouse').
22/08/10 13:47:29 INFO internal.SharedState: Warehouse path is '/user/hive/warehouse'.
22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL.
22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@180b3819{/SQL,null,AVAILABLE,@Spark}
22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json.
22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@47272cd3{/SQL/json,null,AVAILABLE,@Spark}
22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution.
22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@707ca986{/SQL/execution,null,AVAILABLE,@Spark}
22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@183ade54{/SQL/execution/json,null,AVAILABLE,@Spark}
22/08/10 13:47:29 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
22/08/10 13:47:29 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5e26f1ed{/static/sql,null,AVAILABLE,@Spark}
22/08/10 13:47:29 INFO state.StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
22/08/10 13:47:29 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
22/08/10 13:47:29 INFO util.Utils: Extension com.cloudera.spark.lineage.NavigatorQueryListener not being initialized.
[INFO] 2022-08-10 13:47:34.339 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:33 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
22/08/10 13:47:33 INFO hive.HiveUtils: Initializing HiveMetastoreConnection version 2.1 using Spark classes.
22/08/10 13:47:33 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
22/08/10 13:47:33 INFO session.SessionState: Created HDFS directory: /tmp/hive/dolphinscheduler/8f5ab21b-92bf-47e1-9a19-3e956c8d437e
22/08/10 13:47:33 INFO session.SessionState: Created local directory: /tmp/dolphinscheduler/8f5ab21b-92bf-47e1-9a19-3e956c8d437e
22/08/10 13:47:33 INFO session.SessionState: Created HDFS directory: /tmp/hive/dolphinscheduler/8f5ab21b-92bf-47e1-9a19-3e956c8d437e/_tmp_space.db
22/08/10 13:47:33 INFO client.HiveClientImpl: Warehouse location for Hive client (version 2.1.1) is /user/hive/warehouse
22/08/10 13:47:34 INFO hive.metastore: HMS client filtering is enabled.
22/08/10 13:47:34 INFO hive.metastore: Trying to connect to metastore with URI thrift://slbdprimary1:9083
22/08/10 13:47:34 INFO hive.metastore: Opened a connection to metastore, current connections: 1
22/08/10 13:47:34 INFO hive.metastore: Connected to metastore.
[INFO] 2022-08-10 13:47:35.340 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:35 INFO codegen.CodeGenerator: Code generated in 170.213585 ms
22/08/10 13:47:35 INFO codegen.CodeGenerator: Code generated in 9.4946 ms
22/08/10 13:47:35 INFO spark.SparkContext: Starting job: save at JdbcWriter.java:85
[INFO] 2022-08-10 13:47:36.341 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:35 INFO scheduler.DAGScheduler: Registering RDD 2 (save at JdbcWriter.java:85)
22/08/10 13:47:35 INFO spark.ContextCleaner: Cleaned accumulator 0
22/08/10 13:47:35 INFO scheduler.DAGScheduler: Got job 0 (save at JdbcWriter.java:85) with 1 output partitions
22/08/10 13:47:35 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (save at JdbcWriter.java:85)
22/08/10 13:47:35 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
22/08/10 13:47:35 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)
22/08/10 13:47:35 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at save at JdbcWriter.java:85), which has no missing parents
22/08/10 13:47:35 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 9.4 KB, free 93.3 MB)
22/08/10 13:47:35 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.1 KB, free 93.3 MB)
22/08/10 13:47:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on slbdcompute3:45789 (size: 5.1 KB, free: 93.3 MB)
22/08/10 13:47:35 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1164
22/08/10 13:47:35 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
22/08/10 13:47:35 INFO cluster.YarnScheduler: Adding task set 0.0 with 1 tasks
22/08/10 13:47:35 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:35 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, slbdcompute3, executor 1, partition 0, PROCESS_LOCAL, 7690 bytes)
22/08/10 13:47:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on slbdcompute3:45518 (size: 5.1 KB, free: 912.3 MB)
[INFO] 2022-08-10 13:47:38.342 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:37 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2315 ms on slbdcompute3 (executor 1) (1/1)
22/08/10 13:47:37 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
22/08/10 13:47:37 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (save at JdbcWriter.java:85) finished in 2.582 s
22/08/10 13:47:37 INFO scheduler.DAGScheduler: looking for newly runnable stages
22/08/10 13:47:37 INFO scheduler.DAGScheduler: running: Set()
22/08/10 13:47:37 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)
22/08/10 13:47:37 INFO scheduler.DAGScheduler: failed: Set()
22/08/10 13:47:37 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[6] at save at JdbcWriter.java:85), which has no missing parents
22/08/10 13:47:38 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 22.2 KB, free 93.3 MB)
22/08/10 13:47:38 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 10.6 KB, free 93.3 MB)
22/08/10 13:47:38 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on slbdcompute3:45789 (size: 10.6 KB, free: 93.3 MB)
22/08/10 13:47:38 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1164
22/08/10 13:47:38 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[6] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
22/08/10 13:47:38 INFO cluster.YarnScheduler: Adding task set 1.0 with 1 tasks
22/08/10 13:47:38 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, slbdcompute3, executor 2, partition 0, NODE_LOCAL, 7778 bytes)
22/08/10 13:47:38 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on slbdcompute3:37888 (size: 10.6 KB, free: 912.3 MB)
[INFO] 2022-08-10 13:47:39.343 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:38 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.97.1.229:59970
[INFO] 2022-08-10 13:47:40.344 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:40 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 2222 ms on slbdcompute3 (executor 2) (1/1)
22/08/10 13:47:40 INFO cluster.YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
22/08/10 13:47:40 INFO scheduler.DAGScheduler: ResultStage 1 (save at JdbcWriter.java:85) finished in 2.271 s
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 27
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 35
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 36
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 40
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 31
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 34
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 16
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 26
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 20
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 17
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 25
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 21
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 19
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 15
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 37
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 30
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 24
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 28
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Job 0 finished: save at JdbcWriter.java:85, took 4.922278 s
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 33
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 29
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 32
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 18
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 22
22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on slbdcompute3:45789 in memory (size: 5.1 KB, free: 93.3 MB)
22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on slbdcompute3:45518 in memory (size: 5.1 KB, free: 912.3 MB)
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 23
22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on slbdcompute3:45789 in memory (size: 10.6 KB, free: 93.3 MB)
22/08/10 13:47:40 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on slbdcompute3:37888 in memory (size: 10.6 KB, free: 912.3 MB)
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 38
22/08/10 13:47:40 INFO spark.ContextCleaner: Cleaned accumulator 39
22/08/10 13:47:40 INFO codegen.CodeGenerator: Code generated in 13.288355 ms
[INFO] 2022-08-10 13:47:41.345 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:40 INFO spark.SparkContext: Starting job: save at JdbcWriter.java:85
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Registering RDD 10 (save at JdbcWriter.java:85)
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Got job 1 (save at JdbcWriter.java:85) with 1 output partitions
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (save at JdbcWriter.java:85)
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 2)
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 2)
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[10] at save at JdbcWriter.java:85), which has no missing parents
22/08/10 13:47:40 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 9.4 KB, free 93.3 MB)
22/08/10 13:47:40 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 5.1 KB, free 93.3 MB)
22/08/10 13:47:40 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on slbdcompute3:45789 (size: 5.1 KB, free: 93.3 MB)
22/08/10 13:47:40 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1164
22/08/10 13:47:40 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[10] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
22/08/10 13:47:40 INFO cluster.YarnScheduler: Adding task set 2.0 with 1 tasks
22/08/10 13:47:40 INFO yarn.SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all
22/08/10 13:47:40 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, slbdcompute3, executor 1, partition 0, PROCESS_LOCAL, 7690 bytes)
22/08/10 13:47:40 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on slbdcompute3:45518 (size: 5.1 KB, free: 912.3 MB)
[INFO] 2022-08-10 13:47:42.346 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[63] - -> 22/08/10 13:47:41 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 1512 ms on slbdcompute3 (executor 1) (1/1)
22/08/10 13:47:41 INFO cluster.YarnScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool
22/08/10 13:47:41 INFO scheduler.DAGScheduler: ShuffleMapStage 2 (save at JdbcWriter.java:85) finished in 1.523 s
22/08/10 13:47:41 INFO scheduler.DAGScheduler: looking for newly runnable stages
22/08/10 13:47:41 INFO scheduler.DAGScheduler: running: Set()
22/08/10 13:47:41 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 3)
22/08/10 13:47:41 INFO scheduler.DAGScheduler: failed: Set()
22/08/10 13:47:41 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[14] at save at JdbcWriter.java:85), which has no missing parents
22/08/10 13:47:41 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 21.1 KB, free 93.3 MB)
22/08/10 13:47:41 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 10.2 KB, free 93.3 MB)
22/08/10 13:47:41 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on slbdcompute3:45789 (size: 10.2 KB, free: 93.3 MB)
22/08/10 13:47:41 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1164
22/08/10 13:47:41 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[14] at save at JdbcWriter.java:85) (first 15 tasks are for partitions Vector(0))
22/08/10 13:47:41 INFO cluster.YarnScheduler: Adding task set 3.0 with 1 tasks
22/08/10 13:47:41 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, slbdcompute3, executor 2, partition 0, NODE_LOCAL, 7778 bytes)
22/08/10 13:47:41 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on slbdcompute3:37888 (size: 10.2 KB, free: 912.3 MB)
22/08/10 13:47:41 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.97.1.229:59970
22/08/10 13:47:41 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 3) in 90 ms on slbdcompute3 (executor 2) (1/1)
22/08/10 13:47:41 INFO cluster.YarnScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
22/08/10 13:47:41 INFO scheduler.DAGScheduler: ResultStage 3 (save at JdbcWriter.java:85) finished in 0.105 s
22/08/10 13:47:41 INFO scheduler.DAGScheduler: Job 1 finished: save at JdbcWriter.java:85, took 1.631793 s
22/08/10 13:47:42 INFO server.AbstractConnector: Stopped Spark@352e612e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
22/08/10 13:47:42 INFO ui.SparkUI: Stopped Spark web UI at http://slbdcompute3:4040
22/08/10 13:47:42 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
22/08/10 13:47:42 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
22/08/10 13:47:42 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
22/08/10 13:47:42 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
22/08/10 13:47:42 INFO cluster.YarnClientSchedulerBackend: Stopped
22/08/10 13:47:42 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/08/10 13:47:42 INFO memory.MemoryStore: MemoryStore cleared
22/08/10 13:47:42 INFO storage.BlockManager: BlockManager stopped
22/08/10 13:47:42 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
22/08/10 13:47:42 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/08/10 13:47:42 INFO spark.SparkContext: Successfully stopped SparkContext
22/08/10 13:47:42 INFO util.ShutdownHookManager: Shutdown hook called
22/08/10 13:47:42 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-8a540c14-7e08-499f-9502-cd9c66145346
22/08/10 13:47:42 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7b6d5178-43bb-4647-88d9-fc67b100d784
[INFO] 2022-08-10 13:47:42.469 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[375] - find app id: application_1657523889744_0961
[INFO] 2022-08-10 13:47:42.469 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[205] - process has exited, execute path:/tmp/dolphinscheduler/exec/process/6001262888864/6277368089120_7/781/1944, processId:19801 ,exitStatusCode:0 ,processWaitForStatus:true ,processExitValue:0
[INFO] 2022-08-10 13:47:43.346 +0800 [taskAppId=TASK-20220810-6277368089120_7-781-1944] TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.dq.DataQualityTask:[57] - FINALIZE_SESSION
Task Result snapshot
HDFS snapshot
It seems that this is by design: the "table_count_check" rule doesn't output error data. Comparing it with "null_check", the difference comes from the value of "errorOutputSql": the variable is "false" when the "table_count_check" rule is used and "true" when the "null_check" rule is used.
You can try the "null_check" rule and then compare the execution commands in the log; the "writers" list will include a writer with an output type of "hdfs_file".
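For illustration, the extra writer entry you should see in the task command when a rule does produce error data might look like the sketch below (the `path` value mirrors the error_output_path written above; the config keys are assumptions modeled on the JDBC writer configs in the log, so check your own task log for the exact schema):

```json
{
  "type": "hdfs_file",
  "config": {
    "path": "hdfs://haNameservice:8020/tmp/data-quality-error-data/0_775_chris_data_quality_test",
    "input_table": "error_data"
  }
}
```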
Thanks, it works.
I think maybe it would be friendlier to leave the `Error Output Path` blank if there is no output file.
You're welcome. I agree with you; maybe you can create a new issue to discuss this.
@SbloodyS Should we keep the `Error Output Path` empty if the task is `table_count_check` etc., to eliminate the misunderstanding?
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.