dolphinscheduler icon indicating copy to clipboard operation
dolphinscheduler copied to clipboard

[Bug] [Data Quality] Multi Table Value Comparison Rule exists a error when one table in mysql another in doris.

Open YUTING0907 opened this issue 1 year ago • 0 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

Multi Table Value Comparison Data Quality Rule exists a error when one table in mysql another in different database, which is Doris. Needless to say, mysql datasource and doris datasource connections in DS is totally well. But it is strange that Data Quality tried to connect hadoop cluster with mysql datasource username.

loginfo: org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.lambda$execute$0(SparkBatchExecution.java:48) at java.util.ArrayList.forEach(ArrayList.java:1257) at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.execute(SparkBatchExecution.java:48) at org.apache.dolphinscheduler.data.quality.context.DataQualityContext.execute(DataQualityContext.java:62) at org.apache.dolphinscheduler.data.quality.DataQualityApplication.main(DataQualityApplication.java:70) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)

	 ApplicationMaster host: hadooppord04
	 ApplicationMaster RPC port: 37045
	 queue: spark
	 start time: 1706696516152
	 final status: FAILED
	 tracking URL: http://hadooppord01:8088/proxy/application_1703643815865_0370/
	 user: xxxxx
2024-01-31 18:22:07 ERROR Client:70 - Application diagnostics message: User class threw exception: java.sql.SQLException: Access denied for user 'canal'@'10.xx.xx.xx' (using password: YES)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
	at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:835)
	at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:455)
	at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:240)
	at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:199)
	at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:63)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:54)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:210)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
	at org.apache.dolphinscheduler.data.quality.flow.batch.reader.JdbcReader.read(JdbcReader.java:72)
	at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.registerInputTempView(SparkBatchExecution.java:77)
	at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.lambda$execute$0(SparkBatchExecution.java:48)
	at java.util.ArrayList.forEach(ArrayList.java:1257)
	at org.apache.dolphinscheduler.data.quality.execution.SparkBatchExecution.execute(SparkBatchExecution.java:48)
	at org.apache.dolphinscheduler.data.quality.context.DataQualityContext.execute(DataQualityContext.java:62)
	at org.apache.dolphinscheduler.data.quality.DataQualityApplication.main(DataQualityApplication.java:70)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)

Exception in thread "main" org.apache.spark.SparkException: Application application_1703643815865_0370 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1149)
	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1526)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2024-01-31 18:22:07 INFO  ShutdownHookManager:54 - Shutdown hook called

What you expected to happen

hope error can be fixed.

How to reproduce

Use Multi Table Value Comparison Data Quality Rule, and choose two table in different database type.

Anything else

No response

Version

dev

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

YUTING0907 avatar Feb 01 '24 07:02 YUTING0907