[Bug] [db init] java.sql.SQLSyntaxErrorException: Duplicate column name 'operator' when upgrading from 3.2.1 to 3.2.2 on EKS

Open ahululu opened this issue 1 year ago • 10 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

Environment: EKS 1.29

When upgrading from 3.2.1 to 3.2.2 (by replacing the image used by the Helm chart), the pod dolphinscheduler-db-init-job-xxxx fails with the following error:

xxx  INFO 8 --- [           main] o.a.d.common.sql.SqlScriptRunner         : Execute sql: DROP TABLE IF EXISTS `t_ds_relation_project_worker_group`; success
xxx  INFO 8 --- [           main] o.a.d.common.sql.SqlScriptRunner         : Execute sql: CREATE TABLE `t_ds_relation_project_worker_group` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'key',
  `project_code` bigint(20) NOT NULL COMMENT 'project code',
  `worker_group` varchar(255) DEFAULT NULL COMMENT 'worker group',
  `create_time` datetime DEFAULT NULL COMMENT 'create time',
  `update_time` datetime DEFAULT NULL COMMENT 'update time',
  PRIMARY KEY (`id`),
  UNIQUE KEY unique_project_worker_group(project_code,worker_group)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE = utf8_bin; success
xxxx ERROR 8 --- [           main] o.a.d.t.datasource.upgrader.UpgradeDao   : Execute ddl file failed, meet an unknown exception, schemaDir:  3.2.2_schema, ddlScript: dolphinscheduler_ddl.sql

java.sql.SQLSyntaxErrorException: Duplicate column name 'operator'
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:763) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:94) ~[HikariCP-4.0.3.jar:na]
	at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java) ~[HikariCP-4.0.3.jar:na]
	at org.apache.dolphinscheduler.common.sql.SqlScriptRunner.execute(SqlScriptRunner.java:58) ~[dolphinscheduler-common-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.upgrader.UpgradeDao.upgradeDolphinSchedulerDDL(UpgradeDao.java:154) [dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.upgrader.UpgradeDao.upgradeDolphinScheduler(UpgradeDao.java:89) [dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.DolphinSchedulerManager.upgradeDolphinScheduler(DolphinSchedulerManager.java:111) [dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.UpgradeDolphinScheduler$UpgradeRunner.run(UpgradeDolphinScheduler.java:53) [dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:771) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:755) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295) [spring-boot-2.7.3.jar:2.7.3]
	at org.apache.dolphinscheduler.tools.datasource.UpgradeDolphinScheduler.main(UpgradeDolphinScheduler.java:36) [dolphinscheduler-tools-3.2.2.jar:3.2.2]

xxxxx  INFO 8 --- [           main] ConditionEvaluationReportLoggingListener :

Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
xxxx ERROR 8 --- [           main] o.s.boot.SpringApplication               : Application run failed

java.lang.IllegalStateException: Failed to execute CommandLineRunner
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:774) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:755) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306) [spring-boot-2.7.3.jar:2.7.3]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295) [spring-boot-2.7.3.jar:2.7.3]
	at org.apache.dolphinscheduler.tools.datasource.UpgradeDolphinScheduler.main(UpgradeDolphinScheduler.java:36) [dolphinscheduler-tools-3.2.2.jar:3.2.2]
Caused by: java.lang.RuntimeException: Execute ddl file failed, meet an unknown exception
	at org.apache.dolphinscheduler.tools.datasource.upgrader.UpgradeDao.upgradeDolphinSchedulerDDL(UpgradeDao.java:162) ~[dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.upgrader.UpgradeDao.upgradeDolphinScheduler(UpgradeDao.java:89) ~[dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.DolphinSchedulerManager.upgradeDolphinScheduler(DolphinSchedulerManager.java:111) ~[dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.UpgradeDolphinScheduler$UpgradeRunner.run(UpgradeDolphinScheduler.java:53) ~[dolphinscheduler-tools-3.2.2.jar:3.2.2]
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:771) [spring-boot-2.7.3.jar:2.7.3]
	... 5 common frames omitted
Caused by: java.sql.SQLSyntaxErrorException: Duplicate column name 'operator'
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:763) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648) ~[mysql-connector-j-8.0.32.jar:8.0.32]
	at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:94) ~[HikariCP-4.0.3.jar:na]
	at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java) ~[HikariCP-4.0.3.jar:na]
	at org.apache.dolphinscheduler.common.sql.SqlScriptRunner.execute(SqlScriptRunner.java:58) ~[dolphinscheduler-common-3.2.2.jar:3.2.2]
	at org.apache.dolphinscheduler.tools.datasource.upgrader.UpgradeDao.upgradeDolphinSchedulerDDL(UpgradeDao.java:154) ~[dolphinscheduler-tools-3.2.2.jar:3.2.2]
	... 9 common frames omitted

xxxx  INFO 8 --- [           main] com.zaxxer.hikari.HikariDataSource       : DolphinScheduler - Shutdown initiated...
xxx  INFO 8 --- [           main] com.zaxxer.hikari.HikariDataSource       : DolphinScheduler - Shutdown completed.

What you expected to happen

I don't think the db init scripts are fully compatible with this kind of image upgrade.

How to reproduce

Deploy 3.2.1 with the Helm chart, then upgrade to 3.2.2.

Anything else

No response

Version

3.2.x

Are you willing to submit PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

ahululu avatar Jul 24 '24 06:07 ahululu

The cluster may already have been upgraded from 3.2.1 to 3.2.2 with Helm once before. That upgrade ran into a problem, so I used helm rollback, but the rollback does not revert the database. My current workaround is to manually drop the operator column and then re-execute this file by hand: https://github.com/apache/dolphinscheduler/blob/a5061eb3518fd9ea5db4a85892da61781baac04c/dolphinscheduler-dao/src/main/resources/sql/upgrade/3.2.2_schema/mysql/dolphinscheduler_ddl.sql
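
The manual workaround can be sketched roughly as below. This is only an illustration, not the project's upgrade tooling: the schema name "dolphinscheduler" and the table name "t_example" are placeholders (the real table is whichever one the 3.2.2 DDL file alters), and the helper functions are hypothetical. The script only prints SQL so it can be reviewed before anything is run. Dropping a column discards whatever data it holds, so back up the database first.

```python
"""Sketch: build the SQL for the manual 'drop column, re-run DDL' workaround."""


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def quote_literal(value: str) -> str:
    """Quote a string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def find_column_sql(schema: str, column: str) -> str:
    """Query listing every table in `schema` that already has `column`."""
    # information_schema.COLUMNS is MySQL's standard column metadata view.
    return (
        "SELECT TABLE_NAME FROM information_schema.COLUMNS "
        f"WHERE TABLE_SCHEMA = {quote_literal(schema)} "
        f"AND COLUMN_NAME = {quote_literal(column)};"
    )


def drop_column_sql(table: str, column: str) -> str:
    """Statement that removes the leftover column so the DDL can be re-run."""
    return f"ALTER TABLE {quote_ident(table)} DROP COLUMN {quote_ident(column)};"


if __name__ == "__main__":
    # Placeholders: replace the schema and table with the real names.
    print(find_column_sql("dolphinscheduler", "operator"))
    print(drop_column_sql("t_example", "operator"))
```

Run the first printed query to see which tables still carry the column from the earlier, partially applied upgrade, then apply the second statement only to the table that the 3.2.2 DDL file tries to alter.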

ahululu avatar Jul 24 '24 07:07 ahululu

But I'm not sure that's the right thing to do

ahululu avatar Jul 24 '24 07:07 ahululu

Yes. You can try this manually.

SbloodyS avatar Jul 24 '24 13:07 SbloodyS

A few days after the upgrade, I found a new issue: the api pod cannot resolve the alert pod's hostname and fails with the following error: Caused by: java.net.UnknownHostException: dolphinscheduler-alert-xxxxxx-xxx. The temporary workaround is to add an entry for the alert pod to /etc/hosts. However, this is fragile: if the api pod restarts, the alert function becomes unavailable again. Does anyone know how to fix it?
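
To narrow this down, a name lookup can be tested directly. The following is a minimal sketch, assuming a Python interpreter is available wherever the check is run (the thread does not confirm that one is available inside the api pod); the `resolves` helper is hypothetical and the host name to test is whatever appears in the exception message.

```python
import socket
import sys
from typing import Callable


def resolves(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if `hostname` resolves to at least one address."""
    try:
        # The port is irrelevant for a pure lookup, so None is passed.
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # Lookup failure, roughly what Java reports as UnknownHostException.
        return False


if __name__ == "__main__":
    # Pass the host name(s) from the exception message as arguments.
    for name in sys.argv[1:]:
        status = "resolves" if resolves(name) else "does NOT resolve"
        print(f"{name}: {status}")
```

If the name resolves only after the /etc/hosts entry is added, the failure is in cluster DNS or in the name the alert server registers, not in the api server itself.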

ahululu avatar Aug 05 '24 03:08 ahululu

cc @Gallardot

SbloodyS avatar Aug 05 '24 04:08 SbloodyS

I need more error logs

Gallardot avatar Aug 05 '24 23:08 Gallardot

It's the same problem as this issue: #16405

ahululu avatar Aug 13 '24 01:08 ahululu

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Sep 21 '24 00:09 github-actions[bot]

I encountered the same errors with DolphinScheduler v3.2.2. Did you manage to fix it completely?

youzif avatar Sep 23 '24 07:09 youzif

This issue will be fixed in v3.2.3: https://github.com/apache/dolphinscheduler/issues/16634#issuecomment-2362939746

youzif avatar Sep 23 '24 08:09 youzif

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Oct 26 '24 00:10 github-actions[bot]

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

github-actions[bot] avatar Nov 02 '24 00:11 github-actions[bot]