[Bug]: Duplicate entry for key 'PRIMARY' (processId)

Open 7hong opened this issue 1 year ago • 3 comments

What happened?

Primary key conflicts occur, especially when planning runs in parallel.

The processId is generated here:

https://github.com/apache/amoro/blob/4e7fc9b0eb35a6b8768e9ab79a99efed9622b0ea/amoro-ams/src/main/java/org/apache/amoro/server/optimizing/plan/OptimizingPlanner.java#L80-L83

Affects Versions

0.7.0

What table formats are you seeing the problem on?

Iceberg

What engines are you seeing the problem on?

Optimizer

How to reproduce

Run planning in parallel; the conflict becomes more likely.

Relevant log output

2024-10-14 00:19:45,733 ERROR [plan-executor-thread-499] [org.apache.amoro.server.optimizing.OptimizingQueue] [] - Planning table xx.xx.xx(tableId=30058) failed
org.apache.amoro.server.exception.PersistenceException: org.apache.ibatis.exceptions.PersistenceException:
### Error updating database.  Cause: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry '1728836383424' for key 'PRIMARY'
### The error may exist in org/apache/amoro/server/persistence/mapper/OptimizingMapper.java (best guess)
### The error may involve org.apache.amoro.server.persistence.mapper.OptimizingMapper.insertOptimizingProcess-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO table_optimizing_process(table_id, catalog_name, db_name, table_name ,process_id, target_snapshot_id, target_change_snapshot_id, status, optimizing_type, plan_time, summary, from_sequence, to_sequence) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
### Cause: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry '1728836383424' for key 'PRIMARY'
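
The duplicated key in the log, 1728836383424, is consistent with an epoch-millisecond timestamp taken about two seconds before the log line (if the server clock is UTC+8). The sketch below is not the linked Amoro code; it assumes a hypothetical scheme in which each table derives its next process id from the wall clock, and shows how two tables planned in the same millisecond end up with the same id when process_id alone is the primary key. The names `nextProcessId` and `countDuplicateKeys` are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongSupplier;

class ProcessIdCollisionSketch {

    // Hypothetical id scheme (an assumption, not taken from the linked source):
    // the id is the wall-clock time unless the table already used a newer id.
    static long nextProcessId(long newestProcessIdOfTable, LongSupplier clockMillis) {
        return Math.max(newestProcessIdOfTable + 1, clockMillis.getAsLong());
    }

    // In-memory stand-in for a table whose primary key is process_id alone.
    // Returns how many inserts would be rejected as duplicates.
    static int countDuplicateKeys(long[] processIds) {
        Set<Long> primaryKeys = new HashSet<>();
        int duplicates = 0;
        for (long id : processIds) {
            if (!primaryKeys.add(id)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Two planner threads for different tables observe the same millisecond.
        LongSupplier sameMillisecond = () -> 1728836383424L;
        long idForTableA = nextProcessId(0L, sameMillisecond);
        long idForTableB = nextProcessId(0L, sameMillisecond);
        System.out.println(countDuplicateKeys(new long[] {idForTableA, idForTableB}));
    }
}
```

Under this assumption, the per-table `newestProcessIdOfTable` guard only prevents collisions within one table; it does nothing for two different tables planned at the same instant, which would explain why parallel planning raises the probability.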

Anything else

No response

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

7hong avatar Oct 14 '24 03:10 7hong

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Apr 13 '25 00:04 github-actions[bot]

@Jzjsnow Could you please take a look at this? Thanks.

klion26 avatar Apr 14 '25 01:04 klion26

In the latest commit, I added table_id to the primary keys of the tables table_optimizing_process, task_runtime, and optimizing_task_quota. This keeps the primary key unique when multiple tables are optimized concurrently.
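
To illustrate why a composite key helps, here is a minimal in-memory sketch, not the actual schema change. It models a primary key of (table_id, process_id) as a set of pairs. The class name `CompositeKeySketch` and the second table id 30059 are hypothetical; 30058 and 1728836383424 come from the log in the report.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CompositeKeySketch {

    // In-memory stand-in for a table whose primary key is (table_id, process_id).
    private final Set<Map.Entry<Long, Long>> primaryKeys = new HashSet<>();

    // Returns true if the row is accepted, false if it would violate the primary key.
    boolean insert(long tableId, long processId) {
        return primaryKeys.add(new SimpleImmutableEntry<>(tableId, processId));
    }

    public static void main(String[] args) {
        CompositeKeySketch table = new CompositeKeySketch();
        long sharedProcessId = 1728836383424L;

        // Different tables may now share a process id.
        System.out.println(table.insert(30058L, sharedProcessId)); // true
        System.out.println(table.insert(30059L, sharedProcessId)); // true

        // The same (table_id, process_id) pair is still rejected.
        System.out.println(table.insert(30058L, sharedProcessId)); // false
    }
}
```

With the composite key, uniqueness only has to hold per table, so two tables planned at the same instant no longer conflict.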

Jzjsnow avatar Apr 15 '25 13:04 Jzjsnow