Improvement: join primary key to unique constraint

Open lpdink opened this issue 1 year ago • 2 comments

Checklist:

[!IMPORTANT]
Please review the checklist below before submitting your pull request.

  • [x] Please open an issue before creating a PR or link to an existing issue
  • [x] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [x] I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Description

This PR modifies the table definitions by adding the primary key id to every UniqueConstraint, so that table creation succeeds in a distributed database. In a distributed database, the distribution key must be a subset of each unique key in order to perform deduplication. If a table has two unique keys, the distribution key must be chosen from the intersection of those two keys. This PR addresses the issue discussed in https://github.com/langgenius/dify/discussions/6720. For the fix to remain effective, however, all future unique declarations must also include the primary key.
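The intersection rule described above can be illustrated with a small sketch. It is not taken from the PR: the helper name `candidate_distribution_columns` is hypothetical, and the `account_id` + `provider` pair is just one example of an existing unique constraint. It only models the rule as stated (the distribution key must come from the columns shared by every unique key), not any particular database's behavior.

```python
from typing import FrozenSet, Iterable, Sequence


def candidate_distribution_columns(
    unique_keys: Iterable[Sequence[str]],
) -> FrozenSet[str]:
    """Return the columns shared by every unique key (primary key included).

    Hypothetical helper: under the rule described in the PR, a distribution
    key can only be chosen from this intersection.
    """
    key_sets = [frozenset(key) for key in unique_keys]
    if not key_sets:
        return frozenset()
    return frozenset.intersection(*key_sets)


# Before the change: primary key (id) plus UNIQUE(account_id, provider).
before = candidate_distribution_columns([("id",), ("account_id", "provider")])

# After the change: the unique constraint also contains id.
after = candidate_distribution_columns(
    [("id",), ("id", "account_id", "provider")]
)

print(sorted(before))  # [] -> no column is available as a distribution key
print(sorted(after))   # ['id']
```

With an empty intersection there is no valid distribution key, which matches the table-creation failure the PR describes. Adding id to the unique constraint makes id the only shared column.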

I do understand that this may introduce some burden, but I kindly ask the maintainers to consider this suggestion as it will help Dify support large-scale distributed deployments.

Type of Change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] This change requires a documentation update, included: Dify Document
  • [x] Improvement, including but not limited to code refactoring, performance optimization, and UI/UX improvement
  • [ ] Dependency upgrade

Testing Instructions

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce them. Please also list any relevant details of your test configuration.

  • [x] Verified that the updated ORM definitions no longer produce the unique constraint error in a distributed database environment.
  • [x] Checked that the primary key and unique constraints work as expected in a single-node database.

lpdink avatar Aug 09 '24 02:08 lpdink

@bowenliang123 Could you please kindly review this PR?

lpdink avatar Aug 09 '24 02:08 lpdink

cc @takatost

bowenliang123 avatar Aug 09 '24 02:08 bowenliang123

Thank you for your contribution. However, I'd like to point out that the proposed modification (adding the id field to existing unique constraints) could undermine the intended uniqueness of the original combinations, such as account_id + provider or any similar pair. Since id is always unique, the widened constraint would no longer prevent duplicate entries for the same values: a violation would require the id to match as well, which never happens.

+1.
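The concern above can be reproduced in a single-node database. The following is a minimal sketch using SQLite from the Python standard library, not Dify's actual schema: the table name `providers`, the helper `insert_same_pair_twice`, and the sample values are all hypothetical.

```python
import sqlite3


def insert_same_pair_twice(unique_columns: str) -> int:
    """Create a table with the given UNIQUE column list, then try to insert
    two rows that share account_id and provider but have different ids.

    Returns how many of the two inserts succeeded.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE providers ("
        "id INTEGER PRIMARY KEY, "
        "account_id TEXT NOT NULL, "
        "provider TEXT NOT NULL, "
        f"UNIQUE ({unique_columns}))"
    )
    rows = [(1, "acct-1", "openai"), (2, "acct-1", "openai")]
    inserted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO providers VALUES (?, ?, ?)", row)
            inserted += 1
        except sqlite3.IntegrityError:
            # The unique constraint rejected the duplicate pair.
            pass
    conn.close()
    return inserted


original = insert_same_pair_twice("account_id, provider")
widened = insert_same_pair_twice("id, account_id, provider")

print(original)  # 1 -> the second row is rejected
print(widened)   # 2 -> the duplicate pair is accepted
```

With the original constraint the second insert fails. Once id is part of the constraint, both rows are accepted, because the two tuples differ in id.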

bowenliang123 avatar Aug 12 '24 03:08 bowenliang123