patchman icon indicating copy to clipboard operation
patchman copied to clipboard

Fix get or create module

Open mauricesmiley opened this issue 3 weeks ago • 2 comments

Fix duplicate insert error in get_or_create_module and improve MySQL compatibility

We encountered a reproducible error when running patchman -a on SQLite:

UNIQUE constraint failed: modules_module.name, modules_module.stream, modules_module.version, modules_module.context, modules_module.arch_id

Root cause:

  • get_or_create_module() included repo in the uniqueness filter, causing duplicate insert attempts when multiple repos contained the same module metadata.

Fix:

  • Removed repo from uniqueness filter
  • Added defaults={'repo': repo} so repo is set only on creation
  • Added IntegrityError handling to fetch existing row instead of crashing

Additionally, MySQL migrations failed due to oversized indexes (utf8mb4 key length > 3072 bytes). We reduced CharField lengths and simplified unique_together in modules/models.py.

Impact:

  • patchman -a runs cleanly on SQLite
  • Migrations apply cleanly on MySQL
  • Patchman now works consistently across both backends

Diff summary: modules/utils.py | 24 insertions(+), 35 deletions(-) modules/models.py | 8 changes (field length adjustments, unique_together simplified)


Verification steps:

  1. Run patchman -a on SQLite → completes without UNIQUE constraint error.
  2. Run python manage.py migrate on MySQL (utf8mb4) → schema builds cleanly.
  3. Verify modules deduplicate correctly across multiple repos.

mauricesmiley avatar Dec 11 '25 23:12 mauricesmiley

My local uncommitted fix for this was

$ git diff modules/
diff --git a/modules/models.py b/modules/models.py
index 931a41c..e1f8071 100644
--- a/modules/models.py
+++ b/modules/models.py
@@ -35,7 +35,7 @@ class Module(models.Model):
     class Meta:
         verbose_name = 'Module'
         verbose_name_plural = 'Modules'
-        unique_together = ['name', 'stream', 'version', 'context', 'arch']
+        unique_together = ['name', 'stream', 'version', 'context', 'arch', 'repo']
         ordering = ['name', 'stream']
 
     def __str__(self):
diff --git a/modules/utils.py b/modules/utils.py
index 0d66947..248b8b4 100644
--- a/modules/utils.py
+++ b/modules/utils.py
@@ -25,10 +25,9 @@ def get_or_create_module(name, stream, version, context, arch, repo):
     """ Get or create a module object
         Returns the module
     """
-    created = False
-    m_arch, c = PackageArchitecture.objects.get_or_create(name=arch)
+    m_arch, _ = PackageArchitecture.objects.get_or_create(name=arch)
     try:
-        module, created = Module.objects.get_or_create(
+        module, _ = Module.objects.get_or_create(
             name=name,
             stream=stream,
             version=version,

Can you see if that works for you?

I thought I fixed the mysql issue elsewhere, let me see if I can find it.

furlongm avatar Dec 11 '25 23:12 furlongm

📋 Background and Fix History (further context)

  • Initial problem (SQLite3):
    The original get_or_create_module() treated repo as part of the uniqueness filter. On SQLite, this caused IntegrityError crashes because the schema only enforced uniqueness on (name, stream, version, context). The retry logic didn’t find the existing row, so refreshes failed.

  • First fix (SQLite3):
    We moved repo into defaults={'repo': repo} and retried lookup only on the true unique fields. That resolved the IntegrityError loop on SQLite — refreshes worked cleanly.

  • Migration to MySQL (for performance):
    After switching to MySQL, further issues appeared because the models.py constraint (unique_together) didn’t match the suggested diff. MySQL was still enforcing uniqueness only on (name, stream, version, context), so adding repo into unique_together and into the filter didn’t align with the actual DB schema. This mismatch caused IntegrityErrors and DoesNotExist errors.

  • Testing the suggested fix:
    I did apply the diff exactly as written (adding repo to unique_together and passing repo into get_or_create). Unfortunately it still failed with reproducible errors on MySQL:

    django.db.utils.IntegrityError: (1062, "Duplicate entry ... for key 'modules_module.modules_module_name_stream_version_context_eaaac8eb_uniq'")
    ...
    modules.models.Module.DoesNotExist: Module matching query does not exist.
    

    So the suggested fix did not resolve the crash in practice.

  • Final fix (SQLite + MySQL):
    By keeping the schema as‑is and using defaults={'repo': repo} in get_or_create(), with retry lookup only on the true unique fields, we now have a backend‑agnostic solution. It works cleanly on both SQLite and MySQL, avoids duplicate‑key crashes, and still records repo when a new module is created.


✅ Summary

The problem started on SQLite, the fix solved it there, and migration to MySQL exposed the schema mismatch. We tested the suggested diff, but it failed with IntegrityErrors. The final patch matches the real DB constraint and works consistently across both backends. Verified cleanly on SQLite and MySQL.

mauricesmiley avatar Dec 12 '25 19:12 mauricesmiley