studio icon indicating copy to clipboard operation
studio copied to clipboard

Channels should use a UUID field for their primary key

Open rtibbles opened this issue 6 months ago • 0 comments

Desired behavior

Channel primary keys should be UUID types

Current behavior

Channel primary keys are Char types, with a small amount of validation.

However, we have seen instances where non-UUID values have been stored in the database.

Value add

Having database level typing would prevent illegal data from being written.

It is also highly likely that using UUID fields will result in a more compact data representation in the DB table, and possibly improved query performance when querying by primary key (needs validation).

Possible tradeoffs

Doing this will require data migration, and doing so with zero downtime/disruption will require a phased approach.

The basic schema for handling this is described here: https://medium.com/@pranavdixit20/zero-downtime-migrations-in-django-the-advanced-playbook-742d85f1f5c0

However, there are additional considerations for this specific change:

  • This is the primary key for the channel table, swapping this may still involve some downtime
  • Foreign key relationships to the primary key will also need to be handled during the swapover.

This article seems to shed some light on how to handle this: https://engineering.silverfin.com/pg-zero-downtime-bigint-migration/

So, to handle this, we would:

Add a new_id field that is unique, and initially nullable. Whenever a new channel is created, the id is written to the id field and this field.

Create new relation fields for models that have relations to Channel that use the to_field argument and have them point at the new field, not at the primary key. Writes to the original relations should also set this relation.

Write data migration tasks that migrate all existing models to ensure this duplication is total.

DEPLOY.

Make the new id field non-nullable. Create a unique index over the new id field (concurrently).

DEPLOY.

Update all application code to point to the new fields instead of the old ones. Drop the old relations.

DEPLOY.

Switch the primary key field to the new field. Drop the old PK field. It is possible that this

DEPLOY.

Then it should be done.

This plan might be refinable to avoid having to end up with a renamed PK field, and also to avoid having to rename the relations.

rtibbles avatar May 29 '25 20:05 rtibbles