Normalize and store email addresses
Email addresses are currently stored in a varchar(254) column in the database, with non-null and unique constraints.
However, because some email providers allow extra characters in email addresses, and the column is not citext (case-insensitive), the unique constraint does not prevent duplicate values that differ only in case or in those extra characters.
Proposal:
- add a new empty `normalized_email` citext column to the `Emails` model
- populate the column with the normalized value of each email address during email addition
- backfill existing records
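
The normalization rules themselves are not spelled out above. As a minimal sketch, assuming normalization means case-folding the address and dropping a `+tag` suffix from the local part (both are assumptions, and `normalize_email` is a hypothetical helper name), the value written to `normalized_email` could be computed like this:

```python
def normalize_email(address: str) -> str:
    """Return a normalized form of an email address (hypothetical rules)."""
    # Split on the last "@" so a quoted local part containing "@" stays intact.
    local, sep, domain = address.strip().rpartition("@")

    # Assumption: everything after the first "+" in the local part is a
    # provider-specific tag that does not change the destination mailbox.
    local = local.split("+", 1)[0]

    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {address!r}")

    # Assumption: compare case-insensitively, matching the citext column.
    return f"{local.lower()}@{domain.lower()}"


print(normalize_email("Foo+pypi@Example.COM"))
```

The same function could be reused for the backfill, so that new and existing records are normalized identically. Provider-specific rules, such as ignoring dots in the local part, are deliberately left out because they do not hold for every provider.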
This effort could be complemented by also adding a domain column to the table, populated and backfilled in the same way as the normalized value, so that the data representation is clear and unambiguous and does not rely on string splitting.
This effort should be preceded by some queries on the table to determine how many duplicate addresses we might expect to see.
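
As a rough sketch of such a query, the snippet below counts addresses that differ only in case. It runs against an in-memory SQLite database purely for illustration; the table name `emails` and column name `email` are assumptions, and the real query would run against the production database. It does not catch duplicates caused by extra characters, which would need the normalization rules applied first.

```python
import sqlite3

# Hypothetical stand-in for the real table: same column type and constraints.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (email VARCHAR(254) NOT NULL UNIQUE)")
conn.executemany(
    "INSERT INTO emails (email) VALUES (?)",
    [("foo@example.com",), ("Foo@example.com",), ("bar@example.com",)],
)

# Group by the lowercased address and keep groups with more than one row.
duplicates = conn.execute(
    """
    SELECT lower(email) AS folded, count(*) AS n
    FROM emails
    GROUP BY lower(email)
    HAVING count(*) > 1
    ORDER BY n DESC
    """
).fetchall()

for folded, n in duplicates:
    print(folded, n)
```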