boulder icon indicating copy to clipboard operation
boulder copied to clipboard

admin 's clearEmail misses email addresses

Open jcjones opened this issue 1 year ago • 2 comments

The code for admin-revoker to find registrations by email during a clear-email operation is too naive:

https://github.com/letsencrypt/boulder/blob/14a8378dd0dcf82e222aa3bccde47cc8f5ff1bc6/cmd/admin-revoker/main.go#L435-L444

We need to account for people adding + and . and such.

We use this code elsewhere to give the database a simple normalization mechanism, to find all the regIDs we're looking for:

def contact_field_from_email(email_address):
    return f"'mailto:{email_address}'"


def normal_where_clause(email_address):
    assert (
        "+" not in email_address
    ), "No tags ([email protected]) allowed in email address"

    return f"""
    LOWER(
        JSON_VALUE(
            CONCAT(
                # The user part with '+' and after removed
                REGEXP_REPLACE(SUBSTRING_INDEX(contact, '@', 1), '\\\\+.*', '')

                # The @ and the host part
                , SUBSTRING(contact, INSTR(contact, '@'))
            )
            , "$[0]"
        )
    ) = LOWER({contact_field_from_email(email_address)})
    """


def gmail_where_clause(email_address):
    assert (
        "+" not in email_address
    ), "No tags ([email protected]) allowed in email address"
    assert (
        "." not in email_address.split("@")[0]
    ), "No dots ([email protected]) allowed in gmail addresses"

    return f"""
    LOWER(
        JSON_VALUE(
            CONCAT(
                # The user part with '+' and after removed, and '.'s removed
                REPLACE(
                    REGEXP_REPLACE(SUBSTRING_INDEX(contact, '@', 1), '\\\\+.*', '')
                    , '.'
                    , ''
                )
                # The @ and the host part
                , SUBSTRING(contact, INSTR(contact, '@'))
            )
            , "$[0]"
        )
    ) = LOWER({contact_field_from_email(email_address)})
    """


def make_where_clause_for_email(email_address):
    if "gmail.com" in email_address or "googlemail.com" in email_address:
        return gmail_where_clause(email_address)
    return normal_where_clause(email_address)

It's still imperfect, but gmail is common and people do permutations of .s in their addresses... sometimes to trip up queries like this.

jcjones avatar Jan 31 '24 18:01 jcjones

We need to account for people adding + and . and such.

Do we actually need to support this? Those are different email addresses, that sometimes have delivery to the same mailbox, depending on mail provider. If folks want those other email addresses removed, they can submit requests for them, right?

aarongable avatar Jul 16 '24 17:07 aarongable

Other admin tools we use absolutely must. Perhaps clearEmail needn't, but we've generally tried to wipe out every permutation in the past because each back-and-forth with a "delete my account" requester takes real engineering time.

jcjones avatar Jul 16 '24 17:07 jcjones

We have announced that we will no longer be providing notification emails. Closing this ticket as WontFix, since we do not plan to make any changes to the email system prior to decommissioning it.

aarongable avatar Mar 06 '25 01:03 aarongable