addons
addons copied to clipboard
Fix Userprofile.username charset db inconsistency
Running migration 1201 resulted in the error stderr: b"ERROR 1062 (23000) at line 2: Duplicate entry 'Austin' for key 'username'\n seemingly because we are dropping the existing explicit column definition of CHARACTER SET utf8mb4 COLLATE utf8mb4_bin and replacing it with the table default.
Attempts to manually fix the data by renaming the "Austin" username to something else, resulted in a different error for the next username, and so on (I abandoned after 3 UPDATE to different records). To unblock the stage deploy I merged https://github.com/mozilla/addons-server/commit/d6cdd59175b649f02e767843f921a882bd9957e8 and cherry-picked to the https://github.com/mozilla/addons-server/tree/2019.09.19-1 tag.
We need to fix the inconsistency and reapply ALTER TABLE `users` CHANGE COLUMN `username` `username` VARCHAR (255) NOT NULL; to bring the stage&prod database to django norms.
┆Issue is synchronized with this Jira Task
QA: Let's just test that this hasn't accidentally dropped existing usernames that were not dupes by loading known public collection URLs like https://addons-dev.allizom.org/en-US/firefox/collections/VictorQATester/testcollect/ and making sure they still work.
Actually, the new migration initially didn't take on stage, it failed with:
ERROR 1062 (23000) at line 7: Duplicate entry 'anonymous-83f2f3c3aefa9a786cee930e3572eba932' for key 'username'
The relevant SQL was this:
CREATE TEMPORARY TABLE `duplicate_usernames`
SELECT LOWER(`username`) AS `username` FROM `users`
GROUP BY LOWER(`username`) HAVING COUNT(*) > 1;
UPDATE `users`
SET `username`=CONCAT('anonymous-', MD5(RAND()), 32)
WHERE LOWER(`username`) != `username`
AND LOWER(`username`) IN (SELECT `username` FROM `duplicate_usernames`);
ALTER TABLE `users` CHANGE COLUMN `username` `username` VARCHAR (255) NOT NULL;
Before running it, there were ~80k affected users on stage, twice that on prod.
We tried several things to fix it, ended up with replacing the CONCAT('anonymous-', MD5(RAND()), 32) with CONCAT('anonymous-', LOWER(HEX(RANDOM_BYTES(16))) but then the migration still had not finished after 8+ hours.
We're going to remove the migration from the tag, and after that we need to figure out what took time (is it the UPDATE or the ALTER that follows it ?). Then we've got 2 options, assuming the UPDATE is what's problematic:
- Replace it with a different SQL that would be faster. I suspect the current version, while less likely to result in a duplicate entry than the previous one, might be exhausting entropy a little too quickly.
- Replace it with some python code that would change the problematic usernames little by little outside of the migration process
That would still leave the ALTER part to be done though. I need to check if that's what the last deploy was stuck on or if it was stuck on the UPDATE.
In any case, a 3rd option is to drop the column entirely. That'd still be a big ALTER, and preparing that change would take time, but ultimately the column is almost useless nowadays, we only need it to support legacy profile/collection URLs, and we could do that from a limited set of usernames in a different table. See https://github.com/mozilla/addons/issues/5528
@eviljeff mentioned on IRC a fourth option I had forgotten: we could set the collation we currently use in prod via a django migration, and forget about this whole thing for now.
(is it the
UPDATEor theALTERthat follows it ?)
It's the UPDATE, and it's now finished. We don't know how much more than 7 hours it took, it was run by @bqbn manually at my request last night, and I can see now on stage that it's done:
MySQL [addons_allizom_org]> SELECT COUNT(*) FROM (SELECT LOWER(username) FROM users GROUP BY LOWER(username) HAVING COUNT(*) > 1) as lol;
+----------+
| COUNT(*) |
+----------+
| 0 |
+----------+
Stage was never redeployed once it finished, so the ALTER has not been run on stage yet, and we don't know how much time that'd take.
Edit: it took 9 hours on stage.
summary of options:
- add column collation via manual django migration
- drop username column from model
- delete usernames that conflict under default collation.
And https://github.com/mozilla/addons/issues/6974 has the history of issues with option 3 (Tl;DR: we could go with the nuclear option and delete most non ascii usernames to solve the conflicts, or try a more subtle approach)
I'll look into option 1, see what it looks like, we've spent too much time on this already.
Just make a note here that the users table has been altered concerning the username column in -dev environment, but has NOT in -stage and -prod environments.
Do we have requests to URLs with usernames (instead of user id), and if so which ones, and how many ?
Let's look for requests matching these in the logs:
https://addons.mozilla.org/(\w{2,3}(?:-\w{2,6})?)/(firefox|android)/user/(.*?[^0-9].*)/.*https://addons.mozilla.org/(\w{2,3}(?:-\w{2,6})?)/(firefox|android)/collections/(.*?[^0-9].*)/.*https://addons.mozilla.org/api/(v\d)/accounts/account/(.*?[^0-9].*)/.*
About the regexps:
(\w{2,3}(?:-\w{2,6})?)matches locales,froren-US(firefox|android)matches applications(.*?[^0-9].*)matches usernames, which can contain anything, but can't be only digits(v\d)matches api versions
There are technically more possibilities with the API (?user=... in some endpoints) but I don't think we care about those
Old Jira Ticket: https://mozilla-hub.atlassian.net/browse/ADDSRV-18