django-waffle
Waffle should remember a flag value for a given user
If a user is authenticated, it doesn't make sense to flip back and forth depending on their device.
I guess a ForeignKey through a WaffleUser model that stored a boolean is the best way to do that.
Would it be possible to not use a foreign key, just a uuid? That way if multiple databases are needed for a large application the ForeignKey will not fail.
It's certainly possible. I'm not yet sure of the right way to do this. The ForeignKey is simple, but I worry about it being slow. It's possible to use the existing Flag.users attribute, but that would probably mean making a through model with a boolean on/off value.
Why not use the session? It is stored in the DB by default, is sticky per user, and can be moved to the cache via settings if needed (https://docs.djangoproject.com/en/1.8/topics/http/sessions/).
This is explicitly about persisting beyond sessions, e.g. if a user is signed in, switching devices shouldn't change the value of the flag.
As an alternative to persisting the bucketing for a given user/flag pair, would it help to make the bucketing deterministic on a given user? When Waffle draws randomly to bucket the user, we could provide a way to determine a randseed based on some unchanging user data with, say, a default implementation of using the user model pk as the seed. This way when the user discards their cookie (by logging into a new browser, profile, or device) they should still get consistently bucketed in future sessions.
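For illustration, a minimal sketch of the seeded-draw idea, assuming the user's integer primary key is used directly as the seed. The name seeded_flag_draw and its parameters are hypothetical and not part of Waffle's API.

```python
import random


def seeded_flag_draw(user_pk: int, percent: float) -> bool:
    """Return True if the user's draw falls inside the rollout percentage.

    Hypothetical helper: seeds a private RNG with the user's pk, so the
    draw is identical on every device and in every session.
    """
    rng = random.Random(user_pk)  # private generator; global RNG state is untouched
    return rng.uniform(0, 100) < percent


# The same user gets the same answer no matter how often we ask.
assert seeded_flag_draw(42, 10.0) == seeded_flag_draw(42, 10.0)
```

One thing to keep in mind with this sketch: seeding with the pk alone gives a user the same draw for every flag, so a real implementation would probably want to mix something flag-specific into the seed.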
This would not work if the Flag percentage changed over time.
Interesting idea! Unfortunately, this is a pretty hard requirement, I think. Every time I've rolled out a feature using Waffle, it's been to an increasing percentage over time (e.g. 5-10-30-100%), or with rollout-mode (where "on" is sticky but "off" is not).
Hm, agreed! Yeah...
Mostly an academic exercise, as I think the complexity of the implementation (or of the resulting rules) might not be justified, but an idea:
Say waffle keeps a history of percentages as you change them over time, so waffle is able to recall the percentage for a flag at a given past date. Then, in addition to providing the user-derived randseed (e.g. user.pk), a deterministic per-user datetime is also provided (e.g. user.created), and users are bucketed using the flag percentage at that date. Their original bucketing is then deterministic, and preserved without having to persist that per-user information.
This would have its fair share of unintuitive behaviors; most notably, a site with a relatively static userbase would not be able to meaningfully change percentages after an initial flag rollout, since the majority of user.created dates precede the Flag percentage change date ranges.
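A rough sketch of how that percentage-history lookup could work, purely as an illustration. The HISTORY list, percent_at, and is_active are hypothetical names; Waffle does not store a percentage history, and user.created is assumed to be a datetime that never changes.

```python
import bisect
import random
from datetime import datetime

# Hypothetical history of (effective_from, percent) changes for one flag,
# sorted by date. This is an assumption, not something Waffle records.
HISTORY = [
    (datetime(2015, 1, 1), 5.0),
    (datetime(2015, 2, 1), 10.0),
    (datetime(2015, 3, 1), 30.0),
]


def percent_at(history, when):
    """Return the percentage in effect at `when` (0.0 before the first entry)."""
    dates = [effective_from for effective_from, _ in history]
    i = bisect.bisect_right(dates, when)
    return history[i - 1][1] if i else 0.0


def is_active(user_pk, user_created, history=HISTORY):
    """Compare a pk-seeded draw against the percentage that applied
    on the user's creation date."""
    draw = random.Random(user_pk).uniform(0, 100)
    return draw < percent_at(history, user_created)
```

The sketch also makes the drawback visible: a user created before the first history entry is compared against 0.0 and can never be bucketed in, however far the rollout later advances.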
I think our approach to this will effectively be the FK/boolean approach, where we add another model to indicate whether someone has been bucketed or not, and use the Flag.users relation to track those users in the experiment. If anyone has already released something like this or has advice on edge cases to consider, I'd love to hear it.
OK I actually have a path forward here, 7 years later. It's another way to implement @jasonm's core idea of making flag buckets deterministic.
- Let p be the percentage rollout expressed as an integer in [0, 1000].
- Let uid be a numeric representation of the user (possibly stored in a cookie for unauthenticated users, and considering non-numeric primary keys on user models).
- Let key be an integer associated with each flag, in some range [0, N] where N is a multiple of 1000. These can be set randomly when flags are created (and for backwards compatibility could fall back to flags' autoincrement ID values) but should be user-editable.
- (uid + key) % 1000 is in the range [0, 999].
- Replace the dice roll with (uid + key) % 1000 < p. We use strict inequality so that 0 < p=0 is False but 999 < p=1000 is True.
E.g. for a 10% roll out, users with (uid + key) % 1000 < 100 will be in the "on" group. If the percentage changes, no "on" users will change state, but a new set of users will become "on" (e.g. if the p changes to 200, (uid + key) % 1000 < 200 will still include all the users for whom it's < 100).
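To make the rule concrete, here is a small sketch of the comparison with integer uids. The function name is_on and the key value 137 are made up for the example.

```python
def is_on(uid: int, key: int, p: int) -> bool:
    """Bucketing rule described above; `p` is an integer in [0, 1000]."""
    return (uid + key) % 1000 < p


KEY = 137  # hypothetical flag key
users = range(10_000)

on_at_10 = {u for u in users if is_on(u, KEY, 100)}  # 10% rollout
on_at_20 = {u for u in users if is_on(u, KEY, 200)}  # raised to 20%

assert on_at_10 <= on_at_20      # nobody flips from "on" back to "off"
assert len(on_at_10) == 1000     # 10% of 10,000 sequential uids
assert len(on_at_20) == 2000     # 20% of 10,000 sequential uids

# Boundary cases for the strict inequality.
assert not any(is_on(u, KEY, 0) for u in users)   # p=0: always off
assert all(is_on(u, KEY, 1000) for u in users)    # p=1000: always on
```

The exact counts hold here only because the uids are sequential and cover every residue modulo 1000 equally often; real uids would be approximately, not exactly, proportional.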
This has a few consequences:
- "rollout mode" is redundant and removable in 1.0—just up the percentage.
- all per-flag waffle cookies can be replaced with no cookies for authenticated users, and a single uid value for unauthenticated users.
- if two flags share the same key, users will be bucketed the same way for both, which allows linking experiments.
- by randomizing key and adding it to uid, we should achieve a reasonably uniform distribution of users within the 1000 buckets, assuming a reasonably uniform set of uids, i.e.: no user should consistently end up in the alpha group for all experiments.
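A quick sketch of the shared-key consequence, with made-up key values and flag names (CHECKOUT_KEY, PROMO_KEY, and SEARCH_KEY are hypothetical):

```python
def bucket(uid: int, key: int) -> int:
    """Bucket in [0, 999] for a user/flag-key pair."""
    return (uid + key) % 1000


# Hypothetical keys: two linked flags share one key, a third has its own.
CHECKOUT_KEY = 250
PROMO_KEY = 250
SEARCH_KEY = 871

uid = 12345

# Linked flags land the user in the same bucket, so at equal percentages
# they are always on or off together.
assert bucket(uid, CHECKOUT_KEY) == bucket(uid, PROMO_KEY)

# A different key shifts the same user to a different bucket, so being an
# early adopter for one flag does not make them one for the other.
assert bucket(uid, CHECKOUT_KEY) != bucket(uid, SEARCH_KEY)
```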
Here's what I'm planning to do, simplified a little:
import zlib

if request.user.is_authenticated:
    if hasattr(request.user, 'get_waffle_id'):
        uid = str(request.user.get_waffle_id())
    else:
        uid = str(request.user.pk)
else:
    # check-and-set cookie
    uid = request.COOKIES['waffleid']  # setting-controlled
# Hash the uid string to an integer so non-numeric ids work too.
ukey = zlib.crc32(uid.encode('utf-8'))
bucket = (ukey + flag.key) % 1000  # range [0, 999]
if bucket < (flag.percent * 10):
    return True
return False
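For anyone who wants to try this outside of Django, here is a framework-free sketch of the same plan. It fills in the "check-and-set cookie" step with one possible approach (a random UUID4 hex string); get_or_create_uid and flag_is_active are hypothetical names, and a plain dict stands in for request.COOKIES.

```python
import uuid
import zlib
from typing import Dict, Tuple

COOKIE_NAME = "waffleid"  # cookie name taken from the snippet above


def get_or_create_uid(cookies: Dict[str, str]) -> Tuple[str, bool]:
    """Return (uid, created). When created is True, the caller is
    responsible for setting the cookie on the response."""
    uid = cookies.get(COOKIE_NAME)
    if uid:
        return uid, False
    return uuid.uuid4().hex, True


def flag_is_active(uid: str, flag_key: int, percent: float) -> bool:
    """Deterministic bucketing: hash the uid, add the flag key, compare."""
    ukey = zlib.crc32(uid.encode("utf-8"))
    bucket = (ukey + flag_key) % 1000  # range [0, 999]
    return bucket < percent * 10


uid, created = get_or_create_uid({})  # anonymous visitor, no cookie yet
# Asking twice for the same uid and flag always gives the same answer.
assert flag_is_active(uid, 42, 10) == flag_is_active(uid, 42, 10)
```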
I tested crc32 with a million sequential integers (turned into strings) and with a million UUID4s, and its distribution was nearly perfect: in both cases it dropped a mean of 1000 users into each of 1000 buckets with a standard deviation of about 30. Good enough for me!
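A sketch of how a check like that could be reproduced. This is not the original test script; bucket_counts and summarize are hypothetical helpers, and only the sequential-integer case is shown.

```python
import statistics
import zlib


def bucket_counts(uids, buckets=1000):
    """Count how many uid strings crc32 drops into each bucket."""
    counts = [0] * buckets
    for uid in uids:
        counts[zlib.crc32(uid.encode("utf-8")) % buckets] += 1
    return counts


def summarize(n=1_000_000):
    """Return (mean, population stdev) of bucket sizes for n sequential ints."""
    counts = bucket_counts(str(i) for i in range(n))
    return statistics.mean(counts), statistics.pstdev(counts)
```

Calling summarize() takes a moment for a million ids. The mean is n / 1000 by construction, so the standard deviation is the number that actually says something about uniformity.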
I might not do the get_waffle_id() part. I was thinking about cases where the user primary key might not be string serializable, or where Waffle users might have some reason to avoid the primary key. 🤷♂️ I'll probably skip it for now and can always add it later if it comes up.
This is also going to break the set_flag utility that some folks have asked for and (presumably) used. It won't be possible to set a flag for an individual user who isn't logged in. Hmm. I'm ok with that because supporting both is a lot of complexity and this feels a lot higher impact.