django-waffle

Waffle should remember a flag value for a given user

Open jsocol opened this issue 14 years ago • 11 comments

If a user is authenticated, it doesn't make sense to flip back and forth depending on their device.

I guess a ForeignKey through a WaffleUser model that stored a boolean is the best way to do that.

jsocol, Jan 27 '11 02:01

Would it be possible to not use a foreign key, just a uuid? That way if multiple databases are needed for a large application the ForeignKey will not fail.

garrypolley, Jan 04 '12 00:01

It's certainly possible. I'm not yet sure of the right way to do this. The ForeignKey is simple, but I worry about it being slow. It's possible to use the existing Flag.users attribute, but that would probably mean making a through model with a boolean on/off value.

jsocol, Jan 09 '12 15:01

Why not use the session? It is stored in the DB by default, is sticky per user, and can be moved to the cache via settings if needed (https://docs.djangoproject.com/en/1.8/topics/http/sessions/).

iXioN, Apr 23 '15 09:04

This is explicitly about persisting beyond sessions, e.g. if a user is signed in, switching devices shouldn't change the value of the flag.

jsocol, Apr 25 '15 15:04

As an alternative to persisting the bucketing for a given user/flag pair, would it help to make the bucketing deterministic on a given user? When Waffle draws randomly to bucket the user, we could provide a way to determine a randseed based on some unchanging user data with, say, a default implementation of using the user model pk as the seed. This way when the user discards their cookie (by logging into a new browser, profile, or device) they should still get consistently bucketed in future sessions.

This would not work if the Flag percentage changed over time.
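
A minimal sketch of the seeded draw described above, in plain Python rather than Waffle's own code. The function name, the seed format, and the use of `random.Random` are assumptions for illustration only; the point is just that the same user data always produces the same draw.

```python
import random


def seeded_flag_draw(user_pk, flag_name, percent):
    """Return True if this user lands in the "on" bucket for the flag.

    Seeding a private Random instance with stable user data makes the
    draw repeatable across sessions, browsers, and devices.
    """
    # Hypothetical seed format: mixing in the flag name keeps one user's
    # draws for different flags from all being identical.
    rng = random.Random(f"{flag_name}:{user_pk}")
    return rng.uniform(0, 100) < percent


# The same inputs always give the same answer, with no cookie involved.
first = seeded_flag_draw(42, "new-checkout", 10)
second = seeded_flag_draw(42, "new-checkout", 10)
```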

jasonm, Oct 14 '16 22:10

> This would not work if the Flag percentage changed over time.

Interesting idea! Unfortunately, this is a pretty hard requirement, I think. Every time I've rolled out a feature using Waffle, it's been to an increasing percentage over time (e.g. 5-10-30-100%), or with rollout-mode (where "on" is sticky but "off" is not).

jsocol, Oct 15 '16 17:10

Hm, agreed! Yeah...

Mostly an academic exercise, as I think the complexity of the implementation (or of the resulting rules) might not be justified, but an idea:

Say waffle keeps a history of percentages as you change them over time, so waffle is able to recall the percentage for a flag at a given past date. Then, in addition to providing the user-derived randseed (e.g. user.pk), a deterministic per-user datetime is also provided (e.g. user.created), and users are bucketed using the flag percentage at that date. Their original bucketing is then deterministic, and preserved without having to persist that per-user information.
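
Roughly, the lookup could look like the sketch below. The `HISTORY` data, `percent_at`, and `is_active` are all hypothetical names, and `user_pk % 100` is only a stand-in for whatever seeded draw is actually used.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical history for one flag: (effective_from, percent), sorted by date.
HISTORY = [
    (datetime(2016, 1, 1), 5),
    (datetime(2016, 3, 1), 10),
    (datetime(2016, 6, 1), 30),
]


def percent_at(history, when):
    """Return the percentage in effect at `when`, or 0 before the first entry."""
    dates = [effective_from for effective_from, _ in history]
    index = bisect_right(dates, when)
    return history[index - 1][1] if index else 0


def is_active(user_pk, user_created, history):
    """Bucket the user against the percentage in effect when they signed up."""
    # Stand-in for the seeded draw: a stable value in [0, 100) from the pk.
    draw = user_pk % 100
    return draw < percent_at(history, user_created)
```

A user with pk 7 created in April 2016 is compared against 10%, while the same pk created in February 2016 is compared against 5%, regardless of the flag's current percentage.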

This would have its fair share of unintuitive behaviors; most notably, a site with a relatively static userbase would not be able to meaningfully change percentages after an initial flag rollout since the majority of user.created dates precede the Flag percentage change date ranges.

I think our approach to this will effectively be the FK/boolean approach where we add another model to indicate whether someone has been bucketed or not, and use the Flag.users relation to track those users in the experiment. If anyone has already released something like this or has advice on edge cases to consider, love to hear it.
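
For anyone weighing the same approach, here is a rough sketch of the "persist the bucketing" idea. It uses the standard-library `sqlite3` module instead of Django models so it runs standalone; the table name, columns, and function names are assumptions, not Waffle's schema. Storing the user ID as plain text rather than a real foreign key also sidesteps the multi-database concern raised earlier in the thread.

```python
import random
import sqlite3


def make_store():
    """Create an in-memory table holding one on/off value per (flag, user)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_flag ("
        " flag_name TEXT NOT NULL,"
        " user_id TEXT NOT NULL,"
        " active INTEGER NOT NULL,"
        " PRIMARY KEY (flag_name, user_id))"
    )
    return conn


def flag_is_active(conn, flag_name, user_id, percent):
    """Return the stored value, drawing and persisting one on first sight."""
    row = conn.execute(
        "SELECT active FROM user_flag WHERE flag_name = ? AND user_id = ?",
        (flag_name, str(user_id)),
    ).fetchone()
    if row is not None:
        return bool(row[0])  # already bucketed: the stored value wins
    active = random.uniform(0, 100) < percent
    conn.execute(
        "INSERT INTO user_flag (flag_name, user_id, active) VALUES (?, ?, ?)",
        (flag_name, str(user_id), int(active)),
    )
    return active
```

Note that in this sketch an "off" value is just as sticky as an "on" value, so raising the percentage later only affects users who have not been bucketed yet.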

jasonm, Oct 16 '16 20:10

OK I actually have a path forward here, 7 years later. It's another way to implement @jasonm's core idea of making flag buckets deterministic.

  • Let p be the percentage rollout expressed as an integer in [0, 1000].
  • Let uid be a numeric representation of the user (possibly stored in a cookie for unauthenticated users, and considering non-numeric primary keys on user models).
  • Let key be an integer associated with each flag, in some range [0, N] where N is a multiple of 1000. These can be set randomly when flags are created (and for backwards compatibility could fall back to flags' autoincrement ID values) but should be user-editable.
  • (uid + key) % 1000 is in the range [0, 999].
  • Replace the dice roll with (uid + key) % 1000 < p. We use strict inequality so that 0 < p=0 is False but 999 < p=1000 is True.

E.g. for a 10% rollout, users with (uid + key) % 1000 < 100 will be in the "on" group. If the percentage changes, no "on" users will change state, but a new set of users will become "on" (e.g. if p changes to 200, (uid + key) % 1000 < 200 will still include all the users for whom it's < 100).
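
A quick sketch of that property, using made-up sequential user IDs and an arbitrary key (both are illustrative assumptions, as is the function name):

```python
def in_on_group(uid, key, p):
    """Deterministic replacement for the dice roll; p is an integer in [0, 1000]."""
    return (uid + key) % 1000 < p


key = 137  # hypothetical per-flag key
users = range(5000)

on_at_10_percent = {u for u in users if in_on_group(u, key, 100)}
on_at_20_percent = {u for u in users if in_on_group(u, key, 200)}

# Everyone who was "on" at 10% is still "on" at 20%.
still_on = on_at_10_percent <= on_at_20_percent
```

Because the comparison is a simple threshold on a fixed per-user value, raising p can only add users to the "on" set, never remove them.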

This has a few consequences:

  • "rollout mode" is redundant and removable in 1.0—just up the percentage.
  • all per-flag waffle cookies can be replaced with no cookies for authenticated users, and a single uid value for unauthenticated users.
  • if two flags share the same key, users will be bucketed the same way for both; this allows linking experiments.
  • by randomizing key and adding it to uid, we should achieve a reasonably uniform distribution of users within the 1000 buckets, assuming a reasonably uniform set of uids, i.e.: no user should consistently end up in the alpha group for all experiments.
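
To illustrate the last two points, a small sketch with made-up keys and sequential user IDs (all values here are assumptions chosen for the example):

```python
def bucket(uid, key):
    """Map a numeric user ID and a flag key to a bucket in [0, 999]."""
    return (uid + key) % 1000


users = range(10_000)
p = 100  # 10% rollout

# Two flags sharing a key select exactly the same users.
linked_a = {u for u in users if bucket(u, 250) < p}
linked_b = {u for u in users if bucket(u, 250) < p}

# A flag with a different key selects a different slice of users.
other = {u for u in users if bucket(u, 700) < p}
```

With a plain additive key, a different key rotates the window of "on" users, so two keys that are close together modulo 1000 will still overlap for part of the rollout.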

jsocol, Mar 07 '18 18:03

Here's what I'm planning to do, simplified a little:

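import zlib

def flag_is_active_for(request, flag):  # function name is illustrative only
    # Pick a stable identifier for this user.
    if request.user.is_authenticated:
        if hasattr(request.user, 'get_waffle_id'):
            uid = str(request.user.get_waffle_id())
        else:
            uid = str(request.user.pk)
    else:
        # check-and-set cookie (the "set" half is omitted here)
        uid = request.COOKIES['waffleid']  # setting-controlled

    # crc32 turns the string ID into an evenly spread integer.
    ukey = zlib.crc32(uid.encode('utf-8'))
    bucket = (ukey + flag.key) % 1000  # range [0, 999]
    # flag.percent is in [0, 100]; scale it to the 1000 buckets.
    return bucket < (flag.percent * 10)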

I tested crc32 with a million sequential integers (converted to strings) and with a million UUID4s, and its distribution was nearly perfect: in both cases it dropped a mean of 1000 users into each of 1000 buckets, with a standard deviation of about 30. Good enough for me!
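
For anyone who wants to repeat the check, something along these lines should do it. This is a reconstruction rather than the exact script used; the function name and the default key of 0 are assumptions.

```python
import statistics
import zlib


def bucket_counts(uids, key=0, buckets=1000):
    """Count how many IDs fall into each bucket after hashing with crc32."""
    counts = [0] * buckets
    for uid in uids:
        ukey = zlib.crc32(str(uid).encode("utf-8"))
        counts[(ukey + key) % buckets] += 1
    return counts


# A million sequential integers, converted to strings before hashing.
counts = bucket_counts(range(1_000_000))
mean = statistics.mean(counts)
spread = statistics.pstdev(counts)
print(f"mean={mean:.1f} stdev={spread:.1f}")
```

Swapping `range(1_000_000)` for a generator of `uuid.uuid4()` values covers the UUID case.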

jsocol, Mar 21 '18 14:03

I might not do the get_waffle_id() part. I was thinking about cases where the user primary key might not be string serializable, or where Waffle users might have some reason to avoid the primary key. 🤷‍♂️ I'll probably skip it for now and can always add it later if it comes up.

jsocol, Mar 21 '18 14:03

This is also going to break the set_flag utility that some folks have asked for and (presumably) used. It won't be possible to set a flag for an individual user who isn't logged in. Hmm. I'm ok with that because supporting both is a lot of complexity and this feels a lot higher impact.

jsocol, Mar 21 '18 14:03