normandy icon indicating copy to clipboard operation
normandy copied to clipboard

When validating recipe arguments, "name" may not be included when expected

Open mythmon opened this issue 6 years ago • 6 comments
trafficstars

https://sentry.prod.mozaws.net/operations/normandy-stage/issues/5562863/

KeyError: 'name'
(17 additional frame(s) were not displayed)
...
  File "rest_framework/views.py", line 492, in dispatch
    response = handler(request, *args, **kwargs)
  File "rest_framework/mixins.py", line 84, in partial_update
    return self.update(request, *args, **kwargs)
  File "rest_framework/mixins.py", line 70, in update
    self.perform_update(serializer)
  File "rest_framework/mixins.py", line 80, in perform_update
    serializer.save()
  File "rest_framework/serializers.py", line 209, in save
    self.instance = self.update(self.instance, validated_data)

KeyError: 'name'

mythmon avatar Apr 18 '19 18:04 mythmon

The recipe that triggered this is https://stage-admin.normandy.nonprod.cloudops.mozgcp.net/api/v3/recipe/741/

Notably, the field recipe.arguments['name'] is !@#*(), which is likely what is triggering this.

mythmon avatar Jun 27 '19 17:06 mythmon

I found the root cause of this. Recipe 741 was once a preference experiment, and is now an add-on experiment. The latest version of the recipe is an add-on experiment, but the latest approved version is a preference experiment.

The code that does does the argument.name uniqueness check confuses latest and latest approved. It gets all recipes where the latest revision is the right action type, but then access the names from the "current revision" which is the latest approved if the recipe has been approved.

In order to unblock QA, I've approved a newer version of the recipe that is an opt-out study. This fixes the bug for me on stage. We should also change the code to prevent this in the future.

mythmon avatar Jun 27 '19 19:06 mythmon

Is this related to changing action types? Should we disallow this entirely as far down as the model?

jaredlockhart avatar Aug 19 '19 17:08 jaredlockhart

Yes, was the problem. Specifically, the problem was changing to a different action type that had incompatible argument schemas. This combined with another problem where some code confuses Recipe.latest_revision and Recipe.approved_revision. Because of this it was seeing the recipe as both types of actions depending on what property it accessed. I think that it got the action type from one revision and the arguments from another revision.

This confusion is a problem that can hurt us in other ways. Disallowing action type changes would fix this case of the problem, but that confusion is still a problem.

mythmon avatar Aug 19 '19 21:08 mythmon

@mythmon / @jaredkerim Could you elaborate on what you see the fix for this issue to be? Is it enough to prevent recipes from changing action type, i.e. duplicate of #600? Or do you want someone to do an audit of the code path that led to some code consulting both latest_revision and some consulting approved_revision?

glasserc avatar Oct 08 '19 20:10 glasserc

There is definitely a problem related to confusion between latest_revision and approved_revision that this code exposed. In general, the properties on the Recipe class that tranparently pass through to some revision of that recipe probably make things hard to get right. This problem is definitely here. We happened to find symptoms of this problem when recipes changed actions.

In my opinion, we should probably remove the pass through properties on Recipe, and make all the code that access the recipes be more explicit about which revision it is using. That's a third option that you didn't mention. Of the two, I'd like to see an audit of the code paths that got confused.

Restricting recipes from changing action types is reasonable, but when it comes to this particular error, it is papering over an underlying problem.

mythmon avatar Oct 08 '19 23:10 mythmon