strawberry icon indicating copy to clipboard operation
strawberry copied to clipboard

Fix: opentelemetry extension should handle non-primitive input parameters

Open omarzouk opened this issue 4 years ago β€’ 21 comments
trafficstars

json dump input parameter if its not a primitive

  • [X] Bugfix
  • [X] Enhancement/optimization

Issues Fixed or Closed by This PR

  • https://github.com/strawberry-graphql/strawberry/issues/1314

Checklist

  • [X] My code follows the code style of this project.
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [X] I have read the CONTRIBUTING document.
  • [ ] I have added tests to cover my changes.
  • [X] I have tested the changes and verified that they work and don't break anything (as well as I can manage).

omarzouk avatar Oct 06 '21 18:10 omarzouk

Thanks for adding the RELEASE.md file!

Here's a preview of the changelog:


This release improves how we log arguments in the OpenTelemetry extension, now values are ignored if they are None and are serialized as JSON if they are dictionaries.


Here's the preview release card for twitter:

Here's the tweet text:

πŸ†• Release (next) is out! Thanks to Omar Marzouk for the PR πŸ‘

Get it here πŸ‘‰ https://github.com/strawberry-graphql/strawberry/releases/tag/(next)

botberry avatar Oct 06 '21 18:10 botberry

Hello!

thanks for the quick response! You raise a very good point! However, we already do this for primitive data, and as it stands, this causes the tracer to print a warning message that just spams.

In a similar example to yours current code will leak a password if the input is just:

my_query(id:"abc", password:"123")

I agree with your assessment on the security risk. As such, do u think we should instead just remove tracing of inputs all together? I also agree that tracing is perhaps not the best place to have this? and users can always choose to add more tracing to their methods?

omarzouk avatar Oct 06 '21 19:10 omarzouk

another related point, as it is today, the queries are also part of the trace:

https://github.com/strawberry-graphql/strawberry/blob/main/strawberry/extensions/tracing/opentelemetry.py#L56

Meaning that if a user does not use query variables and embeds the data into the query body, it will also be leaked into the trace. But perhaps this is on the user since use of variables is always encouraged? πŸ€·β€β™‚οΈ

omarzouk avatar Oct 06 '21 19:10 omarzouk

In a similar example to yours current code will leak a password if the input is just:

my_query(id:"abc", password:"123")

However, in this case, you can filter password like this: https://strawberry.rocks/docs/extensions/opentelemetry#more-examples

The same solution could also be used to strike out a password in my prior example too I suppose.

I can absolutely see value in logging certain parameters, I think this should be a "what parameters to include" list rather than a filter list. I'll bring this up in our discord - consider joining the conversation: https://discord.gg/NsxZAWbX

ossareh avatar Oct 06 '21 20:10 ossareh

@omarzouk after discussion I agree with your point that your change doesn't make this less secure.

I'm not a core-dev; but after consideration I think your code would be fine to merge once you add a test to verify that it works and stays working.

The treatment of these fields, etc, needs to be given more consideration and can be done as a separate arc of work imho.

ossareh avatar Oct 06 '21 22:10 ossareh

Hello!

After discussion with some colleagues from the platform team at my company, they suggested that we "mask" all inputs (i.e. set them to be empty), and avoid writing the query all together.

They very much agree with your assessment of it being a security risk and advice strongly against it. It is not really the job of Tracing to debug requests. Additionally, one can pretty much understand the full extent of the execution if tracing is done in subsequent services as well, so you'd be able to know what happened. Also, a user could always log the trace_id along with more information to other logging stacks, but perhaps having it as a default in the underlying framework is too liberal of an approach.

What do u think?

omarzouk avatar Oct 07 '21 09:10 omarzouk

@omarzouk I feel like we're on the same page; there are these things to consider:

1 - tracing should log things which are helpful in performance issue diagnosis and treatment 2 - our integration shouldn't break when given an object 3 - our role of balancing utility with security

In your case, (1) can be handled by using filter_arg as documented here: https://strawberry.rocks/docs/extensions/opentelemetry#more-examples - for now you can just mask that input entirely.

Your PR solves (2) but just needs a small test case if you're willing to add that?

The third is what we're discussing on discord, which your PR brought up: thank you very much!

ossareh avatar Oct 07 '21 16:10 ossareh

@omarzouk would you be able to add a test case for this? 😊

patrick91 avatar Dec 07 '21 12:12 patrick91

@patrick91 I'm really sorry I've been incredibly busy the past months, yes I would be happy to add a test for it. I will try to do it next week

omarzouk avatar Dec 10 '21 00:12 omarzouk

Codecov Report

Merging #1315 (14c5c15) into main (6f2e120) will decrease coverage by 5.10%. The diff coverage is 100.00%.

:exclamation: Current head 14c5c15 differs from pull request most recent head 5a3025a. Consider uploading reports for the commit 5a3025a to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1315      +/-   ##
==========================================
- Coverage   98.19%   93.08%   -5.11%     
==========================================
  Files         162      165       +3     
  Lines        6492     6917     +425     
  Branches     1236     1343     +107     
==========================================
+ Hits         6375     6439      +64     
- Misses         58      418     +360     
- Partials       59       60       +1     

codecov[bot] avatar Feb 12 '22 23:02 codecov[bot]

@patrick91 Hi, sorry it took much more than a week πŸ˜… I got covid and then busy at work. However, I'm up and running again, and just pushed a test. Do u think it is enough?

omarzouk avatar Feb 12 '22 23:02 omarzouk

I have also learned that nulls should not be passed in to the set_attribute function as its behaviour is undefined, so I have added a line to skip arguments that are null https://github.com/open-telemetry/opentelemetry-specification/pull/992

omarzouk avatar Feb 12 '22 23:02 omarzouk

@patrick91 Hi! is this solution still relevant/ok? can we maybe move forward with it to get rid of the errors? or what do u think?

omarzouk avatar Jun 27 '22 11:06 omarzouk

@omarzouk would you mind adding a file RELEASE.md to your PR so that the CI pipeline passes? An instruction is in the first comment.

I would love this to get merged as I have quite many warnings of form Invalid type NoneType for attribute value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types.

adambabik avatar Sep 15 '22 13:09 adambabik

/pre-release

patrick91 avatar Sep 15 '22 16:09 patrick91

Pre-release

:wave:

Pre-release 0.131.1.dev.1663258832 [5a3025af3807c6e50b6e89f77c7dc0fd7614987a] has been released on PyPi! :rocket: You can try it by doing:

poetry add strawberry-graphql==0.131.1.dev.1663258832

botberry avatar Sep 15 '22 16:09 botberry

/pre-release

patrick91 avatar Sep 15 '22 16:09 patrick91

@adambabik would you be ok with trying this pre-release? strawberry-graphql==0.131.1.dev.1663258832 if that works for you I can merge the PR 😊

patrick91 avatar Sep 15 '22 16:09 patrick91

@patrick91 thanks! I took a closer look and tried the patch but it may result in an exception when json.dumps() fails.

To merge this PR, at the very least, we need to wrap the json.dumps() call:

try:
    v = json.dumps(value)
    span.set_attribute(f"graphql.param.{kwarg}", v)
except (TypeError, OverflowError):
    pass

With this change, we will start seeing more attributes but not all of them and we still won't have control over serialization. Having said that and also looking at the previous discussion about leaking sensitive data, I wonder if the best strategy wouldn't be to actually deny any params by default. By using arg_filter one can add params they actually need. We could also introduce arg_serializer to allow handling complex data types.

Alternatively, if we want to can keep adding params by default, we can still add arg_serializer and possibly provide a more robust default than json.dumps(). For example, put dict keys into attribute's name: f"graphql.param.{kwarg}.{key_level_1}.{key_level_2}". More about attribute's name can be read in the OpenTelemetry spec.

adambabik avatar Sep 16 '22 09:09 adambabik

@patrick91 thanks! I took a closer look and tried the patch but it may result in an exception when json.dumps() fails.

To merge this PR, at the very least, we need to wrap the json.dumps() call:

try:
    v = json.dumps(value)
    span.set_attribute(f"graphql.param.{kwarg}", v)
except (TypeError, OverflowError):
    pass

With this change, we will start seeing more attributes but not all of them and we still won't have control over serialization. Having said that and also looking at the previous discussion about leaking sensitive data, I wonder if the best strategy wouldn't be to actually deny any params by default. By using arg_filter one can add params they actually need. We could also introduce arg_serializer to allow handling complex data types.

Alternatively, if we want to can keep adding params by default, we can still add arg_serializer and possibly provide a more robust default than json.dumps(). For example, put dict keys into attribute's name: f"graphql.param.{kwarg}.{key_level_1}.{key_level_2}". More about attribute's name can be read in the OpenTelemetry spec.

I think this is a better approach, I’m not sure if I’ll have time this week to make updates, would you be able to make a PR for it? ☺️

patrick91 avatar Sep 16 '22 11:09 patrick91

@patrick91 sure, I should be able to submit a PR early next week :)

adambabik avatar Sep 16 '22 11:09 adambabik