django-redis
django-redis copied to clipboard
Security concerns about pickle for serialization
Hi!
Before you hit the close button right away, please hear me out:
I'm aware of #161 but it's at least in part about switching to JSON in particular and I think back then the msgpack serializer was not yet around...
I'm opening this ticket because using pickle means that if someone gets control over Redis that gets them arbitrary code execution in django-redis for free, just because someone didn't think or know of changing the default serializer from pickle to something else. I think it's just too crazy of a default — let's use anything but pickle for a default, please. :pray:
Thanks for your reconsideration!
Django's built-in cache functionality fails if Pickle isn't available - e,g; @cache_page doesn't work with JSON serialization. It would probably make more sense for Django to provide a JSON serializer.
Django's built-in cache functionality fails if Pickle isn't available - e,g;
@cache_pagedoesn't work with JSON serialization.
If so, maybe django-redis could have special handling to make storage and retrieval of response work.
It would probably make more sense for Django to provide a JSON serializer.
I agree that Django itself should probably also get off pickle but I don't see how it would fix the problem for django-redis for free. Am I missing something?
I agree that Django itself should probably also get off pickle but I don't see how it would fix the problem for django-redis for free. Am I missing something?
The thought is a Django-specific JSON serializer could serialize/deserialize all the cached object types that are used in Django, but I'm not really sure how feasible that would actually be - you'll find 3rd party libs use pickled cache objects too.
The fact that Django itself uses pickle on its storage backends suggests that the problem should probably addressed in Django if at all. It would make sense for someone to propose that backends accept an object that provides serialization and deserialization that would be called when either of the operations need to be performed. By default that would be pickle unless they wanted to change their default themselves, but I find that unlikely. What that would buy is that the configuration setting would be promoted to something that all backends could support and a consistent experience.
This package provides such a hook: https://github.com/jazzband/django-redis#pluggable-serializer
If you can't trust your redis server you may have bigger issues, but I respect that one would want to load data without having to worry about arbitrary execution. In that case I could see someone writing a signed serializer that stores an additional signature with the data and only if the data has been signed does it load the data from pickle. This would likely be the same as how rope handled the issue https://github.com/python-rope/rope/pull/251. If this is really a concern that's something Django should package itself and that might be accepted as a different default serializer.
Hi @terencehonles,
thanks for your reply. I think back when I opened this ticket in 2019 I did not have signature guards on the radar yet. I agree that combining pickle with a signature guard could make a great solution in general.
Signature extraction and checking would either need to happen (a) before any serializer (and hence protect and make slower not just the pickle backend) or (b) inside pickle serialziers, close to pickle, for pickle only, where protection is needed most — right? Both approaches seem reasonable, depending on threat model. settings.SECRET_KEY seems to be a natural candidate to use to generate signatures.
I had a closer look at core Django's cache backends just now and I'm a bit surprised how not orthogonal and not easy to override the use of pickle (and zlib) is in
Wouldn't it be awesome to just override a method an replace (zlib by something faster and) pickling by pickling with authentication, rather than forking the whole backend class and tracking their changes to stay up to date with bugfixes?
So if we step back to ecosystem view: if the community and Django core wants approach (a) then there is one place to fix things — in core Django — and that's it. If the community wants appraoch (b) then django-redis could maybe offer two copies of PickleSerializer — SecurePickleSerializer and InsecurePickleSerializer — one doing signature checks and being default and the other not doing that and having people buy into their threat model, consciously. If a cache is a true cache in the sense of "data that can fully be regenerated" then I wouldn't even worry much about not being able to decode existing cache entries on software upgrades, because it's "just cached data".
@terencehonles, what do you think?
I agree with the general assessment and surprise. I'm not in the same position regarding pickle since it has a lot of utility, but checking the signed data wouldn't be that slow and it would still allow serializing anything that was previously allowed (pickle does have it's constraints, but that's higher than JSON / msgpack at the expense of security).
I would prefer if this was vetted against the Django project and in that case this project would offer neither. It would consume implementations from the parent project.
I would prefer if this was vetted against the Django project and in that case this project would offer neither. It would consume implementations from the parent project.
Do you have the resources to bring it up in Django, provide a pull request, and go through the review-fixing loop with them? I'm afraid I don't.
I don't believe I do, or at least not at the moment, and this is not a high priority problem for me because where I am running redis it is in my full control and a take over of redis would likely provide bigger issues elsewhere.
If you don't have the time to bring it up with Django then this likely is a "won't fix", but I will leave this issue open in case your situation changes or someone else wants to champion the cause.
drf Response is not ok while I set "SERIALIZER": "django_redis.serializers.json.JSONSerializer", and use @method_decorator(cache_page(30)) just like drf doc
TypeError: Object of type 'Response' is not JSON serializable
change to pickle serializer is ok