skops icon indicating copy to clipboard operation
skops copied to clipboard

Secure persistence: exclude (certain) built in functions

Open BenjaminBossan opened this issue 3 years ago • 10 comments

For secure persistence, we need to ensure that certain builtin functions cannot be loaded, such as eval, exec and functions from urllib, os, sys...

BenjaminBossan avatar Sep 16 '22 16:09 BenjaminBossan

I think more that "cannot be loaded", we should only allow everything if the user says they're trusted, except a few categories of methods which we trust.

adrinjalali avatar Sep 27 '22 06:09 adrinjalali

Yes, "cannot be loaded" unless the user says so. How granular do you want that to be, on a module level or should the user indicate individual functions?

BenjaminBossan avatar Sep 27 '22 07:09 BenjaminBossan

That's something we'd need to figure out while working on the audit part. Since it doesn't make sense to load some functions and not load others, it's probably sufficient to fail, show risky sections, and ask the user to load the model with trusted=True or something.

We can also think of a way for users to allow certain modules/libraries/functions, but I'd say that has a lower priority.

adrinjalali avatar Sep 27 '22 09:09 adrinjalali

Since it doesn't make sense to load some functions and not load others

Could you please elaborate on that? Do you mean that if a user would allow one builtin_function_or_method, they might as well allow all?

BenjaminBossan avatar Sep 27 '22 11:09 BenjaminBossan

I mean, we're not going to do partial load of a model. So we either fail or pass. So from the user's perspective, I was thinking:

  • load(file): passes or fails, if it fails, we tell them why
  • load(file, trusted=True): passes and loads

We can however think of doing something like:

  • load(file, trusted_libs=...): if it fails, there are other reasons for it

But I'm a bit wary of that to start with, cause one can still do exploits using existing libraries. So at least as a start, I was thinking of brutally failing, and telling users why, and once we know more about potential exploits, we can allow more granular acceptance of modules/functions/etc?

adrinjalali avatar Sep 29 '22 12:09 adrinjalali

We can however think of doing something like:

* load(file, trusted_libs=...): if it fails, there are other reasons for it

To me, that sounds very useful, I'd be much more inclined to trust a known, widely used library than something I've never heard of.

cause one can still do exploits using existing libraries

Then we should address those exploits. Maybe we should have an issue listing all known exploits?

BenjaminBossan avatar Sep 29 '22 13:09 BenjaminBossan

I agree we should address the exploits, and that's why I started by fixing the one in joblib (merged and released). And I'm not saying we shouldn't support the API to trust particular libraries, I'm must saying we should probably wait to learn more before we do it.

adrinjalali avatar Sep 29 '22 14:09 adrinjalali

And I'm not saying we shouldn't support the API to trust particular libraries

Sorry, I didn't mean to imply that, I just wanted to give my opinion from a user's perspective.

I agree we should address the exploits

Do we want to document them somewhere? Thinking about it, maybe a non-public place would be better.

BenjaminBossan avatar Sep 29 '22 15:09 BenjaminBossan

Do we want to document them somewhere? Thinking about it, maybe a non-public place would be better.

Some of them don't need to be private, like removing __reduce__ from sklearn estimators. But for others, we can use "Security Advisories" on github: https://docs.github.com/en/code-security/repository-security-advisories/about-github-security-advisories-for-repositories. I haven't used this feature, but it seems it's exactly what it's for.

adrinjalali avatar Sep 30 '22 08:09 adrinjalali

But for others, we can use "Security Advisories" on github

That seems to be almost overkill for now :D I just want a reminder, so that we don't forget about security issues we found.

BenjaminBossan avatar Sep 30 '22 09:09 BenjaminBossan

We now raise on everything which is not trusted by default, and that includes insecure builtin functions. I think we can close this now.

adrinjalali avatar Nov 29 '22 15:11 adrinjalali