skops Secure persistence: exclude (certain) built in functions

For secure persistence, we need to ensure that certain builtin functions cannot be loaded, such as eval, exec and functions from urllib, os, sys...

Sep 16 '22 16:09 BenjaminBossan

I think more that "cannot be loaded", we should only allow everything if the user says they're trusted, except a few categories of methods which we trust.

Sep 27 '22 06:09 adrinjalali

Yes, "cannot be loaded" unless the user says so. How granular do you want that to be, on a module level or should the user indicate individual functions?

Sep 27 '22 07:09 BenjaminBossan

That's something we'd need to figure out while working on the audit part. Since it doesn't make sense to load some functions and not load others, it's probably sufficient to fail, show risky sections, and ask the user to load the model with trusted=True or something.

We can also think of a way for users to allow certain modules/libraries/functions, but I'd say that has a lower priority.

Sep 27 '22 09:09 adrinjalali

Since it doesn't make sense to load some functions and not load others

Could you please elaborate on that? Do you mean that if a user would allow one builtin_function_or_method, they might as well allow all?

Sep 27 '22 11:09 BenjaminBossan

I mean, we're not going to do partial load of a model. So we either fail or pass. So from the user's perspective, I was thinking:

load(file): passes or fails, if it fails, we tell them why
load(file, trusted=True): passes and loads

We can however think of doing something like:

load(file, trusted_libs=...): if it fails, there are other reasons for it

But I'm a bit wary of that to start with, cause one can still do exploits using existing libraries. So at least as a start, I was thinking of brutally failing, and telling users why, and once we know more about potential exploits, we can allow more granular acceptance of modules/functions/etc?

Sep 29 '22 12:09 adrinjalali

We can however think of doing something like:
* load(file, trusted_libs=...): if it fails, there are other reasons for it

To me, that sounds very useful, I'd be much more inclined to trust a known, widely used library than something I've never heard of.

cause one can still do exploits using existing libraries

Then we should address those exploits. Maybe we should have an issue listing all known exploits?

Sep 29 '22 13:09 BenjaminBossan

I agree we should address the exploits, and that's why I started by fixing the one in joblib (merged and released). And I'm not saying we shouldn't support the API to trust particular libraries, I'm must saying we should probably wait to learn more before we do it.

Sep 29 '22 14:09 adrinjalali

And I'm not saying we shouldn't support the API to trust particular libraries

Sorry, I didn't mean to imply that, I just wanted to give my opinion from a user's perspective.

I agree we should address the exploits

Do we want to document them somewhere? Thinking about it, maybe a non-public place would be better.

Sep 29 '22 15:09 BenjaminBossan

Do we want to document them somewhere? Thinking about it, maybe a non-public place would be better.

Some of them don't need to be private, like removing __reduce__ from sklearn estimators. But for others, we can use "Security Advisories" on github: https://docs.github.com/en/code-security/repository-security-advisories/about-github-security-advisories-for-repositories. I haven't used this feature, but it seems it's exactly what it's for.

Sep 30 '22 08:09 adrinjalali

But for others, we can use "Security Advisories" on github

That seems to be almost overkill for now :D I just want a reminder, so that we don't forget about security issues we found.

Sep 30 '22 09:09 BenjaminBossan

We now raise on everything which is not trusted by default, and that includes insecure builtin functions. I think we can close this now.

Nov 29 '22 15:11 adrinjalali

skops skops copied to clipboard

Secure persistence: exclude (certain) built in functions

skops
skops copied to clipboard