fickling icon indicating copy to clipboard operation
fickling copied to clipboard

Feat: add vetted properties

Open McPatate opened this issue 4 years ago • 9 comments

As mentioned in #14, here is a first attempt at implementing an option for us to allow some library calls.

For the moment, this is a very manual process in the sense that we have to create a list of modules and the function calls that are ok by hand.

If you have any idea on improving this, your contribution is more than welcome !

McPatate avatar Feb 18 '22 16:02 McPatate

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Feb 18 '22 16:02 CLAassistant

I'm not sure if feasibly we can vet functions/methods in the libraries. I think vetting the library would be enough? We could have unsafe parts of a library, but I'm not sure if we want to include all methods and classes from the libraries we trust here.

I understand this is cumbersome. I'm not sure how to link a function call to its module. I'm currently looking in the ASTProperties object of Pickled which contains imports and calls as two separate lists. It may be possible by parsing the AST, I have to dig deeper.

Also, I think another useful thing would be to be able to get a list of those libraries which are used/imported by the pickle file, for downstream analysis.

To achieve this :

from fickling.pickle import Pickled
import pickle

pkl = b'cbuiltins\neval\n(Vprint("hello")\ntR.'
p = Pickled.load(pkl)
imports = [imp.module for imp in p.properties.imports]
print(imports)

Do you think it's worth creating a function that returns imports ? Or just adding it to the cli suffices for now (imo should be the matter of another PR) ?

McPatate avatar Feb 22 '22 13:02 McPatate

Do you think it's worth creating a function that returns imports ? Or just adding it to the cli suffices for now (imo should be the matter of another PR) ?

Fair enough, no need to add it.

adrinjalali avatar Feb 24 '22 09:02 adrinjalali

I'm not 100% convinced yet that any imports are safe. For example, numpy is not, in general. At a minimum, I think fickling should print out a warning if any vetted imports or calls are provided. Something like, "Warning: This pickle file imports external module [X] which was added to the vetted module list."

I'd be happy with that.

numpy itself is not safe. For example: https://numpy.org/doc/stable/reference/generated/numpy.distutils.exec_command.html

How are vetted_dependencies and vetted_calls intended to interact? Are ndarray and dtype the only calls within numpy that are allowed? This should be documented somewhere in the README. I'm happy to help with that.

Fair point. But I'm not sure if vetting is in the scope of fickling. I was thinking of fickling just to allow people to add allowed modules/methods. We could probably have a separate library to have suggestions for vetted sections of different libraries and hope people would contribute to them?

adrinjalali avatar Feb 28 '22 16:02 adrinjalali

As @adrinjalali said, the idea is indeed to let the user add a list of libraries and function calls within it as "vetted".

I also do not think fickling should vet any particular library, this should be at the discretion of users (though you could have a "community vetted librairies" example or document, but that doesn't seem safe to me).

The only problem I see with the current implementation is that there is no link between function calls and the module they're from, e.g. dtype could come from another imported module and not be safe.

Are ndarray and dtype the only calls within numpy that are allowed?

That was my idea, yes. I think this way you avoid allowing numpy calls that execute arbitrary code as you showed.

McPatate avatar Mar 01 '22 10:03 McPatate

@ESultanik, @adrinjalali

I think this is better to flag authorized imports and their respective calls. That being said, is_likely_safe will still return False for the moment. I need to figure out a way to make sure that when we have ast.Call objects, that the actual function call is for an imported function (that can be checked with has_unvetted_dependencies) and not some user defined function (which I think is unsafe ?).

McPatate avatar Mar 07 '22 14:03 McPatate

I'm not 100% convinced yet that any imports are safe. For example, numpy is not, in general.

I think it's even worse than that:

import numpy as np
np.sys.modules["os"].system("echo hello")
getattr(locals()["np"], "sys").modules["os"].system("echo hello") # In case you imagined that you could saveguard in anyform this.

Any module that ever import os or subprocess is not safe to import.

Narsil avatar Mar 07 '22 16:03 Narsil

I think I may have misunderstood this PR the first time I considered it. I'm willing to reconsider adding an --allow-list of imports to ignore, as long as there are none enabled by default. I am going to extend this branch to enable their specification via the command line.

ESultanik avatar Sep 07 '22 15:09 ESultanik

Nice ! Feel free to refactor everything if it doesn't behave in a way that you find appropriate.

This was mostly an example of what we could do !

McPatate avatar Sep 07 '22 16:09 McPatate