
Validate explain parameters

Open · sakoush opened this issue 3 years ago • 5 comments

We might also do some validation of the explain parameters that are being passed from config. Currently, I guess, the explainer will throw an error, but perhaps we could have some simple validation that the parameters are of a valid type and contain valid data?

I was thinking that this could be a class method on each explainer, i.e. something like AnchorImage.validate_parameters? This could be called early in the process, without having to instantiate the object, so that nonsensical parameters are caught upfront; failing only at instantiation might be a bit late in a deployment scenario.
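A rough sketch of what I have in mind (the validate_parameters classmethod is hypothetical, not existing alibi API; threshold just stands in for a real explain parameter):

class AnchorImage:
    @classmethod
    def validate_parameters(cls, **params) -> None:
        """Cheap type/value checks, callable without instantiating the explainer."""
        threshold = params.get('threshold', 0.95)  # hypothetical example parameter
        if not isinstance(threshold, float) or not 0. <= threshold <= 1.:
            raise ValueError(f'threshold must be a float in [0, 1], got {threshold!r}')

# at the application layer, before any expensive instantiation:
AnchorImage.validate_parameters(threshold=0.95)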

sakoush · Sep 24 '21 14:09

Interesting proposal on using a class method, I wonder how that would work? Typically, validation of arguments would happen inside the object's methods (in the case of explainers: __init__, fit and explain). Or were you thinking of having e.g. AnchorImage.validate_parameters called by the application layer rather than within alibi on creation of the object? In that case it almost feels like it makes sense to define it in the application layer, since the library itself wouldn't be using it on object creation.

I think we also need to delineate between __init__, fit and explain parameters. Before instantiating the object we can really only validate those passed to __init__. For fit we can do the same, and additionally check that fit has been called before explain. For explain, however, we won't know the parameters until runtime, so we can only validate them then (which, in a production scenario, means after deployment).
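To make the three validation points concrete, a minimal sketch (SomeExplainer and its parameters are hypothetical placeholders):

import numpy as np

class SomeExplainer:
    def __init__(self, threshold: float = 0.95) -> None:
        # __init__ args: validated at instantiation time
        if not 0. <= threshold <= 1.:
            raise ValueError('threshold must lie in [0, 1]')
        self._fitted = False

    def fit(self, X: np.ndarray) -> None:
        # fit args: validated when fit is called
        if X.ndim != 2:
            raise ValueError('X must be 2-dimensional')
        self._fitted = True

    def explain(self, X: np.ndarray) -> None:
        # explain args: only known at runtime; also check fit has been called
        if not self._fitted:
            raise RuntimeError('explain called before fit')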

jklaise · Sep 24 '21 14:09

One way of approaching this is using pydantic's validation decorator: https://pydantic-docs.helpmanual.io/usage/validation_decorator/

This has some nice features:

  • Uses the method's type hints as the schema against which input arguments are parsed and validated
  • Enables validation without calling the method

A couple of things to think about:

  • For complex types (e.g. np.ndarray) we would need to define our own types so that pydantic can parse/validate these (I need to check how far we can take this)
  • Related to above, actual validation is technically not supported: https://pydantic-docs.helpmanual.io/usage/validation_decorator/#config-and-validators
  • pydantic is ultimately a parsing library, not a validation library: given a data model (in this case derived from method type hints), it coerces input args into the expected types and raises an error if it can't do so. Due to the point above, it does not facilitate custom validation (e.g. "this array must be 3-dimensional"). Given this, is there any advantage over rolling our own validation solution?
  • If going with our own validation in the usual way (calling a custom validation function on entering a method), it would add some overhead to calls where the data doesn't need to be validated, although this should be negligible.
  • More importantly, if we want to be able to validate method arguments without calling the method, this may need to be implemented in some other way, e.g. as a class method AnchorImage.validate_parameters(method='explain', **params) or (similar to the pydantic solution) AnchorImage.explain.validate_parameters(**params). Is something like this a decent design choice? Or do we care that validation may only happen once the underlying method is called? And if validation also runs at method call time by default, would we be duplicating validation calls?
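To make the pydantic route concrete, here's a minimal sketch (pydantic v1; explain, threshold and batch_size are hypothetical stand-ins, not actual alibi API). If I read the docs right, the decorated function exposes a validate method that runs parsing/validation without executing the body:

from pydantic import ValidationError, validate_arguments

@validate_arguments
def explain(threshold: float = 0.95, batch_size: int = 100) -> None:
    ...  # heavy computation would go here

explain(threshold='0.9')  # no error: the string '0.9' is silently coerced to the float 0.9

try:
    explain.validate(threshold='not-a-float')  # validates without executing the body
except ValidationError as e:
    print(e)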

@sakoush @ascillitoe would be good to hear your thoughts.

jklaise · Oct 19 '21 09:10

The point about pydantic not being a validation library is a good one. I can foresee quite a few pitfalls due to this, e.g. not validating array dimensions, and not catching incorrect types because they are parsed OK (int coerced to str, etc.). That being said, pydantic "validation" seems so easy to implement that I don't see much harm in adding it to catch the really obvious issues, apart from it being another dependency?

I can't really think of an easy, all-encompassing validation solution, but one direction we could take is to better separate the initial validation checks from the expensive instantiation tasks. In the case of alibi-detect this could mean moving calibration to a .fit() or .calibrate() method, so that instantiating (and validating) the detector is cheap (I'm not so sure what this would look like for alibi explainers). Your final bullet is perhaps somewhat related to this?

ascillitoe · Oct 19 '21 09:10

P.s. my comment above primarily relates to validating __init__ args. @jklaise your final bullet could work nicely for validating fit and explain args. But, to avoid adding numerous extra class methods for validation, could we instead have a validate kwarg on __init__, fit and explain? If validate==True, we do some bespoke validation and raise a custom ValidationError exception if there are any issues; we would only go on to do the heavy lifting in these methods once validation has passed (or been skipped with validate!=True). This would involve a non-trivial amount of boilerplate code though...
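A rough sketch of the pattern I mean (all names hypothetical):

import numpy as np

class ValidationError(Exception):
    """Custom exception raised when bespoke validation fails."""

class SomeExplainer:
    def fit(self, X: np.ndarray, validate: bool = True) -> 'SomeExplainer':
        if validate:
            self._validate_fit_args(X)
        # heavy lifting happens only after validation has passed (or been skipped)
        return self

    @staticmethod
    def _validate_fit_args(X: np.ndarray) -> None:
        if not isinstance(X, np.ndarray) or X.ndim != 2:
            raise ValidationError('X must be a 2-dimensional np.ndarray')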

ascillitoe · Oct 19 '21 10:10

I've been investigating beartype as it seems to do the two things we want better than any other solution:

  • validate argument types at runtime (at little-to-no performance cost!)
  • be able to do custom/more complex validation on certain types (e.g. an np.ndarray that needs to be 3-dimensional).

To gain basic argument validation at runtime would be as easy as decorating all public interface functions (so mostly __init__, fit and explain of explainers) with @beartype. That being said, deep type-checking of certain Python containers (such as typing.Dict) and callables (such as typing.Callable) is not yet supported, although it is on the roadmap: https://github.com/beartype/beartype/issues/53. Taking a concrete example, AnchorTabular:

from typing import Callable, Dict, List
import numpy as np
from beartype import beartype
from alibi.api.interfaces import Explainer, FitMixin  # alibi's base classes

class AnchorTabular(Explainer, FitMixin):

    @beartype
    def __init__(self,
                 predictor: Callable[[np.ndarray], np.ndarray],
                 feature_names: List[str],
                 categorical_names: Dict[int, List[str]] = None,
                 ohe: bool = False,
                 seed: int = None) -> None:
        ...  # rest of __init__ unchanged

The following call will succeed because of the shallow type-checking of predictor and categorical_names:

def predictor(x: int) -> int: return x  # wrong signature, but passes: Callable args are only shallowly checked
feature_names = []  # deeply type-checked, but an empty list trivially satisfies List[str]; e.g. [1] would fail
categorical_names = {'random': 0, 1: 'random'}  # wrong key/value types, but passes: Dict is only shallowly checked
AnchorTabular(predictor, feature_names, categorical_names)

Main question then is, is this good enough for the time being (@sakoush)?

Doing deep type-checking on containers at minimal performance cost is a hard problem, but beartype may get this in the future.

On the other hand, does it matter whether we explicitly check that e.g. categorical_names is of type Dict[int, List[str]]? If it does, then the deep check would need to be implemented in a way that can be bypassed (e.g. as a classmethod, following the discussion above) when not strictly needed: checking that all keys of a Dict are int and all values are List[str] is prohibitively expensive, which is why beartype, when/if it supports deep type-checking of Dict, will do so statistically, sampling items to guarantee O(1) performance at call time.

That being said, I believe the above deep type-checking can already be achieved with custom beartype validators, at the expense of obscuring the expected type behind a custom alias, i.e. replacing Dict[int, List[str]] with some type DictIntToListStr = specific_beartype_validator (see the sketch below). Imo this is undesirable, as the public API documentation becomes littered with custom types.
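For reference, a minimal sketch of such a validator using beartype.vale.Is with typing.Annotated (NDArray3D is a made-up alias; the ndim check mirrors the 3-dimensional array example above):

from typing import Annotated  # Python 3.9+; typing_extensions.Annotated on older versions
import numpy as np
from beartype import beartype
from beartype.vale import Is

# custom validator type: an np.ndarray that must be 3-dimensional
NDArray3D = Annotated[np.ndarray, Is[lambda arr: arr.ndim == 3]]

@beartype
def explain(image: NDArray3D) -> None:
    ...

explain(np.zeros((28, 28, 3)))  # passes
explain(np.zeros((28, 28)))     # raises a beartype type-checking exception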

jklaise · Oct 20 '21 15:10