alibi-detect
Handle cases where data provided to predict has a different dtype to the reference data
In some circumstances, the data provided to detector predict methods may inadvertently be of a different dtype to the previously provided reference data. This can lead to unclear errors, for example:
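import numpy as np
from alibi_detect.cd import MMDDrift

x_ref = np.random.normal(size=(100, 5)).astype(np.float32)  # float32 reference data
cd = MMDDrift(x_ref)
cd.predict(x_ref.astype(np.float64))  # float64 input triggers the error below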
leads to:
tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a float tensor but is a double tensor [Op:AddV2]
To avoid unclear errors, we could handle this in a number of ways:
- Attempt to cast data provided to predict to the same dtype as self.x_ref. This could potentially have unintended consequences.
- Raise a descriptive warning (or error) if the dtypes don't match.
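As a rough illustration of the second option, here is a minimal sketch of a dtype check that raises a descriptive error or, optionally, only a warning. The names check_dtype and DtypeMismatchError are hypothetical and not part of alibi-detect; the sketch assumes NumPy array inputs and never casts anything itself.

```python
import warnings

import numpy as np


class DtypeMismatchError(TypeError):
    """Hypothetical exception for illustration; not part of alibi-detect."""


def check_dtype(x, x_ref, strict=True):
    """Compare the dtype of x with that of x_ref without casting either.

    Raises DtypeMismatchError when strict is True, otherwise emits a
    UserWarning. Returns None when the dtypes match.
    """
    if x.dtype == x_ref.dtype:
        return
    msg = (
        f"Data passed to predict has dtype {x.dtype}, but the reference "
        f"data has dtype {x_ref.dtype}. Cast the data explicitly before "
        "calling predict."
    )
    if strict:
        raise DtypeMismatchError(msg)
    warnings.warn(msg, UserWarning)


x_ref = np.random.normal(size=(100, 5)).astype(np.float32)
check_dtype(x_ref, x_ref)  # matching dtypes: passes silently
try:
    check_dtype(x_ref.astype(np.float64), x_ref)
except DtypeMismatchError as err:
    print(err)  # readable message instead of a backend-specific error
```

Leaving the cast to the caller keeps the check free of side effects, which sidesteps the unintended consequences mentioned under the first option.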
Opening this for discussion. @jklaise @arnaudvl @mauicv
I would say we check the dtype for the predict method and if it doesn't match the reference dtype we raise a custom error. I agree that casting ourselves is potentially dangerous.
Agreed. It might even be best to just raise a warning, as many detectors do still work if dtypes don't match (e.g. those without a backend). I think @arnaudvl is in favour of avoiding any explicit casting too.
I think we should be pretty strict, at least in the case of the detectors with a backend, as we know they will fail. In those cases it's better to raise a readable custom exception quickly rather than rely on logging, as that would result in the unreadable error being raised anyway, which is not good for downstream applications, especially ones that don't necessarily configure logging.
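One way to picture the combined policy from this thread (fail fast for detectors with a backend, warn otherwise) is the sketch below. ToyDetector, its has_backend flag, and its placeholder statistic are all hypothetical stand-ins for illustration and do not reflect how alibi-detect detectors are implemented.

```python
import warnings

import numpy as np


class ToyDetector:
    """Hypothetical stand-in for a drift detector; not an alibi-detect class."""

    def __init__(self, x_ref, has_backend):
        self.x_ref = x_ref
        self.has_backend = has_backend  # assumed flag: True if a backend is used

    def _check_dtype(self, x):
        if x.dtype == self.x_ref.dtype:
            return
        msg = (
            f"predict received dtype {x.dtype}, expected {self.x_ref.dtype} "
            "(the dtype of the reference data)."
        )
        if self.has_backend:
            # Backend detectors are known to fail, so raise early and readably.
            raise TypeError(msg)
        # Detectors without a backend may still work, so only warn.
        warnings.warn(msg, UserWarning)

    def predict(self, x):
        self._check_dtype(x)
        # Placeholder statistic standing in for a real drift test.
        return float(np.abs(x.mean() - self.x_ref.mean()))


x_ref = np.zeros((10, 2), dtype=np.float32)
x64 = x_ref.astype(np.float64)

ToyDetector(x_ref, has_backend=False).predict(x64)  # warns, then continues
try:
    ToyDetector(x_ref, has_backend=True).predict(x64)
except TypeError as err:
    print(err)
```

Running the check at the top of predict means the readable exception is raised before any backend operation is attempted, regardless of whether logging is configured.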