LogTransformer can raise `TypeValidationError` if integer nullable y is passed in

Open tamargrey opened this issue 2 years ago • 1 comments

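    import pandas as pd
    import woodwork as ww
    from evalml.pipelines.components import LogTransformer

    X = pd.DataFrame({
        "nullable bool col": [True, False, False, True, True] * 4,
        "nullable int col": [0, 1, 2, 0, 3] * 4,
    })
    X.ww.init()
    # Target with min > 0, typed as a nullable integer
    y = pd.Series([1, 3] * (len(X) // 2))
    y = ww.init_series(y, logical_type="IntegerNullable")
    comp = LogTransformer()
    comp.fit(X, y)
    comp.transform(X, y)  # raises TypeValidationError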

Note - this doesn't happen if the data is `[0, 1]` instead of `[1, 3]`, and I'm not sure why, but I wonder if it is somehow related to booleans?

This can be fixed by replacing this line with a call to `ww.init_series`.

tamargrey, Feb 02 '23 19:02

I think this only happens if y's min value is greater than 0. In that case it's the call to `y_ww.apply(np.log)` that must maintain the nullable type, since the apply above it (which otherwise gets rid of the nullable type) doesn't run.
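To make the branch behaviour concrete, here is a rough pure-Python sketch of the control flow described in the comment above. It is not the evalml implementation: `log_transform_target` is a hypothetical stand-in, and both the `min <= 0` condition and the exact shift formula are assumptions inferred from the comment. It only shows which inputs would pass through the first apply before the log is taken.

```python
import math


def log_transform_target(y):
    """Hypothetical stand-in for the target transform discussed above.

    Returns the log-transformed values and whether the shift step ran.
    """
    y_min = min(y)
    shifted = False
    if y_min <= 0:
        # Assumed first apply: shift values so that all are strictly positive.
        # Per the comment above, this is the step that drops the nullable type.
        y = [v + abs(y_min) + 1 for v in y]
        shifted = True
    # Second apply: take the log. When the shift was skipped, this step
    # receives the original (nullable-typed) data directly.
    return [math.log(v) for v in y], shifted


_, shifted_01 = log_transform_target([0, 1] * 10)
_, shifted_13 = log_transform_target([1, 3] * 10)
print(shifted_01)  # True: min is 0, so the shift step runs first
print(shifted_13)  # False: min is 1, so only the log step runs
```

If that reading is right, it would explain why `[0, 1]` works while `[1, 3]` fails: only the second case reaches the log step with the nullable type still attached.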

tamargrey, Feb 24 '23 18:02