autofeat Input contains NaN, infinity or a value too large for dtype('float32') on fit

Input contains NaN, infinity or a value too large for dtype('float32') on fit_transform

Open nickjtch opened this issue 2 years ago • 1 comments

Facing the following issue when running AutoFeatRegressor.fit_transform(featuresDf, targetFeature):

I have already checked for infinite and NaN values, and I also converted everything to float32. Any pointers? Thanks!
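For reference, a check of this kind can be sketched as below. This is only an illustration, assuming an all-numeric pandas DataFrame; the helper name `report_non_finite` is hypothetical and is not part of autofeat or scikit-learn. It casts to float32 first, so values that overflow float32 are counted as well.

```python
import numpy as np
import pandas as pd


def report_non_finite(df: pd.DataFrame) -> pd.Series:
    """Count NaN/inf entries per column after casting to float32.

    Hypothetical helper: values too large for float32 become inf in the
    cast, so they are counted alongside NaN and inf.
    """
    # Silence the overflow warning that the cast itself may raise.
    with np.errstate(over="ignore"):
        values = df.to_numpy(dtype=np.float64).astype(np.float32)
    counts = (~np.isfinite(values)).sum(axis=0)
    return pd.Series(counts, index=df.columns)
```

A column with a non-zero count here would trigger the same ValueError in scikit-learn's input validation. Note that a clean result on the raw inputs does not rule out non-finite values appearing in intermediate computations.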

Update: tried the same set of inputs with FeatureSelector and everything is working great.

Update 2: posted this question on StackOverflow

Exception

c:\dox\rnd\ml-pipeline-notebooks\modules\autoFeat.py in augmentFeatures(features, targetFeature, verbose)
     41     print(featuresDf.head())
     42     print(targetFeature)
---> 43     newFeatureDf = autoFeatRegressor.fit_transform(featuresDf, targetFeature)
     44 
     45     if verbose:

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
    112         ):
    113             type_err = "infinity" if allow_nan else "NaN, infinity"
--> 114             raise ValueError(
    115                 msg_err.format(
    116                     type_err, msg_dtype if msg_dtype is not None else X.dtype

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

Logs (verbose)

Feb 10 '22 10:02 nickjtch

From the StackOverflow question it seems the problem was because one of your original input features had a standard deviation of 0 (i.e., all the same values)? This would also explain the RuntimeWarning shown above when dividing by the stddev.

I guess it might make sense to add a check for zero-variance features somewhere and exclude them.

Feb 10 '22 22:02 cod3licious
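Until such a check exists in the library, zero-variance features can be excluded before calling fit_transform. The following is a minimal sketch, assuming the features are in a pandas DataFrame; `drop_constant_columns` is a hypothetical helper name, not an autofeat function.

```python
import pandas as pd


def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without columns that hold a single distinct value.

    Hypothetical helper: a column with one distinct value has a standard
    deviation of 0, which breaks any later division by the stddev.
    """
    constant = [col for col in df.columns if df[col].nunique(dropna=False) <= 1]
    return df.drop(columns=constant)
```

scikit-learn's `VarianceThreshold` (with its default threshold of 0) removes the same kind of feature, but it returns a NumPy array by default rather than a DataFrame with column names.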

I think you can apply a scaling method to your data; it works for me. I prefer scalers such as StandardScaler or MinMaxScaler (of the two, only MinMaxScaler maps values into a bounded range). You can also try RobustScaler, but it may still produce extremely large values.

It would also be good to clip the features before applying scaling, but it works without clipping anyway.

Oct 10 '23 08:10 ahmed-kamal91
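The clip-then-scale suggestion could look roughly like this. It is a sketch under assumptions: the 1st/99th percentile bounds and the helper name `clip_and_scale` are illustrative choices, not something recommended by autofeat.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def clip_and_scale(df: pd.DataFrame, lower_q: float = 0.01, upper_q: float = 0.99) -> pd.DataFrame:
    """Clip each column to its own quantile range, then scale it to [0, 1].

    Hypothetical helper; the quantile bounds are arbitrary defaults.
    """
    lower = df.quantile(lower_q)  # per-column lower bound
    upper = df.quantile(upper_q)  # per-column upper bound
    clipped = df.clip(lower=lower, upper=upper, axis=1)
    scaled = MinMaxScaler().fit_transform(clipped)
    return pd.DataFrame(scaled, columns=df.columns, index=df.index)
```

The returned DataFrame would then be passed to AutoFeatRegressor.fit_transform in place of the raw features. Scaling alone does not remove a constant column, so this is a complement to the zero-variance check rather than a replacement for it.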
