ramp-workflow

Support for pandas dataframes in workflows

Open · h2o64 opened this issue 2 years ago • 1 comment

Currently, the Classifier and Regressor workflows don't accept pandas dataframes as inputs.

Indeed, in train_submission the arrays are indexed with the train indices using plain bracket indexing. On a DataFrame this selects columns instead of rows, which leads to the following error message:

KeyError: "None of [Int64Index([ 256,  127,  753,  439,  825, 1167,  786, 1689, 1615,  675,\n            ...\n             721, 1064,  696, 1122,  632, 1103,  406, 1029, 1750,  975],\n           dtype='int64', length=1451)] are in the [columns]"
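For illustration, here is a minimal sketch that reproduces this kind of error outside of ramp-workflow and shows one possible way to select rows by position for both input types. The helper `select_rows`, the toy column names and the index values are hypothetical and not part of the library.

```python
import numpy as np
import pandas as pd

# Toy data; column names and values are made up for this sketch.
X = pd.DataFrame({"a": [10, 20, 30, 40], "b": [1.0, 2.0, 3.0, 4.0]})
train_is = np.array([2, 0, 3])  # positional row indices, as in a CV split


def select_rows(X, idx):
    """Select rows by position for both DataFrames and NumPy arrays."""
    if hasattr(X, "iloc"):
        # DataFrame / Series: .iloc indexes rows by position
        return X.iloc[idx]
    # NumPy array: plain indexing already selects rows
    return X[idx]


# Plain bracket indexing treats the integers as column labels.
try:
    X[train_is]
    raised = False
except KeyError:
    raised = True

rows_df = select_rows(X, train_is)             # DataFrame with rows 2, 0, 3
rows_np = select_rows(X.to_numpy(), train_is)  # ndarray with the same rows
```

With a helper like this, the same indexing code path would work whether the workflow receives a DataFrame or an array.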

h2o64 · Mar 13 '22 20:03

Usually, a pandas DataFrame is expected as input to the feature extractor, whereas a NumPy array is expected as input to the classifier/regressor.
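As a rough sketch of that convention (the class below and its column names are illustrative assumptions, not code taken from ramp-workflow), a feature extractor can take the DataFrame and hand a NumPy array to the next step, so that positional indexing selects rows as expected:

```python
import numpy as np
import pandas as pd


class FeatureExtractor:
    """Illustrative only: turns a DataFrame into a NumPy feature array."""

    def fit(self, X_df, y):
        # Nothing to learn in this toy example.
        return self

    def transform(self, X_df):
        # Hypothetical column selection; the output is a plain ndarray.
        return X_df[["a", "b"]].to_numpy(dtype=float)


X_df = pd.DataFrame({"a": [10, 20, 30, 40], "b": [1.0, 2.0, 3.0, 4.0]})
y = np.array([0, 1, 0, 1])
train_is = np.array([2, 0, 3])

X_array = FeatureExtractor().fit(X_df, y).transform(X_df)

# On an ndarray, integer-array indexing selects rows, which is what the
# classifier/regressor step relies on.
X_train, y_train = X_array[train_is], y[train_is]
```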

albertcthomas · Mar 13 '22 21:03