
Add sklearn.utils.check_array to fit and predict

Open mpearmain opened this issue 6 years ago • 0 comments

Currently the fit method fails if you pass it a pandas DataFrame. Adding the sklearn util check_array (http://scikit-learn.org/stable/modules/generated/sklearn.utils.check_array.html#sklearn.utils.check_array) to fit() and predict() would, by default, convert the DataFrame to a 2D numpy array, which can then be used without any code change on the user's side.

For instance, the examples load the data as a DataFrame and then convert it to numpy arrays by hand:

genetic_data = pd.read_csv('https://github.com/EpistasisLab/scikit-rebate/raw/master/data/'
                           'GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1.tsv.gz',
                           sep='\t', compression='gzip')
# 
# Now we convert to a numpy array
#
features, labels = genetic_data.drop('class', axis=1).values, genetic_data['class'].values

The fix would be as simple as making this change in fit() and predict():

 self._X = check_array(X)
 self._y = column_or_1d(y)
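
To illustrate what those two calls would do, here is a minimal sketch outside of scikit-rebate. The small DataFrame and its column names `A0` and `A1` are made-up stand-ins for the GAMETES data, not values from the real file; only `check_array` and `column_or_1d` from `sklearn.utils` are the actual utilities being proposed.

```python
import numpy as np
import pandas as pd
from sklearn.utils import check_array, column_or_1d

# Hypothetical in-memory stand-in for the GAMETES dataset.
genetic_data = pd.DataFrame({
    "A0": [0, 1, 2, 1],
    "A1": [2, 0, 1, 1],
    "class": [0, 1, 1, 0],
})

# Note: no .values here, the pandas objects are passed straight through.
X = genetic_data.drop("class", axis=1)  # DataFrame
y = genetic_data["class"]               # Series

# check_array validates the input and returns a 2D numpy array.
X_checked = check_array(X)
# column_or_1d validates the labels and returns a 1D numpy array.
y_checked = column_or_1d(y)

print(type(X_checked).__name__, X_checked.shape)
print(type(y_checked).__name__, y_checked.shape)
```

With this in place, the manual `.values` conversion in the example above would become optional rather than required.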

mpearmain, Jul 19 '18 08:07