skutil

Output dataframe with SafeLabelEncoder?

Open jmackwinn opened this issue 8 years ago • 0 comments

Hey guys, any tips on how to output a dataframe instead of an array when using SafeLabelEncoder()?

The code below works for me, but I was really hoping for an argument like as_df=True so I can stay in Pandas-land.

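import numpy as np
import pandas as pd
# import path assumed; adjust to wherever your skutil version exposes the class
from skutil.preprocessing import SafeLabelEncoder

train = pd.DataFrame.from_records(data=np.array([
                           ['USA','RED','a'],
                           ['MEX','GRN','b'],
                           ['FRA','RED','b']]),
                           columns=['Country','Color','Category'])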


test = pd.DataFrame.from_records(data=np.array([
                           ['BBR','RED','a'],
                           ['CAN','BLK','b'],
                           ['FRA','RED','b']]), 
                           columns=['Country','Color','Category'])
    
COLS = ['Country']

# learn the levels on 'Country'
SLC = SafeLabelEncoder().fit(train[COLS].values.ravel())

# encode 'Country' in the train and test datasets; levels unseen in train map to 99999
train_labels = SLC.transform(train[COLS].values.ravel())
test_labels = SLC.transform(test[COLS].values.ravel())

print(train_labels)
print(test_labels)

[2 1 0]
[99999 99999     0]
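
For anyone landing here with the same question, here is a rough pandas-only sketch of the kind of DataFrame output I have in mind. It does not use SafeLabelEncoder at all: `fit_levels` and `encode_as_df` are hypothetical helpers I made up, the encoding is swapped for `pd.Categorical` codes, and the 99999 sentinel is just copied from the output above. Treat it as an illustration under those assumptions, not as skutil's API.

```python
import numpy as np
import pandas as pd

train = pd.DataFrame(
    [['USA', 'RED', 'a'], ['MEX', 'GRN', 'b'], ['FRA', 'RED', 'b']],
    columns=['Country', 'Color', 'Category'])
test = pd.DataFrame(
    [['BBR', 'RED', 'a'], ['CAN', 'BLK', 'b'], ['FRA', 'RED', 'b']],
    columns=['Country', 'Color', 'Category'])

COLS = ['Country']
UNSEEN = 99999  # sentinel for unseen levels, copied from the output above


def fit_levels(df, cols):
    """Hypothetical helper: learn the sorted unique levels of each column."""
    return {col: sorted(df[col].unique()) for col in cols}


def encode_as_df(df, levels, unseen=UNSEEN):
    """Hypothetical helper: return a copy of df with learned columns as integer codes."""
    out = df.copy()
    for col, cats in levels.items():
        # Categorical assigns code -1 to anything outside `cats` (including NaN)
        codes = pd.Categorical(df[col], categories=cats).codes.astype(np.int64)
        out[col] = np.where(codes == -1, unseen, codes)
    return out


levels = fit_levels(train, COLS)
train_df = encode_as_df(train, levels)
test_df = encode_as_df(test, levels)

print(train_df)
print(test_df)
```

The other columns pass through untouched and the index is preserved, so the result can go straight into the next pandas step. With the real encoder, the equivalent would presumably be assigning the transformed array back to a copy of the frame, e.g. `out[col] = SLC.transform(df[col].values)`, but I have only tried the array version shown above.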

jmackwinn, Oct 29 '17 20:10