prince
FAMD row_contributions
I was trying to use row_contributions on FAMD to recover the contribution of each individual column to the principal components, but it throws an error. I wasn't sure whether that is because it hasn't been built yet?
Hey! Can you share some code?
I sadly don't have the time to make a minimal reproducible example here, but I figured I'd at least note that I'm having the same issue and flesh it out with some details. The error I get is this:
KeyError: "None of [Index(['outlet_id', 'rain_drizzle', 'prod_txt'], dtype='object')] are in the [columns]"
Of note, those three columns are the only categorical variables in my data, so passing categorical columns to row_contributions appears to be what fails. A bit more detail about the example I'm using: the chunk of code that throws this error is below (note the commented-out conversion of the three categorical columns to strings; I tried that too, but it didn't fix this):
import prince

df = df.dropna()
y = df['total_quantity_sold']
X = df[['rain_drizzle', 'prod_txt', 'mean_temperature', 'min_temp', 'max_temp',
        'precipitation', 'outlet_id', 'SATURDAY', 'SUNDAY', 'visibility', 'snow_depth']]
#string_cols = ['rain_drizzle', 'prod_txt', 'outlet_id']
#X[string_cols] = X[string_cols].astype(str)
famd = prince.FAMD(
    n_components=X.shape[1],
    n_iter=10,
    copy=True,
    check_input=True,
    engine='auto',
    random_state=42)
famd = famd.fit(X)
#famd.explained_inertia_  # Note, when uncommented this line works fine
print(famd.row_contributions(X))
Also, a printout of the dtypes of the X dataframe:
rain_drizzle        category
prod_txt            category
mean_temperature     float64
min_temp             float64
max_temp             float64
precipitation        float64
outlet_id           category
SATURDAY               int64
SUNDAY                 int64
visibility           float64
snow_depth           float64
dtype: object
I know that's not much more helpful than the above, but hopefully it's better than the original report, which had no error message at all.
Also quite possible I'm just using this wrong; hard to know with this section of documentation not yet complete though.
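For anyone trying to narrow this down, here is a minimal pandas-only sketch of one way this exact kind of KeyError can arise. It rests on an assumption that has not been checked against prince's source: that the categorical columns get one-hot encoded somewhere along the way and are later looked up again by their original names. The toy data and variable names below are made up for illustration.

```python
import pandas as pd

# Toy frame with one categorical and one numeric column (illustrative data only)
X = pd.DataFrame({
    "rain_drizzle": pd.Categorical(["no", "yes", "no"]),
    "mean_temperature": [10.5, 12.0, 9.8],
})

# Names of the categorical columns, as they appear in the original frame
cat_cols = X.select_dtypes(include="category").columns.tolist()

# One-hot encoding replaces each categorical column with indicator columns
# such as "rain_drizzle_no" and "rain_drizzle_yes"
encoded = pd.get_dummies(X, columns=cat_cols)

# Looking up the original names in the encoded frame fails, because none
# of them exist there any more
try:
    encoded[cat_cols]
    error_message = None
except KeyError as exc:
    error_message = str(exc)

print(list(encoded.columns))
print(error_message)
```

If that is what happens internally, converting the columns to strings beforehand would not help, which is consistent with what I saw, but again this is only a guess at the mechanism.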
Same here
import pandas as pd
import prince

X = pd.DataFrame(
    data=[
        ['A', 'A', 'A', 2, 5, 7, 6, 3, 6, 7],
        ['A', 'A', 'A', 4, 4, 4, 2, 4, 4, 3],
        ['B', 'A', 'B', 5, 2, 1, 1, 7, 1, 1],
        ['B', 'A', 'B', 7, 2, 1, 2, 2, 2, 2],
        ['B', 'B', 'B', 3, 5, 6, 5, 2, 6, 6],
        ['B', 'B', 'A', 3, 5, 4, 5, 1, 7, 5]
    ],
    columns=['E1 fruity', 'E1 woody', 'E1 coffee',
             'E2 red fruit', 'E2 roasted', 'E2 vanillin', 'E2 woody',
             'E3 fruity', 'E3 butter', 'E3 woody'],
    index=['Wine {}'.format(i+1) for i in range(6)]
)
X['Oak type'] = [1, 2, 2, 2, 1, 1]
famd = prince.FAMD(
    n_components=2,
    n_iter=3,
    copy=True,
    check_input=True,
    engine='auto',
    random_state=42
)
famd = famd.fit(X.drop('Oak type', axis='columns'))
famd.row_contributions(X)
Hello there 👋
I apologise for not answering earlier. I was not maintaining Prince anymore. However, I have just refactored the entire codebase. This refactoring should have fixed many bugs.
I don't have time and energy to check if this fixes your issue, but there is a good chance it does. Feel free to reopen this issue if the problem persists after installing the new version, that is, version 0.8.0 and onwards.
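Before reopening, it may help to confirm which version is actually installed. The following is a minimal sketch using only the standard library; `parse_version` and `has_refactored_prince` are hypothetical helper names introduced here, and the parsing is deliberately simple (it reads the leading digits of the first three dot-separated parts and is not a full version parser).

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# First release of the refactored codebase, as stated above
MIN_VERSION = (0, 8, 0)


def parse_version(text):
    """Turn a string like '0.8.0' into (0, 8, 0), padding short versions with zeros."""
    parts = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits, so a part such as '0rc1' is read as 0
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_refactored_prince(installed=None):
    """Return True if the given (or installed) prince version is at least 0.8.0."""
    if installed is None:
        try:
            installed = version("prince")
        except PackageNotFoundError:
            # prince is not installed in this environment
            return False
    return parse_version(installed) >= MIN_VERSION


print(has_refactored_prince())
```

If this prints False, upgrading with `pip install --upgrade prince` and re-running the failing snippet is the first thing to try.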