rankfm
Error while fitting with 200k user_interaction matrix, item features, and user features
I'm running the lib on a virtual server with 64 GB of RAM. My data consists of:
- 200k distinct interactions between users and items
- a 52k x 11 user_feature matrix
- a 2770 x 49 item_feature matrix
All NAs are replaced by 0.
When I try to run it, I get this error:

AssertionError: user factors [v_u] are not finite - try decreasing feature/sample_weight magnitudes

Sometimes it gives me the item factors error as well.
However, if I run it on 170k user interactions without user_features and item_features, it runs smoothly.
What does this error mean?
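The message appears to mean that the model's learned factors diverged to NaN/inf during SGD training, which is why the hint suggests decreasing feature/sample_weight magnitudes. Before touching the model, it's worth ruling out non-finite or very large values in the inputs. A minimal pre-fit check (the helper and demo dataframe below are placeholders, not part of rankfm):

import numpy as np
import pandas as pd

def check_features(df: pd.DataFrame, name: str) -> None:
    # rankfm treats the first column as the id; the rest are feature values
    feats = df.iloc[:, 1:].to_numpy(dtype=np.float64)
    n_bad = feats.size - int(np.isfinite(feats).sum())
    print(f"{name}: {n_bad} non-finite values, max |x| = {np.abs(feats).max():.4g}")

# hypothetical stand-in; run this on your actual user/item feature frames
demo = pd.DataFrame({'item_id': [1, 2], 'price': [1.75, 7.95], 'stock': [0.0, np.inf]})
check_features(demo, 'demo_item_features')   # -> 1 non-finite value, max |x| = inf

Even if everything is finite, large raw magnitudes (prices, counts, raw IDs used as numeric features) can still make the updates blow up, which is what the hint in the assertion points at.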
Hi, I have the same issue as you. Was wondering if you've solved it?
I have:
- 173k distinct interactions between users and items
- a 4k x 20 item features dataframe
- a 370k x 30 user features dataframe
When I try to run it, I get this error:

AssertionError: item weights [w_i] are not finite - try decreasing feature/sample_weight magnitudes

So for now I can only run the model without the item and user auxiliary features.
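One cheap experiment before dropping the features entirely: the assertion hint points at the magnitudes feeding the SGD updates, so besides rescaling the features themselves (see the last comment in this thread), a smaller learning rate may keep the updates from overflowing. A sketch using constructor arguments documented in the rankfm README (dataframe names are placeholders, and the values are guesses):

from rankfm.rankfm import RankFM

# same style of model as in the next comment, but with a much smaller
# step size so the weight/factor updates are less likely to hit inf/NaN
model = RankFM(factors=20, loss='warp', max_samples=20,
               learning_rate=0.01, learning_schedule='invscaling')
model.fit(interactions, user_features=user_features,
          item_features=item_features, epochs=25, verbose=True)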
I have this same error too, even with an item features dataframe comprising just two columns [PRODUCT_ID int32, RETAIL_PRICE float64] over 270 rows. I also tried with two different columns, [PRODUCT_ID, CATEGORY_ID int16], and hit the same issue.
I can include sample_weight without problems, but if I try to add any user or product attributes I get the error:

AssertionError: item weights [w_i] are not finite
from rankfm.rankfm import RankFM

model2 = RankFM(factors=20, loss='warp', max_samples=100, learning_schedule='invscaling')
model2.fit(user_item_train,
           item_features=item_attributes_train,    # adding this triggers the AssertionError
           # user_features=user_attributes_train,  # same error when enabled
           sample_weight=sample_weight_train,
           epochs=25,
           verbose=True)
type(item_attributes_train)
pandas.core.frame.DataFrame
item_attributes_train.dtypes
PRODUCT_ID int32
RETAIL_PRICE float64
dtype: object
item_attributes_train.head(3)
   PRODUCT_ID  RETAIL_PRICE
0       10162          1.75
1       10145          1.00
2      101433          7.95
I may have resolved the issue by re-expressing all of the numeric data as values scaled between 0 and 1, and one-hot encoding the category IDs (as you'd expect).
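In case it helps anyone else, a minimal sketch of that preprocessing, assuming the columns shown above plus the CATEGORY_ID variant (the demo rows and the choice of MinMaxScaler are my own; rankfm's docs describe item_features as [item_id, if_1, ..., if_n], with the id in the first column):

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# hypothetical raw features, mirroring the columns discussed above
item_attributes_train = pd.DataFrame({
    'PRODUCT_ID':   [10162, 10145, 101433],
    'CATEGORY_ID':  [1, 2, 1],
    'RETAIL_PRICE': [1.75, 1.00, 7.95],
})

# scale the numeric column to [0, 1]
item_attributes_train[['RETAIL_PRICE']] = MinMaxScaler().fit_transform(
    item_attributes_train[['RETAIL_PRICE']])

# one-hot encode the categorical id
item_attributes_train = pd.get_dummies(
    item_attributes_train, columns=['CATEGORY_ID'], dtype=float)

# keep the item id as the first column: [item_id, if_1, ..., if_n]
cols = ['PRODUCT_ID'] + [c for c in item_attributes_train.columns if c != 'PRODUCT_ID']
item_attributes_train = item_attributes_train[cols]

After this, every feature value sits in [0, 1], which lines up with the "try decreasing feature/sample_weight magnitudes" hint in the assertion.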