pywFM
Predict new data without training.
Can I predict new data with an already trained model, or do I always have to call the "run" method?
Do you mean using a previously trained model (using the save_model flag)?
exactly
Sorry, I have yet to implement that function, since I personally never had a use for it.
Just to get a feeling: how do you envision such an interface? When I started looking into this problem, I felt that I would probably need to split pywFM.run into pywFM.train and pywFM.predict, also adding a pywFM.load_model that's able to load a trained model. The problem is that this would probably hurt performance, since we would need to run 2 different libfm commands: one with the save_model flag, another with the load_model flag.
Another alternative would be a separate pywFM.train_model and pywFM.run_model that train and run a model respectively.
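To make the first option concrete, here is a rough sketch of the two libfm invocations a split train/predict interface would have to build. Only the save_model and load_model flags come from this thread; the other flags are written from my understanding of the libFM command line and should be checked against your libFM version. The function names and file paths are hypothetical, and using `-iter 0` to skip further training after loading is an assumption I have not verified.

```python
from typing import List


def build_train_cmd(libfm_bin: str, train_path: str, test_path: str,
                    out_path: str, model_path: str, task: str = "r") -> List[str]:
    """Arguments for the training step: fit on train_path and write the model."""
    return [
        libfm_bin,
        "-task", task,            # "r" for regression, "c" for classification
        "-method", "sgd",         # the thread reports load/save not working with MCMC
        "-train", train_path,
        "-test", test_path,
        "-out", out_path,
        "-save_model", model_path,
    ]


def build_predict_cmd(libfm_bin: str, train_path: str, new_data_path: str,
                      out_path: str, model_path: str, task: str = "r") -> List[str]:
    """Arguments for the prediction step: load the saved model, score new data."""
    return [
        libfm_bin,
        "-task", task,
        "-method", "sgd",
        "-train", train_path,     # as far as I know, libFM still requires a train file
        "-test", new_data_path,   # the data we actually want predictions for
        "-out", out_path,
        "-load_model", model_path,
        "-iter", "0",             # assumption: zero iterations means no retraining
    ]
```

Each list could be handed to `subprocess.run`. The second command is the only one needed at prediction time, which is the point made about performance: the extra cost is paid once, during training.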
I think the first approach, with train and predict methods, is more standard and clean, like in sklearn. Without that feature, many ML techniques like stacking and blending become not so trivial. About performance: yes, we would have to run two libfm commands, but this only hurts in the training phase. In the prediction stage you only need to load the model to predict.
I'm also leaning towards that approach, since it meets one of my todo points:
Improve the save_model / load_model so we can have a more defined init-fit-predict cycle (perhaps we could inherit from sklearn.BaseEstimator)
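For illustration, the init-fit-predict cycle could have roughly this shape. This is a dependency-free sketch, not pywFM's API: the class name, `train_fn`, and `predict_fn` are hypothetical stand-ins for whatever actually trains and scores the model. It deliberately does not inherit from sklearn.BaseEstimator; doing so would additionally provide `get_params` and `set_params`.

```python
class FitPredictWrapper:
    """Minimal init-fit-predict shape around two injected callables."""

    def __init__(self, train_fn, predict_fn, **params):
        # Hyperparameters are only stored here; no work happens in __init__.
        self.train_fn = train_fn
        self.predict_fn = predict_fn
        self.params = params
        self.model_ = None  # set by fit()

    def fit(self, X, y):
        # Train once and keep the resulting model on the instance.
        self.model_ = self.train_fn(X, y, **self.params)
        return self

    def predict(self, X):
        # Reuse the stored model; no retraining is triggered.
        if self.model_ is None:
            raise RuntimeError("call fit() before predict()")
        return self.predict_fn(self.model_, X)


# Toy backend for demonstration only: the "model" is the mean of y.
def mean_train(X, y):
    return sum(y) / len(y)


def mean_predict(model, X):
    return [model for _ in X]


est = FitPredictWrapper(mean_train, mean_predict).fit([[0], [1]], [2.0, 4.0])
preds = est.predict([[5], [6], [7]])  # three predictions, one training run
```

With this shape, stacking or blending only needs repeated `predict` calls on a fitted object.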
This weekend I have a little bit of time and I will start to work on this branch (that will break BC, so I'll bump the version). Feel free to submit changes as well.
Sorry, I hadn't seen your todo. Thank you very much for the future work.
Hi @jfloff, any progress in that direction? I just realized the issue mentioned by @nickflamel, and this simply makes your (very cool) wrapper not usable in a production environment. Btw, I tried the example from Rendle that is also in your README, but the prediction is very bad. I guess this is because we don't have much data, but this kinda makes the example unsuitable^^.
I'm sorry, I haven't had time to dedicate to improving this. I realise that this feature would really improve running several different predictions, and I really want to improve it, but if I'm going to do it, I will inherit from sklearn.BaseEstimator right from the start (which takes a little bit more work).
I have a deadline for Monday. After that I'll dig into this, I promise! :)
The example is just to show how the API works, and what's the flow of libfm :)
It seems that predicting without a new training run is not really supported at this moment. It seems that the functionality is not at 100% (e.g. not working for MCMC). I've also taken a look at the libFM source code, but I haven't had much success. The documentation is also lacking for the save_model and load_model functions.
I'm going out on a limb here and pinging @thierry-silbermann, since he was responsible for save_model and load_model in libFM. Could you give us some insight on how we should proceed?
Hi, here is how we could proceed to make a predict method:
https://github.com/jilljenn/TF-recomm/blob/master/forward.py#L22
Where the pickled elements are those:
https://github.com/jilljenn/TF-recomm/blob/master/fm_mangaki.py#L39
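For readers who don't want to follow the links, a predict method built from stored parameters can be sketched as below. This is not a copy of the linked code; it is the standard second-order factorization machine equation, and it assumes the model exposes a global bias, a weight vector of length n_features, and a pairwise interaction matrix of shape (n_features, k). The function name and argument names are mine. It returns the raw score, so any link function for classification would still need to be applied on top.

```python
import numpy as np


def fm_predict(X, w0, w, V):
    """Score rows of X with a second-order factorization machine.

    X:  (n_samples, n_features) dense feature matrix
    w0: global bias (scalar)
    w:  (n_features,) linear weights
    V:  (n_features, k) pairwise interaction factors (assumed layout)
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float).ravel()
    V = np.asarray(V, dtype=float)

    linear = X @ w
    # Pairwise term via the O(n * k) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
    sum_then_square = (X @ V) ** 2
    square_then_sum = (X ** 2) @ (V ** 2)
    pairwise = 0.5 * (sum_then_square - square_then_sum).sum(axis=1)

    return w0 + linear + pairwise
```

As a hand check, with `X = [[1, 1, 0]]`, `w0 = 0.5`, `w = [1, 2, 3]` and `V = [[1, 0], [2, 1], [0, 3]]`, the linear part is 3 and the only active pair contributes 1*2 + 0*1 = 2, so the score is 5.5.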
Want to try submitting a PR for this?
Yes. It will look like this.
https://github.com/mangaki/mangaki/pull/549/files#diff-2b98b5dc82ffbac20dd8c88ce88d6b5cR65
I don't know why I sometimes had to use .A1 (matrix-to-ndarray conversion) and sometimes not.
Can you consider casting model.weights and model.pairwise_interactions to NumPy arrays?
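One possible explanation for the inconsistency: `.A1` exists on `numpy.matrix` objects (it returns a flattened ndarray) but not on plain ndarrays or lists, so it is only needed when a value happens to be a matrix. I have not checked which types pywFM actually returns, so treat that as a guess. A small helper like the one below (the name `to_ndarray` is hypothetical) would make downstream code independent of the input type.

```python
import numpy as np


def to_ndarray(values, flatten=False):
    """Convert a list, ndarray, or numpy.matrix to a plain float ndarray."""
    # np.asarray drops the matrix subclass, so the result is a base ndarray.
    arr = np.asarray(values, dtype=float)
    return arr.ravel() if flatten else arr


m = np.matrix([[1.0, 2.0, 3.0]])        # 2-D matrix, shape (1, 3)
flat_from_matrix = m.A1                  # works: matrix has .A1
flat_generic = to_ndarray(m, flatten=True)
flat_from_list = to_ndarray([1, 2, 3], flatten=True)

# A plain ndarray has no .A1, which is why the call is not always valid.
has_a1_on_ndarray = hasattr(np.ones(3), "A1")
```

Casting once at the source, as requested above, would remove the need for callers to do this themselves.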
I don't see any problem with that.
5 years later, I finally made a scikit-learn estimator: https://github.com/jilljenn/ktm/blob/master/fm.py#L25
It will be improved over the next few days, then I can copy it into your repo.