mlpack
mlpack copied to clipboard
Feat/relevance vector machine
Dear all,
I open this pull request as discussed in #https://github.com/mlpack/mlpack/issues/2809#issue-788181981. :)
I added the code corresponding to the Relevance Vector Machine and ARD regression. The class methods work and give me the same results as my python implementation, that is a good point. The code is pretty old now and I suppose that it does not fit with the API.
Here are the major points :
- Change the comments
- Format the code to the guidelines
- Solve the question, are the templates well used in this implementation ?
- Rewrite and continue to write unitary tests without boost.
- Finish the main program.
Let me know those I forget.
Hoping that this feature will born one day :)
For now I face some problems when I want to compile any command line program. I did not have this problem with the last old mlpack version. Do you know if it is specific to my configuration or if it is a known issue ?
make[3]: *** No rule to build the target « ../src/mlpack/core/boost_backport/boost_backport_serialization.hpp », nécessaire pour « src/mlpack/cotire/mlpack_CXX_prefix.hxx.gch ». Arrêt. make[2]: *** [CMakeFiles/Makefile2:2431 : src/mlpack/CMakeFiles/mlpack.dir/all] Erreur 2 make[1]: *** [CMakeFiles/Makefile2:3873 : src/mlpack/methods/rvm_regression/CMakeFiles/mlpack_rvm_regression.dir/rule] Erreur 2 make: *** [Makefile:916 : mlpack_rvm_regression] Erreur 2
For now I face some problems when I want to compile any command line program. I did not have this problem with the last old mlpack version. Do you know if it is specific to my configuration or if it is a known issue ?
make[3]: *** No rule to build the target « ../src/mlpack/core/boost_backport/boost_backport_serialization.hpp », nécessaire pour « src/mlpack/cotire/mlpack_CXX_prefix.hxx.gch ». Arrêt. make[2]: *** [CMakeFiles/Makefile2:2431 : src/mlpack/CMakeFiles/mlpack.dir/all] Erreur 2 make[1]: *** [CMakeFiles/Makefile2:3873 : src/mlpack/methods/rvm_regression/CMakeFiles/mlpack_rvm_regression.dir/rule] Erreur 2 make: *** [Makefile:916 : mlpack_rvm_regression] Erreur 2
Yes, we switched from boost serializatioon too cereal. So I guess if you comment the serialization part the issue should go away.
You might need to clean/remove your CMake build directory and reconfigure/rebuild. (There could be an easier way to solve it without nuking the build directory, but, that strategy should work, at least. :))
Thanks @mercierc for coming up with the PR, this was a fairly easy doubt which zoq solved, Also we changed our testing framework from boost to catch2 so you might want to update that too. Apart from that I don't remember any other api changes
Thank you all. The solution was to build from a new directory with libcereal-dev intstalled. I thought that remove the CMakeCache.txt was sufficient but not.
Dear all,
Before to continue to write the main program I would like to know your opinion about the right way to do it.
My first idea was to use a switch to manage the many kernel choices and instantiate the corresponding RVMRegression estimator. I quickly realized that because of the template nature of RVMRegression<> class these declarations needed to be done for each case of the switch instead of doing that only for the kernels. It's pretty redundant and I wonder if there is a way to factorize or if we do not care. It is may not be a problem for you, I do not know. For information the template nature of the class comes from the kernel pca class from which it is inspired.
Here is my questioning of the moment. If you have any opinion let me know :).
Do you think that it is possible to change the kernelType on fly of the RVMRegression<kernelType> as it could be done for the private attributes once the model is created with an auxiliary serializable class (RVMRegressionModel in rvm_regression_main.cpp) ?
Is it then possible to imagine a single main program for all kernels ?
Do you think that it is possible to change the kernelType on fly of the RVMRegression as it could be done for the private attributes once the model is created with an auxiliary serializable class (RVMRegressionModel in rvm_regression_main.cpp) ? Is it then possible to imagine a single main program for all kernels ?
I think it's feasible to use a if/else construct similar to https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/cf/cf_main.cpp#L260-L299
that said I'm not sure what you had in mind with changing the kernel type on the fiy in the context of an executable.
I think that I clearly need a multiple type, single value solution. From the cf code you pointed to me, the main program manipulates a serializable class version (cfmodel) of the template cfType class. The boost::variant feature is used and it could be the solution. Is it a satisfactory solution or do you have any other ideas that do not require the boost lib ? From what I read, I suppose you want to remove boost at least at possible.
I think that the biggest part of the code is pushed :) Did you ever faced some of the build failures that happened ? Some of them surprise me, for example, the MacOsPlain build failed at each test during the first instantiation of an empty armadillo object :
OptionsMakeRVMDifferent
/Users/runner/work/1/s/src/mlpack/tests/rvm_regression_test.cpp:37
corresponds to arma::mat matX;
Hey @mercierc, thanks for the contribution! I've kind of hidden from this one a little bit because I don't (yet) know RVMs and it's a big contribution so it takes a little while to review. :) But, I am hoping to make some time for it very soon.
It looks like maybe the issue is that some row vectors are being initialized as column vectors or vice versa? You could probably debug this by compiling with -DDEBUG=ON
, and then using a debugger like gdb
to figure out where the exception is being thrown.
I can try and dig in too when I have a chance, but it could be a little while, unfortunately. :(
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:
According to the nature of the remaining not valid checks, it seems that the code is as good as it can be before the improvements of the community members :) For information, I tested the python bindings and they work as well.
One of the test fails because a the estimation of one coefficient of a solution vector is not close enough of its real value for a tolerance of 0.1 for one random run : 3.3 instead of 3.0. Do you think that increase the tolerance would the good choice ?
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:
Hey, I made the changes in order to make this code compliant with the new binding definition.
I see that you get rid of the boost variant in CFModel. According the comment of the 11 march 2021, does a std::variant could be a satisfactory solution to the boost::variant ?