mlpack icon indicating copy to clipboard operation
mlpack copied to clipboard

Feat/relevance vector machine

Open mercierc opened this issue 3 years ago • 15 comments

Dear all,

I open this pull request as discussed in #https://github.com/mlpack/mlpack/issues/2809#issue-788181981. :)

I added the code corresponding to the Relevance Vector Machine and ARD regression. The class methods work and give me the same results as my python implementation, that is a good point. The code is pretty old now and I suppose that it does not fit with the API.

Here are the major points :

  • Change the comments
  • Format the code to the guidelines
  • Solve the question, are the templates well used in this implementation ?
  • Rewrite and continue to write unitary tests without boost.
  • Finish the main program.

Let me know those I forget.

Hoping that this feature will born one day :)

mercierc avatar Jan 20 '21 07:01 mercierc

For now I face some problems when I want to compile any command line program. I did not have this problem with the last old mlpack version. Do you know if it is specific to my configuration or if it is a known issue ?

make[3]: *** No rule to build the target « ../src/mlpack/core/boost_backport/boost_backport_serialization.hpp », nécessaire pour « src/mlpack/cotire/mlpack_CXX_prefix.hxx.gch ». Arrêt. make[2]: *** [CMakeFiles/Makefile2:2431 : src/mlpack/CMakeFiles/mlpack.dir/all] Erreur 2 make[1]: *** [CMakeFiles/Makefile2:3873 : src/mlpack/methods/rvm_regression/CMakeFiles/mlpack_rvm_regression.dir/rule] Erreur 2 make: *** [Makefile:916 : mlpack_rvm_regression] Erreur 2

mercierc avatar Jan 20 '21 07:01 mercierc

For now I face some problems when I want to compile any command line program. I did not have this problem with the last old mlpack version. Do you know if it is specific to my configuration or if it is a known issue ?

make[3]: *** No rule to build the target « ../src/mlpack/core/boost_backport/boost_backport_serialization.hpp », nécessaire pour « src/mlpack/cotire/mlpack_CXX_prefix.hxx.gch ». Arrêt. make[2]: *** [CMakeFiles/Makefile2:2431 : src/mlpack/CMakeFiles/mlpack.dir/all] Erreur 2 make[1]: *** [CMakeFiles/Makefile2:3873 : src/mlpack/methods/rvm_regression/CMakeFiles/mlpack_rvm_regression.dir/rule] Erreur 2 make: *** [Makefile:916 : mlpack_rvm_regression] Erreur 2

Yes, we switched from boost serializatioon too cereal. So I guess if you comment the serialization part the issue should go away.

zoq avatar Jan 20 '21 21:01 zoq

You might need to clean/remove your CMake build directory and reconfigure/rebuild. (There could be an easier way to solve it without nuking the build directory, but, that strategy should work, at least. :))

rcurtin avatar Jan 20 '21 21:01 rcurtin

Thanks @mercierc for coming up with the PR, this was a fairly easy doubt which zoq solved, Also we changed our testing framework from boost to catch2 so you might want to update that too. Apart from that I don't remember any other api changes

jeffin143 avatar Jan 22 '21 15:01 jeffin143

Thank you all. The solution was to build from a new directory with libcereal-dev intstalled. I thought that remove the CMakeCache.txt was sufficient but not.

mercierc avatar Jan 24 '21 08:01 mercierc

Dear all,

Before to continue to write the main program I would like to know your opinion about the right way to do it.

My first idea was to use a switch to manage the many kernel choices and instantiate the corresponding RVMRegression estimator. I quickly realized that because of the template nature of RVMRegression<> class these declarations needed to be done for each case of the switch instead of doing that only for the kernels. It's pretty redundant and I wonder if there is a way to factorize or if we do not care. It is may not be a problem for you, I do not know. For information the template nature of the class comes from the kernel pca class from which it is inspired.

Here is my questioning of the moment. If you have any opinion let me know :).

mercierc avatar Feb 03 '21 08:02 mercierc

Do you think that it is possible to change the kernelType on fly of the RVMRegression<kernelType> as it could be done for the private attributes once the model is created with an auxiliary serializable class (RVMRegressionModel in rvm_regression_main.cpp) ?
Is it then possible to imagine a single main program for all kernels ?

mercierc avatar Feb 17 '21 10:02 mercierc

Do you think that it is possible to change the kernelType on fly of the RVMRegression as it could be done for the private attributes once the model is created with an auxiliary serializable class (RVMRegressionModel in rvm_regression_main.cpp) ? Is it then possible to imagine a single main program for all kernels ?

I think it's feasible to use a if/else construct similar to https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/cf/cf_main.cpp#L260-L299

that said I'm not sure what you had in mind with changing the kernel type on the fiy in the context of an executable.

zoq avatar Feb 28 '21 23:02 zoq

I think that I clearly need a multiple type, single value solution. From the cf code you pointed to me, the main program manipulates a serializable class version (cfmodel) of the template cfType class. The boost::variant feature is used and it could be the solution. Is it a satisfactory solution or do you have any other ideas that do not require the boost lib ? From what I read, I suppose you want to remove boost at least at possible.

mercierc avatar Mar 11 '21 15:03 mercierc

I think that the biggest part of the code is pushed :) Did you ever faced some of the build failures that happened ? Some of them surprise me, for example, the MacOsPlain build failed at each test during the first instantiation of an empty armadillo object :

OptionsMakeRVMDifferent

/Users/runner/work/1/s/src/mlpack/tests/rvm_regression_test.cpp:37

corresponds to arma::mat matX;

mercierc avatar Mar 25 '21 21:03 mercierc

Hey @mercierc, thanks for the contribution! I've kind of hidden from this one a little bit because I don't (yet) know RVMs and it's a big contribution so it takes a little while to review. :) But, I am hoping to make some time for it very soon.

It looks like maybe the issue is that some row vectors are being initialized as column vectors or vice versa? You could probably debug this by compiling with -DDEBUG=ON, and then using a debugger like gdb to figure out where the exception is being thrown.

I can try and dig in too when I have a chance, but it could be a little while, unfortunately. :(

rcurtin avatar May 11 '21 22:05 rcurtin

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

mlpack-bot[bot] avatar Jun 10 '21 23:06 mlpack-bot[bot]

According to the nature of the remaining not valid checks, it seems that the code is as good as it can be before the improvements of the community members :) For information, I tested the python bindings and they work as well.

One of the test fails because a the estimation of one coefficient of a solution vector is not close enough of its real value for a tolerance of 0.1 for one random run : 3.3 instead of 3.0. Do you think that increase the tolerance would the good choice ?

mercierc avatar Jul 12 '21 19:07 mercierc

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

mlpack-bot[bot] avatar Aug 11 '21 19:08 mlpack-bot[bot]

Hey, I made the changes in order to make this code compliant with the new binding definition.

I see that you get rid of the boost variant in CFModel. According the comment of the 11 march 2021, does a std::variant could be a satisfactory solution to the boost::variant ?

mercierc avatar Sep 21 '21 06:09 mercierc