
Support for classifiers accepting sparse matrix formats only

Open beevabeeva opened this issue 4 years ago • 0 comments

Hi @csala,

Sorry it's been so long!

I've been making progress with my research, but I recently hit a snag while trying to use ThunderSVM, a state-of-the-art SVM implementation. Unfortunately, ThunderSVM only accepts sparse matrix format as its data input. It is based on another SVM implementation (LIBSVM), which provides a tool to translate between input formats. My question is: how hard would it be to add a feature to ATM that would:

  1. Use the LIBSVM conversion tool in a 'pre-process' sort of stage in ATM, to convert standard input data to LIBSVM sparse format.
  2. Use the converted data from step 1 when calling ThunderSVM (or any other similar classifier).
  3. Use the results from step 2 as normal, storing them in the ModelDB hub.
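
To make step 1 concrete, here is a minimal sketch of the conversion, assuming the input is a dense table of numeric features plus one label per row. It does not call the LIBSVM conversion tool itself and is not part of ATM; the function names `to_libsvm_lines` and `write_libsvm_file` are hypothetical. It only relies on the LIBSVM text format, where each line is `<label> <index>:<value> ...` with 1-based feature indices and zero-valued features left out.

```python
from typing import Iterable, List, Sequence


def to_libsvm_lines(rows: Iterable[Sequence[float]], labels: Iterable) -> List[str]:
    """Convert dense rows to LIBSVM text lines (hypothetical helper, not an ATM API)."""
    lines = []
    for row, label in zip(rows, labels):
        # LIBSVM feature indices start at 1; zero-valued features are omitted.
        features = [
            f"{index}:{float(value)!r}"
            for index, value in enumerate(row, start=1)
            if value != 0
        ]
        lines.append(" ".join([str(label)] + features))
    return lines


def write_libsvm_file(path: str, rows: Iterable[Sequence[float]], labels: Iterable) -> None:
    """Write the converted rows to a file, one sample per line."""
    with open(path, "w") as handle:
        for line in to_libsvm_lines(rows, labels):
            handle.write(line + "\n")


if __name__ == "__main__":
    dense_rows = [[0.0, 1.5, 0.0], [2.0, 0.0, 3.25]]
    dense_labels = [1, -1]
    for line in to_libsvm_lines(dense_rows, dense_labels):
        print(line)
```

A pre-processing stage along these lines would run once per dataset, and the resulting file (or an equivalent in-memory sparse matrix) would then be handed to the classifier in step 2. Whether ATM has a suitable hook for this is exactly what I am unsure about.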

I am looking at alternatives to ThunderSVM in case this is not possible or turns out to be really hard. I think adding standard input support to ThunderSVM itself would be even harder, but I will open an issue on their repo too.

** As a side note, when I do use ThunderSVM in ATM, ATM seems to run it as usual (although it does get stuck sometimes). It's probably just that ATM is interpreting the sparse format as standard input, but it is curious that it isn't breaking. Let me know if I should open a separate issue and fill in all the technical details.

Thanks!

beevabeeva, Sep 21 '19 14:09