Slower training using thunderSVM compared to scikit-learn

Open abbyDC opened this issue 6 years ago • 9 comments

I've installed ThunderSVM on my machine with GPU support and no errors, but training doesn't seem to get any faster.

Code used: the sample code for the scikit-learn wrapper interface.

Issue: When I run the code with ThunderSVM's scikit-learn wrapper package, the training time is around 0.3s. When I switch back to scikit-learn's own svm, the training time becomes 0.009s.

Is there a parameter I'm missing to activate GPU support, or to train properly with ThunderSVM? Thanks in advance!

abbyDC avatar Nov 20 '18 12:11 abbyDC

Your data set seems to be too small to demonstrate the superiority of GPUs. I suggest you use a bigger data set to compare the libraries.
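
One way to act on this suggestion is to time `fit()` over a range of dataset sizes and see where, if anywhere, the GPU library catches up. The following is a minimal sketch using only the Python standard library. `make_dataset`, `time_fit`, `sweep`, and the `MajorityClassifier` stand-in are hypothetical names introduced here for illustration, not part of ThunderSVM or scikit-learn.

```python
import random
import time


def make_dataset(n_samples, n_features, seed=0):
    """Generate a synthetic binary classification set as plain Python lists."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    # Label depends on the first two features, so n_features must be >= 2.
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y


def time_fit(make_model, X, y):
    """Return wall-clock seconds spent in fit() only, excluding construction."""
    model = make_model()
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start


def sweep(make_model, sizes, n_features=20):
    """Time fit() for each dataset size; returns a list of (n_samples, seconds)."""
    results = []
    for n in sizes:
        X, y = make_dataset(n, n_features)
        results.append((n, time_fit(make_model, X, y)))
    return results


class MajorityClassifier:
    """Hypothetical stand-in so the sketch runs without any ML library."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self


timings = sweep(MajorityClassifier, sizes=[100, 1000, 5000])
for n, seconds in timings:
    print(f"n={n:>6}  fit={seconds:.6f}s")
```

To compare the two libraries, `MajorityClassifier` would be replaced by `sklearn.svm.SVC` and `thundersvm.SVC` in turn, assuming both are installed and using the same kernel and parameters for each. The sizes above are placeholders and would need to be much larger to say anything about GPU throughput.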

zeyiwen avatar Nov 21 '18 04:11 zeyiwen

Thanks for the suggestion and quick reply! I've tested it again and the relative performance is still about the same. My GPU is an NVIDIA GTX 1050.

dataset (1.6GB): scikit-learn svm training ~170s, thundersvm scikit wrapper ~220s

dataset (32GB): error midway using thundersvm related to "std::bad_alloc"

abbyDC avatar Nov 21 '18 06:11 abbyDC

A bit strange. The 32GB data set may not fit on your GTX 1050, as it only has 2GB of memory.
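
As a rough back-of-the-envelope check of the memory argument, the sketch below estimates how many rows of a dense matrix fit in a given amount of memory. It assumes 4 bytes per value (single precision) and ignores everything else the library needs, such as the kernel cache, so real limits are lower. `dense_bytes`, `max_rows`, and the 1000-feature figure are hypothetical and only for illustration.

```python
GIB = 1024 ** 3


def dense_bytes(n_samples, n_features, bytes_per_value=4):
    """Bytes needed to hold a dense n_samples x n_features matrix."""
    return n_samples * n_features * bytes_per_value


def max_rows(memory_bytes, n_features, bytes_per_value=4):
    """Largest row count whose dense matrix fits in memory_bytes."""
    return memory_bytes // (n_features * bytes_per_value)


gpu_memory = 2 * GIB   # the 2GB figure quoted above
n_features = 1000      # hypothetical feature count, not from the thread

rows = max_rows(gpu_memory, n_features)
print(f"rows that fit: {rows}")
print(f"32GB data set fits: {32 * 10**9 <= gpu_memory}")
```

Under these assumptions a 32GB data set exceeds the card's memory many times over, while a 1.6GB one fits only if little else is allocated.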

Could you try to train the SVM using command line? In order to help us better understand the problem, you may post the output of the program.

zeyiwen avatar Nov 21 '18 10:11 zeyiwen

I have the same issue. In my case I use the CPU version, but sklearn is 6 times faster. I built ThunderSVM on my machine. My dataset has shape (4921, 512). Training takes 30 seconds with sklearn and 3 minutes with ThunderSVM.

siarez avatar Jul 16 '19 20:07 siarez

Hmm. Would you please share the data set and the script to help us reproduce the problem? I have reopened it, and would definitely fix it if we can reproduce the problem.

zeyiwen avatar Jul 17 '19 14:07 zeyiwen

Thanks for your reply. Unfortunately I can't share the dataset due to privacy issues, but I can describe it. It consists of DCT components of facial images: an image of a face is transformed using OpenCV's DCT (discrete cosine transform), and the first 512 components of the DCT are used as features. One thing I can highlight about these features is that they are on wildly different scales. I'm not sure whether that should affect the run time, though.
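
Whether feature scale explains the slowdown here is not confirmed in this thread, but standardizing features before SVM training is common practice and is cheap to try. The sketch below rescales each column to zero mean and unit variance using only the standard library. `standardize` is a hypothetical helper and the sample rows are made up.

```python
import statistics


def standardize(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Constant columns have zero deviation; divide by 1.0 so they map to 0.0.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Made-up rows whose two features differ in scale by six orders of magnitude.
rows = [[1000.0, 0.001], [2000.0, 0.002], [3000.0, 0.003]]
scaled = standardize(rows)
for row in scaled:
    print([round(v, 4) for v in row])
```

With scikit-learn installed, `sklearn.preprocessing.StandardScaler` performs the same transformation; the scaler should be fitted on the training data only and then applied to the test data.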

Another detail that may have some bearing on this issue is that the PC I'm running this on is old. The CPU is a Pentium G6950 2.80GHz × 2 and does not support AVX instructions.

siarez avatar Jul 17 '19 16:07 siarez

Same here. I am using a 3070: one epoch of testing with thundersvm (SVR function only) takes 30 seconds, while it takes 4 seconds with sklearn (SVR function only).

KaitaoQiu avatar Dec 28 '21 07:12 KaitaoQiu

@Qiu-INJ can you try constructing the SVM outside the loop and fitting the data inside the loop x times? In my case, construction was taking longer than with sklearn.
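
The suggestion above amounts to timing construction and fitting separately. A minimal sketch of that measurement follows. `time_construction`, `time_fits`, and the `SlowToBuild` stand-in (which only sleeps) are hypothetical names introduced so the example runs without either library.

```python
import time


def time_construction(make_model):
    """Build the model once; return it with the seconds construction took."""
    start = time.perf_counter()
    model = make_model()
    return model, time.perf_counter() - start


def time_fits(model, X, y, repeats):
    """Call fit() repeatedly on one model; return the seconds of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        model.fit(X, y)
        durations.append(time.perf_counter() - start)
    return durations


class SlowToBuild:
    """Hypothetical stand-in: expensive constructor, cheap fit."""

    def __init__(self):
        time.sleep(0.05)  # simulated one-time setup cost

    def fit(self, X, y):
        time.sleep(0.005)  # simulated training cost
        return self


model, build_seconds = time_construction(SlowToBuild)
fit_seconds = time_fits(model, [[0.0], [1.0]], [0, 1], repeats=3)
print(f"construction: {build_seconds:.3f}s")
print("fits: " + ", ".join(f"{s:.3f}s" for s in fit_seconds))
```

Passing `thundersvm.SVR` or `sklearn.svm.SVR` as `make_model` (assuming they are installed) would show whether the gap comes from a one-time setup cost or from training itself.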

acakici2020 avatar Jan 01 '22 22:01 acakici2020

Sure, I will try it after holiday ; ) ty!!

KaitaoQiu avatar Jan 02 '22 03:01 KaitaoQiu