
Using precomputed kernels generates errors

kiramt opened this issue 7 years ago · 2 comments

I am trying to run ThunderSVM with a precomputed kernel, but have run into a few problems. Could you take a look?

My data works fine with libSVM:

kira@SYN-LAP-011:~/Software/libsvm-3.22$ ./svm-train -c 1 -t 4 /tmp/train_kernel.out 
*.*
optimization finished, #iter = 17
nu = 0.285714
obj = -3.296726, rho = -1.035408
nSV = 11, nBSV = 2
Total nSV = 11
kira@SYN-LAP-011:~/Software/libsvm-3.22$ ./svm-predict /tmp/libsvmkernel.out libsvmkernel.out.model test.out
Accuracy = 85.7143% (6/7) (classification)

If I try to run this directly with ThunderSVM, it complains that the kernel type is unknown:

kira@SYN-LAP-011:~/Software/thundersvm/build$ ./bin/thundersvm-train -c 1 -t 4 /tmp/train_kernel.out
2018-05-11 15:31:44,945 ERROR [default] unknown kernel type
2018-05-11 15:31:44,945 INFO [default] Usage (same as LibSVM): thundersvm [options] training_set_file [model_file]
options:
<output cut here>

If I try to run ThunderSVM via python, training runs (with very different results from libSVM) and prediction crashes:

kira@SYN-LAP-011:~/Software/thundersvm/python$ python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import svm
>>> y,x=svm.svm_read_problem("/tmp/train_kernel.out")
>>> svm.svm_train(y,x,"train_kernel.out.model","-c 1 -t 4")
2018-05-11 15:36:10,051 INFO [default] #classes = 2
2018-05-11 15:36:10,051 INFO [default] working set size = 8
2018-05-11 15:36:10,051 INFO [default] training start
2018-05-11 15:36:10,105 INFO [default] global iter = 0, total local iter = 100001, diff = 2
2018-05-11 15:36:10,150 INFO [default] global iter = 2, total local iter = 100805, diff = -inf
2018-05-11 15:36:10,150 INFO [default] rho = -nan
2018-05-11 15:36:10,150 INFO [default] #sv = 4
2018-05-11 15:36:10,150 INFO [default] #total unique sv = 4
2018-05-11 15:36:10,150 INFO [default] evaluating training score
2018-05-11 15:36:10,160 INFO [default] SVC::predict_label
2018-05-11 15:36:10,160 INFO [default] Accuracy = 0.142857

>>> y,x=svm.svm_read_problem("/tmp/test_kernel.out")
>>> svm.svm_predict(y,x,"train_kernel.out.model","predictions.txt")
Segmentation fault (core dumped)

My data looks like this. train_kernel.out:

1 0:1 1:1.0 2:7.963494e-02 3:3.325665e-01 4:2.397191e-01 5:3.135345e-01 6:7.399337e-02 7:1.665817e-01 8:2.210360e-01 9:2.850396e-01 10:2.655067e-01 11:8.429855e-02 12:1.382558e-01 13:4.368207e-01 14:3.220871e-01
1 0:2 1:7.963494e-02 2:1.0 3:9.836425e-02 4:2.354587e-01 5:1.784186e-01 6:5.793398e-01 7:2.235396e-01 8:2.659173e-01 9:1.812363e-01 10:1.225234e-01 11:9.808508e-02 12:2.184586e-02 13:1.274673e-01 14:1.816464e-01
1 0:3 1:3.325665e-01 2:9.836425e-02 3:1.0 4:2.387480e-01 5:3.546404e-01 6:7.172984e-02 7:6.795441e-02 8:1.233178e-01 9:2.349564e-01 10:2.161432e-01 11:6.885562e-02 12:1.789989e-01 13:6.059155e-01 14:3.945151e-01
1 0:4 1:2.397191e-01 2:2.354587e-01 3:2.387480e-01 4:1.0 5:2.901807e-01 6:1.974833e-01 7:2.415136e-01 8:3.229411e-01 9:2.156197e-01 10:1.794023e-01 11:3.367458e-02 12:7.756815e-02 13:3.578091e-01 14:4.601892e-01
1 0:5 1:3.135345e-01 2:1.784186e-01 3:3.546404e-01 4:2.901807e-01 5:1.0 6:1.911636e-01 7:1.441595e-01 8:2.800171e-01 9:2.736416e-01 10:2.058720e-01 11:7.576426e-02 12:1.654463e-01 13:4.857011e-01 14:3.976108e-01
1 0:6 1:7.399337e-02 2:5.793398e-01 3:7.172984e-02 4:1.974833e-01 5:1.911636e-01 6:1.0 7:2.434326e-01 8:3.355066e-01 9:2.396896e-01 10:1.564234e-01 11:1.060408e-01 12:5.298542e-02 13:1.395370e-01 14:2.225568e-01
1 0:7 1:1.665817e-01 2:2.235396e-01 3:6.795441e-02 4:2.415136e-01 5:1.441595e-01 6:2.434326e-01 7:1.0 8:1.936384e-01 9:2.378984e-01 10:1.098631e-01 11:8.117840e-02 12:1.818148e-01 13:1.642263e-01 14:2.194416e-01
1 0:8 1:2.210360e-01 2:2.659173e-01 3:1.233178e-01 4:3.229411e-01 5:2.800171e-01 6:3.355066e-01 7:1.936384e-01 8:1.0 9:2.441430e-01 10:1.446756e-01 11:1.171163e-01 12:1.005228e-01 13:3.164326e-01 14:2.763877e-01
1 0:9 1:2.850396e-01 2:1.812363e-01 3:2.349564e-01 4:2.156197e-01 5:2.736416e-01 6:2.396896e-01 7:2.378984e-01 8:2.441430e-01 9:1.0 10:2.312299e-01 11:6.747934e-02 12:1.459097e-01 13:3.401782e-01 14:3.229424e-01
1 0:10 1:2.655067e-01 2:1.225234e-01 3:2.161432e-01 4:1.794023e-01 5:2.058720e-01 6:1.564234e-01 7:1.098631e-01 8:1.446756e-01 9:2.312299e-01 10:1.0 11:1.391822e-01 12:8.440796e-02 13:2.577245e-01 14:2.511043e-01
1 0:11 1:8.429855e-02 2:9.808508e-02 3:6.885562e-02 4:3.367458e-02 5:7.576426e-02 6:1.060408e-01 7:8.117840e-02 8:1.171163e-01 9:6.747934e-02 10:1.391822e-01 11:1.0 12:3.511793e-02 13:5.367907e-02 14:1.007229e-01
1 0:12 1:1.382558e-01 2:2.184586e-02 3:1.789989e-01 4:7.756815e-02 5:1.654463e-01 6:5.298542e-02 7:1.818148e-01 8:1.005228e-01 9:1.459097e-01 10:8.440796e-02 11:3.511793e-02 12:1.0 13:1.261742e-01 14:1.568662e-01
0 0:13 1:4.368207e-01 2:1.274673e-01 3:6.059155e-01 4:3.578091e-01 5:4.857011e-01 6:1.395370e-01 7:1.642263e-01 8:3.164326e-01 9:3.401782e-01 10:2.577245e-01 11:5.367907e-02 12:1.261742e-01 13:1.0 14:5.602320e-01
0 0:14 1:3.220871e-01 2:1.816464e-01 3:3.945151e-01 4:4.601892e-01 5:3.976108e-01 6:2.225568e-01 7:2.194416e-01 8:2.763877e-01 9:3.229424e-01 10:2.511043e-01 11:1.007229e-01 12:1.568662e-01 13:5.602320e-01 14:1.0

test_kernel.out:

1 0:1 1:1.0 2:3.886128e-01 3:3.005489e-02 4:6.381688e-01 5:6.182784e-01 6:6.466103e-03 7:0.000000e+00
1 0:2 1:3.886128e-01 2:1.0 3:3.515368e-02 4:4.655570e-01 5:6.892350e-01 6:1.685364e-02 7:0.000000e+00
1 0:3 1:3.005489e-02 2:3.515368e-02 3:1.0 4:0.000000e+00 5:1.245842e-01 6:3.947336e-01 7:0.000000e+00
1 0:4 1:6.381688e-01 2:4.655570e-01 3:0.000000e+00 4:1.0 5:7.107565e-01 6:0.000000e+00 7:0.000000e+00
1 0:5 1:6.182784e-01 2:6.892350e-01 3:1.245842e-01 4:7.107565e-01 5:1.0 6:5.534643e-02 7:0.000000e+00
1 0:6 1:6.466103e-03 2:1.685364e-02 3:3.947336e-01 4:0.000000e+00 5:5.534643e-02 6:1.0 7:0.000000e+00
0 0:7 1:0.000000e+00 2:0.000000e+00 3:0.000000e+00 4:0.000000e+00 5:0.000000e+00 6:0.000000e+00 7:1.0

kiramt avatar May 11 '18 14:05 kiramt

Thanks for the feedback! We have fixed the issue; if you still run into problems, please let us know. However, we suggest using the RBF/linear/sigmoid/polynomial kernels instead: pre-computation doesn't help much in ThunderSVM, for two reasons.

First, GPU memory is usually much smaller than host memory, and the precomputed kernel matrix is often too large to fit in GPU memory. Second, on-the-fly kernel value computation in ThunderSVM is fairly fast (thanks to batch processing and value reuse), so precomputation is rarely necessary.
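To put a rough number on the memory point (my arithmetic, not from the thread): a dense Gram matrix for n training instances holds n² kernel values, so even float32 storage quickly outgrows a GPU:

```python
# Back-of-envelope: size of a dense precomputed Gram matrix.
n = 100_000              # illustrative training-set size
bytes_per_value = 4      # float32
gib = n * n * bytes_per_value / 2**30
print(f"{gib:.2f} GiB")  # ~37 GiB, beyond most GPUs' memory
```

By contrast, the raw feature matrix for the same n instances is only linear in n, which is why computing kernel values on the fly scales further.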

zeyiwen avatar May 14 '18 04:05 zeyiwen

Thank you for the fix and the advice. To test my setup I am using a dataset I already had, which happened to come with a precomputed kernel, but in practice I expect to use a custom kernel.
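Since a custom kernel is the eventual goal here, a quick sanity check on the Gram matrix before writing it out can save debugging later: a valid kernel matrix should be square, symmetric, and positive semi-definite (Mercer's condition). A hypothetical helper (not part of either library), assuming the matrix is a NumPy array:

```python
# Hypothetical sanity checks for a custom kernel (Gram) matrix.
import numpy as np

def check_kernel(K, tol=1e-8):
    """Assert K is a plausible Mercer kernel matrix."""
    K = np.asarray(K, dtype=float)
    assert K.shape[0] == K.shape[1], "Gram matrix must be square"
    assert np.allclose(K, K.T, atol=tol), "kernel must be symmetric"
    # Smallest eigenvalue >= 0 (up to tolerance) => positive semi-definite.
    assert np.linalg.eigvalsh(K).min() >= -tol, "kernel must be PSD"
    return True

check_kernel(np.array([[1.0, 0.5], [0.5, 1.0]]))
```

A matrix failing these checks can make the optimizer misbehave in exactly the way seen above (diverging objective, rho = -nan), so it is worth ruling out first.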

kiramt avatar May 14 '18 07:05 kiramt