
The probability from sklearn is different from thundersvm

Open ZesenChen opened this issue 6 years ago • 4 comments

Hello, I am trying to use thundersvm as a replacement for SVC in sklearn, but I found that the probability output is somewhat poor (although the binary classification itself is as good as sklearn's). I wrote the following code as a test. The results from thundersvm and sklearn are different. Can you give me some advice?

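from sklearn.svm import SVC
import thundersvm
import numpy as np

# Random features; the first 300 samples are labeled 1, the rest 0,
# so the first training label is 1.
a = np.random.rand(1000,10)
b = np.zeros((1000,))
b[:300] = 1
clf1 = SVC(probability=True)
clf2 = thundersvm.SVC(probability=True)

# Fit both classifiers on the same data
clf1.fit(a,b)
clf2.fit(a,b)

# Compare the predicted probabilities on 10 new random samples
c = np.random.rand(10,10)
print(clf1.predict_proba(c))
print(clf2.predict_proba(c))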

ZesenChen, Apr 13 '19 14:04

I figured it out: I made a silly mistake. The column order of the probabilities is different from sklearn.svm.SVC. sklearn's SVC returns the columns as [probability of class 0, probability of class 1]. In thundersvm.SVC, the column order depends on the label of the first training example. I use it in a multi-label learning experiment, so the result was poor.
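For readers hitting the same mismatch, here is a minimal sketch of one possible workaround. It assumes, based only on the behavior described in this thread, that the probability columns follow the order in which labels first appear in the training targets; this has not been checked against every thundersvm version and should be verified. The helper name `reorder_proba_to_sorted_labels` is hypothetical and is not part of thundersvm or sklearn. It rearranges the columns into sorted label order, which is the order sklearn uses.

```python
import numpy as np

def reorder_proba_to_sorted_labels(proba, y_train):
    """Reorder probability columns from first-appearance label order
    (assumed here, per this thread) to sorted label order."""
    proba = np.asarray(proba)
    # Sorted unique labels and the index where each one first appears
    _, first_idx = np.unique(y_train, return_index=True)
    # appearance_order[j] is the sorted-label index of column j
    appearance_order = np.argsort(first_idx)
    # Invert the permutation: for each sorted label, find its column
    column_for_sorted_label = np.argsort(appearance_order)
    return proba[:, column_for_sorted_label]

# Toy check: labels first appear in the order 1, 0, 2,
# so the input columns are assumed to be [P(1), P(0), P(2)].
y = np.array([1, 1, 0, 0, 2])
p = np.array([[0.7, 0.2, 0.1]])
print(reorder_proba_to_sorted_labels(p, y))
```

With the test script above, this would be applied as `reorder_proba_to_sorted_labels(clf2.predict_proba(c), b)` before comparing against `clf1.predict_proba(c)`.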

ZesenChen, Apr 13 '19 15:04

Thanks for pointing this out! We will consider making this consistent with sklearn in a future release.

zeyiwen, Apr 15 '19 02:04

It is really misleading to use the first training example to determine the order of the probability columns.

wenlibin02, Oct 25 '19 11:10

That is true. You are more than welcome to contribute and improve thundersvm.

zeyiwen, Oct 28 '19 03:10