predict_proba output from thundersvm differs from sklearn
Hello, I am trying to use thundersvm as a replacement for SVC in sklearn, but I found that the probability results are noticeably worse (although the binary classification results are as good as sklearn's). I wrote the following code as a test. The outputs of thundersvm and sklearn differ. Could you give me some advice?
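from sklearn.svm import SVC
import thundersvm
import numpy as np
# Random training data: 1000 samples, 10 features
a = np.random.rand(1000,10)
# Binary labels: the first 300 samples are class 1, the rest class 0
b = np.zeros((1000,))
b[:300] = 1
# Same estimator settings for both libraries, with probability estimates enabled
clf1 = SVC(probability=True)
clf2 = thundersvm.SVC(probability=True)
clf1.fit(a,b)
clf2.fit(a,b)
# Compare the predicted probabilities on 10 random test samples
c = np.random.rand(10,10)
print(clf1.predict_proba(c))
print(clf2.predict_proba(c))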
I figured it out: I made a silly mistake. The column order of the probabilities differs from sklearn.svm.SVC. sklearn's SVC returns the columns as [probability of class 0, probability of class 1], whereas the column order of thundersvm.SVC depends on the label of the first training example. I use it in a multi-label learning experiment, which is why the results looked poor.
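For anyone hitting the same problem, here is a rough sketch of a workaround that rearranges the probability columns into sklearn's sorted-label order. It relies on an assumption that is not confirmed in this thread: that the columns follow the order in which labels first appear in the training targets (the thread only says the order depends on the first training example). The helper `reorder_proba_columns` is a hypothetical name, not part of thundersvm or sklearn.

```python
import numpy as np

def reorder_proba_columns(proba, y_train):
    """Reorder probability columns into sorted-label order (sklearn's convention).

    ASSUMPTION: the columns of `proba` follow the order in which labels
    first appear in `y_train`. Verify this against your own output.
    """
    y_train = np.asarray(y_train)
    # Sorted unique labels, plus the index where each label first occurs
    classes, first_idx = np.unique(y_train, return_index=True)
    # For each sorted label, its rank in first-appearance order,
    # i.e. the column it is assumed to occupy in `proba`
    assumed_col = np.argsort(np.argsort(first_idx))
    return np.asarray(proba)[:, assumed_col], classes
```

With the test script above this would be called as `fixed, classes = reorder_proba_columns(clf2.predict_proba(c), b)`. Since `b` starts with label 1, the helper swaps the two columns under the stated assumption. It is worth checking the result against `clf2.predict(c)` before relying on it.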
Thanks for pointing this out! We will consider making this consistent with sklearn in a future release.
It is really misleading to use the first training example to determine the order of the probability columns.
That is true. You are more than welcome to contribute and improve thundersvm.