
Make flag --log_multi work together with --probabilities

fshabashev opened this issue 4 years ago · 6 comments

Short description

Right now Vowpal Wabbit can only output probabilities with the --oaa or --csoaa_ldf=mc multiclass options.

Add the possibility to predict probabilities when the --log_multi multiclass option is used.

How this suggestion will help you/others

--log_multi is very useful when there are a lot of classes, because other options like --oaa are very slow in this case. But it is also important to predict probabilities, so that users of the library can control prediction thresholds.

Possible solution/implementation details

Example/links if any

This code gives an error:

$ vw data.txt  --log_multi 1873  --probabilities -f model.vw
final_regressor = model.vw
Num weight bits = 27
learning rate = 0.5
initial_t = 0
power_t = 0.5
decay_learning_rate = 1
Error: unrecognised option '--probabilities'

This command works, but it is slow: $ vw data.txt --oaa 1873 --probabilities -f model.vw

fshabashev avatar Oct 22 '20 13:10 fshabashev

There's something semantically unclear here: the log_multi reduction is designed to make choices in O(log(number of classes)) time, but --probabilities inherently requires O(number of classes) computational time. So implementing something like this implies an exponential slowdown and requires some new algorithms to be implemented.
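
To make the cost gap concrete, here is a minimal pure-Python toy (not VW code; the complete binary tree, heap layout, and sigmoid node scorers are assumptions for illustration only). Predicting a single class walks one root-to-leaf path and evaluates O(log K) nodes, while producing a full probability vector has to reach every one of the K leaves:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy complete binary tree over K = 2**depth classes in heap layout:
# node i has children 2*i+1 (left) and 2*i+2 (right); sigmoid(node_scores[i])
# plays the role of P(go right | reached node i).

def predict_single(node_scores, depth):
    # One root-to-leaf walk: O(depth) = O(log K) node evaluations.
    node = 0
    for _ in range(depth):
        go_right = sigmoid(node_scores[node]) >= 0.5
        node = 2 * node + (2 if go_right else 1)
    return node - (2 ** depth - 1)          # leaf index -> class id

def predict_distribution(node_scores, depth):
    # Probability of each class = product of branch probabilities on its path.
    # Every one of the K = 2**depth leaves must be visited: O(K) work.
    probs = {0: 1.0}                        # node index -> P(reach node)
    for _ in range(depth):
        nxt = {}
        for node, p in probs.items():
            p_right = sigmoid(node_scores[node])
            nxt[2 * node + 1] = p * (1.0 - p_right)
            nxt[2 * node + 2] = p * p_right
        probs = nxt
    first_leaf = 2 ** depth - 1
    return [probs[first_leaf + c] for c in range(2 ** depth)]

depth = 11                                       # 2**11 = 2048 classes, roughly the 1873 above
scores = [0.1] * (2 ** (depth + 1) - 1)
print(predict_single(scores, depth))             # touches 11 internal nodes
print(sum(predict_distribution(scores, depth)))  # touches all 2048 leaves; sums to ~1.0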

Given the above, what were your thoughts here?

JohnLangford avatar Oct 22 '20 16:10 JohnLangford

Yes, --probabilities requires O(n_classes), but is it possible to output just the maximum probability, so that the user of the library knows the predicted probability? Then the user could tell whether a prediction was made with high confidence (high predicted probability) or low confidence, and in the application code mark all low-confidence predictions as "Cannot classify with high confidence".
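
For example, the application-side logic would be roughly this (a hypothetical sketch; predict_class_and_max_probability is a made-up stand-in for whatever the tool would return, not an existing VW call):

CONFIDENCE_THRESHOLD = 0.8                    # chosen by the application, not by VW

def predict_class_and_max_probability(example):
    # hypothetical stand-in: returns (predicted label, highest class probability)
    return "class_42", 0.61

def classify(example):
    label, max_prob = predict_class_and_max_probability(example)
    if max_prob < CONFIDENCE_THRESHOLD:
        return "Cannot classify with high confidence"
    return label

print(classify("| some features"))            # -> "Cannot classify with high confidence"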

fshabashev avatar Oct 22 '20 18:10 fshabashev

There may be something that can be done here without damaging computational complexity. However, there is a complication: --log_multi trains individual nodes of the tree in a manner which is not probability consistent. See here: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/log_multi.cc#L512 . (It is binary classification consistent.)

(Why? Because that's what the theory suggests is best for classification purposes.)
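
A toy illustration of that distinction (pure Python, not the code behind the link above; the scores and the sigmoid read-off are assumptions): two node scorers that always make the same routing decision are interchangeable as binary classifiers, yet reading their raw scores through a sigmoid and chaining along a path gives very different "probabilities", which is why uncalibrated log_multi node outputs cannot simply be multiplied into class probabilities:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Two toy node scorers along one root-to-leaf path (3 internal nodes) that
# always agree on the decision (the sign), so as binary classifiers they are
# equivalent...
raw_scores_a = [0.2, 0.3, 0.1]
raw_scores_b = [4.0, 6.0, 2.0]    # same signs, much larger magnitudes

assert [s > 0 for s in raw_scores_a] == [s > 0 for s in raw_scores_b]

# ...but treating sigmoid(score) as P(branch) and chaining the path yields
# very different leaf "probabilities" for the exact same routing behaviour:
p_leaf_a = math.prod(sigmoid(s) for s in raw_scores_a)   # ~0.17
p_leaf_b = math.prod(sigmoid(s) for s in raw_scores_b)   # ~0.86
print(p_leaf_a, p_leaf_b)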

JohnLangford avatar Oct 23 '20 18:10 JohnLangford

For me, returning some kind of score is fine; it could be calibrated by the user.

fshabashev avatar Oct 26 '20 15:10 fshabashev

Does --plt work for you instead? (This is a recent addition.) This provides sublinear prediction time for multilabel classification using a proper scoring rule, so the scores would make more sense.
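
For a rough sense of how such a tree stays sublinear at prediction time (a conceptual pure-Python toy under an assumed tree layout and thresholding scheme, not the actual --plt implementation or its options): because each node is trained with logistic loss to estimate the probability that its subtree contains a relevant label, the search can prune any subtree whose chained estimate already falls below the threshold, so a typical query touches far fewer than K leaves:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def plt_style_predict(node_scores, depth, threshold):
    # Each node carries sigmoid(score) = P(this subtree holds a relevant label
    # | its parent does). Descend only while the chained estimate stays above
    # the threshold; pruned subtrees are never evaluated.
    first_leaf = 2 ** depth - 1
    positives = []                              # (class id, estimated probability)
    stack = [(0, sigmoid(node_scores[0]))]      # (heap index, chained estimate)
    while stack:
        node, p = stack.pop()
        if p < threshold:
            continue                            # prune this whole subtree
        if node >= first_leaf:
            positives.append((node - first_leaf, p))
            continue
        for child in (2 * node + 1, 2 * node + 2):
            stack.append((child, p * sigmoid(node_scores[child])))
    return positives

depth = 11                                      # K = 2**11 = 2048 labels
n_nodes = 2 ** (depth + 1) - 1
scores = [-2.0] * n_nodes                       # most subtrees look irrelevant
node = 0
while node < 2 ** depth - 1:                    # make one root-to-leaf path likely
    scores[node] = 4.0
    node = 2 * node + 1
scores[node] = 4.0                              # ...including its leaf
print(plt_style_predict(scores, depth, 0.5))    # one label found; only a few dozen of the 4095 nodes touched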

JohnLangford avatar Oct 29 '20 16:10 JohnLangford

--plt looks very useful, but in my case I have different thresholds for different classes, and it looks like --plt cannot be used for that because it doesn't return probabilities directly.

fshabashev avatar Nov 10 '20 21:11 fshabashev