Make flag --log_multi work together with --probabilities
Short description
Right now Vowpal Wabbit can only output probabilities with the --oaa or --csoaa_ldf=mc multiclass options.
Add the possibility to predict probabilities with the --log_multi multiclass option as well.
How this suggestion will help you/others
--log_multi is very useful when there are many classes, because other options like --oaa are very slow in that case.
But it is also important to predict probabilities, so that users of the library can control prediction thresholds.
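For example, once per-class probabilities are available in a predictions file, the application-side thresholding could look roughly like the Python sketch below. It assumes the space-separated label:probability format that -p writes when --probabilities is enabled; the thresholds and file name are made up for illustration.

# Hypothetical per-class confidence thresholds chosen by the application.
THRESHOLDS = {1: 0.8, 2: 0.6}   # classes not listed fall back to DEFAULT_THRESHOLD
DEFAULT_THRESHOLD = 0.7

def parse_line(line):
    """Parse one prediction line of 'label:probability' pairs into a dict."""
    probs = {}
    for token in line.split():
        label, prob = token.split(":")
        probs[int(label)] = float(prob)
    return probs

# probs.txt is assumed to come from a vw run with --probabilities and -p probs.txt.
with open("probs.txt") as f:
    for line in f:
        probs = parse_line(line)
        label, prob = max(probs.items(), key=lambda kv: kv[1])
        if prob >= THRESHOLDS.get(label, DEFAULT_THRESHOLD):
            print(label)
        else:
            print("Cannot classify with high confidence")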
Possible solution/implementation details
Example/links if any
This command gives an error:
$ vw data.txt --log_multi 1873 --probabilities -f model.vw
final_regressor = model.vw
Num weight bits = 27
learning rate = 0.5
initial_t = 0
power_t = 0.5
decay_learning_rate = 1
Error: unrecognised option '--probabilities'
This command works, but it is slow:
$ vw data.txt --oaa 1873 --probabilities -f model.vw
There's something semantically unclear here: the log_multi reduction is designed to make choices in O(log(number of classes)) time, but --probabilities inherently requires O(number of classes) computation. So implementing something like this implies an exponential slowdown, and some new algorithms would need to be implemented.
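As a rough illustration with the 1873 classes from the example above (a back-of-the-envelope comparison, not a measurement of the actual implementation):

import math

n_classes = 1873
tree_decisions = math.ceil(math.log2(n_classes))  # ceil(log2(1873)) = 11 node evaluations per prediction
full_distribution = n_classes                     # --probabilities needs a score for every class

print(tree_decisions, full_distribution)          # 11 vs 1873, roughly a 170x gap per prediction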
Given the above, what were your thoughts here?
Yes, --probabilities requires O(n_classes), but is it possible to output just the maximum probability, so that the user of the library could know the predicted probability? Then the user could tell whether a prediction was made with high confidence (high predicted probability) or low confidence, and in the application code mark all low-confidence predictions as "Cannot classify with high confidence".
There may be something that can be done here without damaging computational complexity. However, there is a complication: --log_multi trains individual nodes of the tree in a manner which is not probability consistent. See here: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/log_multi.cc#L512 . (It is binary classification consistent.)
(Why? Because that's what the theory suggests is best for classification purposes.)
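If only the confidence of the single predicted class is needed, one cheap option would be to multiply the per-node decision probabilities along the root-to-leaf path, which stays O(log(number of classes)). Below is a minimal sketch of that idea in Python, assuming each node exposes a sigmoid-style probability for the branch it takes; this is not how log_multi currently trains its nodes, and because of the consistency issue above the product would be an uncalibrated confidence score rather than a true probability.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw margins produced by the binary classifier at each node
# on the root-to-leaf path taken for one example (made-up numbers).
path_margins = [2.1, 1.4, 0.7, 3.0]

# Confidence of the chosen leaf = product of the per-node decision probabilities.
# With a balanced tree this is O(log K) work instead of O(K).
confidence = 1.0
for margin in path_margins:
    confidence *= sigmoid(abs(margin))   # probability of the branch actually taken

print(confidence)   # ~0.45 with these made-up numbers; a score the user could calibrate or threshold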
For me, returning some kind of score would be fine; it could be calibrated by the user.
Does --plt work for you instead? (This is a recent addition.) This provides sublinear prediction time for multilabel classification using a proper scoring rule, so the scores would make more sense.
--plt looks very useful, but in my case I have different thresholds for different classes, and it looks like --plt cannot be used for that because it doesn't return probabilities directly.