Results 26 comments of John

The current approach builds a tree over a discretized equidistant set of actions, then creates a continuous function by randomizing with a kernel. The variable interval here doesn't map neatly...

What is your evaluation method here? With holdout_off and multiple passes, it looks like you are just using (essentially) the training performance, which of course is subject to overfitting. In...

The other obvious possibility is that your data source is nonstationary. Is that plausible? In particular, if you permute your training events and repeat the experiments, what happens?
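A minimal sketch of the permutation check suggested above, assuming the training data is a text file with exactly one event per line (multi-line example formats would need to be shuffled as blocks instead). The function names and the fixed seed are illustrative and not part of VW.

```python
import random


def permute_events(events, seed=0):
    """Return a shuffled copy of `events`, leaving the input untouched."""
    shuffled = list(events)
    # A dedicated Random instance keeps the shuffle reproducible per seed.
    random.Random(seed).shuffle(shuffled)
    return shuffled


def permute_file(src_path, dst_path, seed=0):
    """Write the lines of src_path to dst_path in a random order.

    Assumes one event per line; this would break formats where a single
    example spans several lines.
    """
    with open(src_path, encoding="utf-8") as src:
        lines = src.read().splitlines()
    with open(dst_path, "w", encoding="utf-8") as dst:
        for line in permute_events(lines, seed):
            dst.write(line + "\n")
```

Repeating the experiment on a few differently seeded permutations and comparing against the original ordering gives a rough sense of whether event order (and hence nonstationarity) matters.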

Is the new model's perf worse than the old model's on the test set when both old and new are trained on permuted data? If so, it's interesting. If not,...

Predicting the probability of the action is about the only approach to solving the "gah, we didn't record the probabilities" problem. However, my experience with this approach is that it's fairly...

There isn't a canned way to do it, but you could of course invoke VW twice: use multiclass prediction with probabilities, then do contextual bandits.
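The glue between the two VW invocations could look roughly like the sketch below. It assumes the first pass has already produced, for each logged event, an estimated probability per action (parsing VW's prediction output is not shown), and it writes lines in VW's contextual bandit text format `action:cost:probability | features`. The helper name `to_cb_line` and the `min_prob` floor are hypothetical additions for illustration; the floor is a common precaution against tiny estimated probabilities, not something the comment above prescribes.

```python
def to_cb_line(action, cost, est_probs, features, min_prob=0.01):
    """Build one VW contextual-bandit example line.

    action    -- the logged action (VW cb actions are numbered from 1)
    cost      -- the observed cost for that action
    est_probs -- dict mapping action -> probability estimated by the
                 first (multiclass) pass; hypothetical input structure
    features  -- the feature string to place after the '|'
    min_prob  -- assumed floor so a tiny estimate does not produce an
                 enormous importance weight
    """
    p = max(est_probs[action], min_prob)
    return f"{action}:{cost}:{p:.4f} | {features}"


def to_cb_lines(logged_events, min_prob=0.01):
    """Convert (action, cost, est_probs, features) tuples to cb lines."""
    return [to_cb_line(a, c, probs, feats, min_prob)
            for a, c, probs, feats in logged_events]
```

The resulting lines would then be fed to the second VW run as contextual bandit training data.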

Matt, it looks like there is some significant 32-bit vs. 64-bit weirdness in the LDA code. Do you know what it is? -John On 06/03/2011 05:06 PM, yarikoptic wrote: >...

Updated. Minor differences in floating point numbers should be expected, because we use -ffast-math when compiling. -John On Wed, Jun 29, 2011 at 7:45 PM, Matt Hoffman wrote: > Figured...

The easiest reasonable solution seems to be creating a custom learning-to-search application. To do this, you would modify and/or create a new task (like those here: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/core/src/reductions/search/search_sequencetask.cc or in the...

This may be a partial answer: Inside CCB, the system needs to actually choose an action according to its distribution in order to later be able to do a CB...