John
For contextual bandit training generally, you do not need the "actual predictions" to update the model; instead, you need the chosen action and the probability with which that action was taken....
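To illustrate why the chosen action and its probability are enough, here is a minimal sketch of an inverse-propensity-scored (IPS) cost estimate, one common way a contextual bandit update can be built from logged data. It assumes an observed cost is logged alongside the action and probability. The names `LoggedEvent`, `ips_cost_estimates`, and `to_vw_cb_line` are hypothetical helpers for illustration and not part of any library. The second helper only formats an event in the `action:cost:probability | features` text layout used for contextual bandit examples.

```python
from typing import List, Tuple

# Hypothetical record of one logged interaction:
# (chosen_action, observed_cost, probability_of_chosen_action).
# Actions are 1-indexed in this sketch.
LoggedEvent = Tuple[int, float, float]


def ips_cost_estimates(event: LoggedEvent, num_actions: int) -> List[float]:
    """Return an IPS cost estimate for every action from one logged event.

    Only the chosen action gets a nonzero estimate (cost / probability);
    the model's predictions for the other actions are never needed.
    """
    action, cost, prob = event
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    if not 1 <= action <= num_actions:
        raise ValueError("action must be in 1..num_actions")
    estimates = [0.0] * num_actions
    estimates[action - 1] = cost / prob
    return estimates


def to_vw_cb_line(event: LoggedEvent, features: str) -> str:
    """Format one event as 'action:cost:probability | features'."""
    action, cost, prob = event
    return f"{action}:{cost}:{prob} | {features}"
```

Dividing by the logged probability is what keeps the estimate unbiased under the logging policy in this sketch, which is why that probability has to be recorded at decision time.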
I can reproduce this on the command line, and it does seem like an inconsistency in reporting. I'll try to find some time with Jack to work through the precise source.
Jack and I tracked this down. The issue is that "--all_slots_loss" isn't being used when running on the 3rd example (and it's not being carried over in the file). If...
Can you quantify how much memory the model is using as a function of the number of classes? If only a small number of the parameters are nonzero you...
Closing for now, but reopen if you want to pursue.
W.r.t. (1), I'm not surprised to see that retraining tends to be helpful. Online learning algorithms are, to some extent, designed to forget the past in the process of adapting...