vowpal_wabbit
Incremental Training Best Practice
Description
I believe the ease of incremental training is one highlight of VW. However, the incremental training best practice is not obvious on the documentation page. (Please kindly point me to the right page if there is already one.)
I am looking for answers to a couple of questions here. I am using VW 8.6.1.
1. I try to incrementally train my model with new labels and their corresponding new features (most of the new labels and features do not overlap with the previously trained data). However, as I add more labels, the model's F1 scores drop significantly. I had to retrain the model on all the data it had ever seen to recover the F1 scores. Is this the expected way to do incremental training when introducing new labels? In the figure, "no data replay" shows the F1 scores without retraining, and "with data replay" shows them with retraining.
2. I was using the `csoaa` reduction, and the documentation says I should specify the number of labels before training. However, the incremental training step seems able to add new classifiers for the newly introduced labels, as shown by the high F1 scores I got. Is this expected behavior or a bug?
Any feedback is appreciated. Thank you!
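For context, the incremental workflow I have in mind looks roughly like the following sketch (file names and the label count are placeholders, not my exact commands):

```shell
# Initial training: csoaa with an assumed upper bound of 10 labels.
vw --csoaa 10 initial_data.txt -f model.vw --save_resume

# Incremental step: load the saved model and continue training on new data
# (new labels/features that mostly do not overlap with the initial data).
vw -i model.vw new_data.txt -f model_v2.vw --save_resume

# Evaluation on a held-out set (-t disables learning).
vw -i model_v2.vw -t test_data.txt -p predictions.txt
```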
W.r.t. (1), I'm not surprised to see that retraining tends to be helpful. Online learning algorithms are, to some extent, designed to forget the past in the process of adapting to the present.
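One straightforward way to realize the "with data replay" variant is to mix the previously seen examples with the new ones and retrain from scratch, rather than continuing from the old model. A sketch, assuming plain-text VW data files with placeholder names:

```shell
# Data replay: shuffle old and new examples together, then retrain from
# scratch so the learner does not "forget" the earlier labels.
cat old_data.txt new_data.txt | shuf > replay_data.txt
vw --csoaa 10 replay_data.txt -f model_replayed.vw
```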
W.r.t. (2), there are two different notions of csoaa: one where you need to specify the number of labels up front, and one where you specify a different set of features for each of a variable set of labels. Which do you have in mind? (What are the exact flags?)
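For reference, the two variants differ in both the flags and the input format. A sketch (the label count, costs, and feature names are made up for illustration):

```shell
# Variant 1: plain csoaa -- the number of labels (here 4) is fixed up front.
# Each example lists label:cost pairs followed by one shared feature set:
#   1:0.0 2:1.0 3:1.0 4:1.0 | price:0.23 sqft:0.25
vw --csoaa 4 train.txt -f model.vw

# Variant 2: label-dependent features (csoaa_ldf) -- a variable set of
# labels per example, each label carrying its own features; examples are
# multiline blocks separated by blank lines:
#   1:0.0 | features_for_label_1
#   2:1.0 | features_for_label_2
vw --csoaa_ldf multiline train_ldf.txt -f model_ldf.vw
```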