
IncrementalTrainer : Example or guide?

Open · C-Compton opened this issue 3 years ago · 1 comment

I'm looking to reinforce an existing model. It looks like this interface is designed for this behavior. However, I'm at a loss as to how I should implement the methods that need to be overridden.

Thanks!

C-Compton, Jan 12 '22 19:01

Unfortunately, the IncrementalTrainer interface is not used anywhere in Tribuo at the moment. We added it with the intention of supporting online learning/retraining where the feature and output domains don't change, but I think our users probably want to be able to change both of those (see #45). Designing something that allows that while preserving Tribuo's internal invariants on feature numbering is tricky, and we've not figured it out yet.

Adding support for it to something like LinearSGDModel will be the easiest path, though we'll need to modify the training code to store the gradient optimizer parameters in models (which they currently don't) to allow resuming training. As this changes the format on disk, we're going to work on the protobuf-based serialization mechanism first, which will give us the necessary flexibility in serialization. There are also open questions about how to thread through the model and data provenance when models can be retrained multiple times, as the model provenance is no longer immutable.
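To make the point about optimizer state concrete, here is a minimal plain-Java sketch of a linear regressor trained with AdaGrad that keeps its per-feature gradient accumulators alongside the weights. It is not Tribuo code: `ResumableLinearModel` and all of its methods are hypothetical names invented for illustration, and the feature dimension is fixed at construction, mirroring the "feature domain doesn't change" restriction mentioned above.

```java
/**
 * Hypothetical illustration only; this class is NOT part of Tribuo.
 * A linear regressor trained with AdaGrad that stores the optimizer
 * state next to the weights, so a later call to train() resumes
 * from where the previous call stopped.
 */
final class ResumableLinearModel {
    private static final double EPSILON = 1e-8;

    private final double[] weights;
    // AdaGrad state: running sum of squared gradients for each feature.
    private final double[] gradSquaredSums;
    private final double learningRate;

    ResumableLinearModel(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures];
        this.gradSquaredSums = new double[numFeatures];
        this.learningRate = learningRate;
    }

    /** Dot product of the weights and the feature vector. */
    double predict(double[] x) {
        double sum = 0.0;
        for (int j = 0; j < weights.length; j++) {
            sum += weights[j] * x[j];
        }
        return sum;
    }

    /**
     * Runs the given number of passes of per-example squared-loss updates.
     * Because the accumulators persist between calls, calling this twice
     * continues the same optimization trajectory.
     */
    void train(double[][] features, double[] targets, int epochs) {
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < features.length; i++) {
                double error = predict(features[i]) - targets[i];
                for (int j = 0; j < weights.length; j++) {
                    double g = error * features[i][j];
                    gradSquaredSums[j] += g * g;
                    weights[j] -= learningRate * g
                            / (Math.sqrt(gradSquaredSums[j]) + EPSILON);
                }
            }
        }
    }

    /** Defensive copy of the learned weights. */
    double[] getWeights() {
        return weights.clone();
    }

    /** Defensive copy of the optimizer state that would need serializing. */
    double[] getOptimizerState() {
        return gradSquaredSums.clone();
    }
}
```

In this sketch, training for 5 epochs and then another 5 follows the same update sequence as a single 10-epoch run, because the accumulators survive between calls. If only the weights were saved, the second call would restart AdaGrad with fresh accumulators and take much larger steps, which is why the on-disk format has to grow to hold the optimizer state.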

In our internal deployments, when we need to deal with drift, we've used the provenance system to figure out exactly what the model was and then trained a fresh one completely from scratch on the new data (and the old data if necessary), using the hyperparameters and input pipeline recovered from the provenance. This works for all models, whereas incremental training isn't really possible with the tree algorithms we have implemented, or those available in XGBoost.

Craigacp, Jan 12 '22 22:01