Adding ExpertGate
I'm interested in adding ExpertGate to Avalanche. A few questions/thoughts I wanted to clarify before moving forward:
- While the original ExpertGate model uses a pretrained AlexNet, this limits ExpertGate to whatever task PyTorch's AlexNet was pretrained for (I think it's vision tasks?). Should this implementation allow users to feed in whichever model they'd like? Or am I grossly misunderstanding something here?
- I imagine ExpertGate would be implemented as a `Model`? I can also view it as a `Strategy` where you feed in your choice of model (see point above). Although, my gut says it should be a `Model`.
- I agree. I would define a general component and leave AlexNet as the default. Keep in mind that we have the CL Baselines repo to host reproducible experiments.
- You can check how we implement Progressive Neural Networks in Avalanche. Basically, we have a simple API to define dynamic modules that are expanded over time. Then, the ExpertGate strategy will be the Naive strategy + an ExpertGate model.
I'm assigning this to you. Thanks for the help!
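To make the "dynamic modules expanded over time" idea concrete, here is a minimal plain-PyTorch sketch (not Avalanche's actual API; the class and method names are illustrative) of a module that grows a new per-task expert each time its adaptation hook is called, in the spirit of the PNN-style expansion mentioned above:

```python
import torch
import torch.nn as nn


class GrowingClassifier(nn.Module):
    """Toy sketch of a dynamically expanding model: one small
    expert (here, a single Linear layer) is added per task."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.in_features = in_features
        self.num_classes = num_classes
        self.experts = nn.ModuleList()  # one expert per experience

    def adaptation(self) -> None:
        # Called once before training on each new experience:
        # grow the model by one expert column.
        self.experts.append(nn.Linear(self.in_features, self.num_classes))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Route the input to the expert for the given task.
        return self.experts[task_id](x)


model = GrowingClassifier(in_features=8, num_classes=2)
model.adaptation()  # experience 0 arrives -> expert 0
model.adaptation()  # experience 1 arrives -> expert 1
out = model(torch.randn(4, 8), task_id=1)
print(len(model.experts), out.shape)  # 2 experts, output of shape (4, 2)
```

An ExpertGate-style model would additionally keep a gating mechanism that picks the expert at inference time, but the expansion pattern is the same.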
Reflecting on the first draft of the implementation, there are a couple of things I think I'm missing and wanted to pick your (anyone's) brain about:
- What triggers the change in the optimizer's parameters? I don't think the optimizer is updating its parameters, and I know I don't have to do it manually (at least according to the documentation).
- What is the major difference between the `model_adaptation` method and the `before_training_exp` method? I imagine the former triggers something else (perhaps parameter updates for the optimizer?)
- The optimizer is automatically reset after each experience in the `Supervised` template (avalanche/training/templates/supervised.py). You can check the `make_optimizer` method in the supervised template. In this way, if your model expands/changes its parameters after each experience, the change is taken into account by the newly created optimizer. This happens because `model_adaptation` is called before `make_optimizer` in the `base_sgd` template, from which the `supervised` template inherits.
- `before_training_exp` is simply a callback that lets you execute some code before training on each experience. `model_adaptation` allows the model to be dynamically modified, and in the `base_sgd` template it is called after each experience. The default version of this method simply calls the `adaptation` method of the model, if defined. This way, you can build custom models which do custom things during adaptation. One example is the multi-head model: on each `adaptation` call, the model instantiates a new head for the incoming task.
Let me know if the interplay between the `base_sgd` and `supervised` templates is not clear, or if you need further help in understanding what `model_adaptation` and `make_optimizer` do.
Thanks,
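The ordering described above (adapt the model first, then rebuild the optimizer) can be sketched in a few lines of plain PyTorch. This is only an illustration of why the order matters; the function names below mimic the template methods but are standalone stand-ins, not Avalanche's real internals:

```python
import torch
import torch.nn as nn

# A tiny "model" that grows: a list of heads, one per experience.
model = nn.ModuleList([nn.Linear(4, 2)])


def model_adaptation(model: nn.ModuleList) -> None:
    # Stand-in for the template's adaptation step: e.g. a multi-head
    # model adding a new head for the incoming task.
    model.append(nn.Linear(4, 2))


def make_optimizer(model: nn.Module) -> torch.optim.Optimizer:
    # Stand-in for the template's optimizer reset: recreated from
    # scratch so it sees the model's *current* parameters.
    return torch.optim.SGD(model.parameters(), lr=0.01)


optimizer = make_optimizer(model)
n_before = sum(len(g["params"]) for g in optimizer.param_groups)

model_adaptation(model)            # the model grows a new head...
optimizer = make_optimizer(model)  # ...so the optimizer is rebuilt
n_after = sum(len(g["params"]) for g in optimizer.param_groups)

print(n_before, n_after)  # 2 vs 4: the new head's weight and bias are now tracked
```

If the optimizer were *not* rebuilt after adaptation, the new head's parameters would never receive gradient updates, which is exactly the failure the template's ordering prevents.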
- I got a chance to look at the source file and I see where the optimizer gets updated!
- I see, thanks for the clarification
After I wrote my initial questions, I did some more documentation and source-code digging and cleared up some of my doubts. Your comments plus my reading helped me figure it out!