Adding ExpertGate
I'm interested in adding ExpertGate to Avalanche. A few questions/thoughts I wanted to clarify before moving forward:
- While the original ExpertGate model uses a pretrained AlexNet, this limits ExpertGate to whatever task PyTorch's AlexNet was pretrained for (I think it's vision tasks?). Should this implementation allow users to feed in whichever model they'd like? Or am I grossly misunderstanding something here?
- I imagine ExpertGate would be implemented as a `Model`? I can also view it as a `Strategy` where you feed in your choice of model (see point above). Although, my gut says it should be a `Model`.
- I agree. I would define a general component and leave AlexNet as the default. Keep in mind that we have the CL Baselines repo to host reproducible experiments.
- You can check how we implement Progressive Neural Networks in Avalanche. Basically, we have a simple API to define dynamic modules that are expanded over time. Then, the ExpertGate strategy will be the Naive strategy + an ExpertGate model.
I'm assigning this to you. Thanks for the help!
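To make the "dynamic modules expanded over time" idea concrete, here is a minimal plain-PyTorch sketch (not Avalanche's actual API; the class and method names are illustrative) of a module that grows a new per-task expert each time its adaptation hook is called, in the spirit of the PNN-style expansion mentioned above:

```python
import torch
import torch.nn as nn


class GrowingClassifier(nn.Module):
    """Toy sketch of a dynamically expanding model: one small
    expert (here, a single Linear layer) is added per task."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.in_features = in_features
        self.num_classes = num_classes
        self.experts = nn.ModuleList()  # one expert per experience

    def adaptation(self) -> None:
        # Called once before training on each new experience:
        # grow the model by one expert column.
        self.experts.append(nn.Linear(self.in_features, self.num_classes))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Route the input to the expert for the given task.
        return self.experts[task_id](x)


model = GrowingClassifier(in_features=8, num_classes=2)
model.adaptation()  # experience 0 arrives -> expert 0
model.adaptation()  # experience 1 arrives -> expert 1
out = model(torch.randn(4, 8), task_id=1)
print(len(model.experts), out.shape)  # 2 experts, output of shape (4, 2)
```

An ExpertGate-style model would additionally keep a gating mechanism that picks the expert at inference time, but the expansion pattern is the same.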
Reflecting on the first draft of the implementation, there are a couple of things I think I'm missing and wanted to pick your (anyone's) brain about:
- What triggers the change in the optimizer's parameters? I don't think the optimizer is updating its parameters, and I know I don't have to do it manually (at least according to the documentation).
- What is the major difference between the `model_adaptation` method and the `before_training_exp` method? I imagine the former triggers something else (perhaps parameter updates for the optimizer?)
- The optimizer is automatically reset after each experience in the `Supervised` template (avalanche/training/templates/supervised.py). You can check the `make_optimizer` method in the supervised template. In this way, if your model expands/changes its parameters after each experience, the change is taken into account by the newly created optimizer. This happens because `model_adaptation` is called before `make_optimizer` in the `base_sgd` template, from which the `supervised` template inherits.
- `before_training_exp` is simply a callback that lets you execute some code before training on each experience. `model_adaptation` allows the model to be dynamically modified, and in the `base_sgd` template it is called after each experience. The default version of this method simply calls the `adaptation` method of the model, if defined. This way, you can build custom models which do custom things during adaptation. One example is the multi-head model: on each `adaptation` call, the model instantiates a new head for the incoming task.
Let me know if the interplay between the `base_sgd` and `supervised` templates is not clear, or if you need further help in understanding what `model_adaptation` and `make_optimizer` do.
Thanks,
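The ordering described above (adapt the model first, then rebuild the optimizer) can be sketched in a few lines of plain PyTorch. This is only an illustration of why the order matters; the function names below mimic the template methods but are standalone stand-ins, not Avalanche's real internals:

```python
import torch
import torch.nn as nn

# A tiny "model" that grows: a list of heads, one per experience.
model = nn.ModuleList([nn.Linear(4, 2)])


def model_adaptation(model: nn.ModuleList) -> None:
    # Stand-in for the template's adaptation step: e.g. a multi-head
    # model adding a new head for the incoming task.
    model.append(nn.Linear(4, 2))


def make_optimizer(model: nn.Module) -> torch.optim.Optimizer:
    # Stand-in for the template's optimizer reset: recreated from
    # scratch so it sees the model's *current* parameters.
    return torch.optim.SGD(model.parameters(), lr=0.01)


optimizer = make_optimizer(model)
n_before = sum(len(g["params"]) for g in optimizer.param_groups)

model_adaptation(model)            # the model grows a new head...
optimizer = make_optimizer(model)  # ...so the optimizer is rebuilt
n_after = sum(len(g["params"]) for g in optimizer.param_groups)

print(n_before, n_after)  # 2 vs 4: the new head's weight and bias are now tracked
```

If the optimizer were *not* rebuilt after adaptation, the new head's parameters would never receive gradient updates, which is exactly the failure the template's ordering prevents.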
- I got a chance to look at the source file and I see where the optimizer gets updated!
- I see, thanks for the clarification
After I wrote my initial questions, I did some more documentation and source-code digging and cleared up some of my doubts. Your comments plus my reading helped me figure it out!