Improvement to BaseStrategy
1. Experience reference
It might be useful, from inside the before_training() (and before_eval()) methods of StrategyPlugin, to have access to the current scenario. This would allow variable initializations such as:
```python
def before_training(self, strategy: 'BaseStrategy', **kwargs):
    nb_cl = np.unique(strategy.experience.scenario.n_classes_per_exp).item()
    self.t = torch.zeros((100, nb_cl))
```
Currently, to reproduce the code above it's necessary to either compute nb_cl and pass it as an argument to the StrategyPlugin constructor, or to access it inside another method such as before_training_exp(), using a condition to initialize it only once.
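For comparison, a minimal sketch of the second workaround (lazy initialization in before_training_exp()); the plugin name is a placeholder and the buffer size of 100 comes from the example above:

```python
import numpy as np
import torch
from avalanche.training.plugins import StrategyPlugin


class MyPlugin(StrategyPlugin):
    def __init__(self):
        super().__init__()
        self.t = None  # deferred: the scenario is not known yet

    def before_training_exp(self, strategy: 'BaseStrategy', **kwargs):
        # Initialize only once, as soon as the first experience is available.
        if self.t is None:
            nb_cl = np.unique(
                strategy.experience.scenario.n_classes_per_exp).item()
            self.t = torch.zeros((100, nb_cl))
```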
I suggest modifying the base strategy so that a reference to the first experience is initialized before before_training() is called.
2. Scenario reference
I found that, most of the time, I access strategy.experience only to get to strategy.experience.scenario. To shorten statements such as

strategy.experience.scenario.classes_in_exp_range(0, tid)

it might be useful to keep a reference to the current experience's scenario inside the base strategy.
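As a small illustration, the shortcut would turn the statement above into something shorter; strategy.scenario is a hypothetical attribute here, not part of the current API:

```python
# Today: reach the scenario through the current experience.
classes = strategy.experience.scenario.classes_in_exp_range(0, tid)

# Proposed: a scenario reference kept directly on the strategy
# (strategy.scenario is hypothetical).
classes = strategy.scenario.classes_in_exp_range(0, tid)
```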
3. before_training() behavior
When a strategy is trained as follows:
```python
for exp in scenario.train_stream:
    train_metrics = cl_strategy.train(exp, num_workers=4)
```
the plugin method before_training() is called before every experience.
When it is trained as:
```python
train_metrics = cl_strategy.train(scenario.train_stream, num_workers=4)
```
before_training() is called only once, before the whole stream.
I think it would be less ambiguous for whoever needs to implement a plugin if the behavior were, in both cases, consistent with that of the second example. This could be regulated by a BaseStrategy argument.
4. Learning Rate Scheduling Support
At the moment, to use a PyTorch learning rate scheduler, it's necessary to pass it to a StrategyPlugin and use it inside after_training_epoch(). I think this would be better handled by the strategy itself rather than by the plugin. I see three possible solutions:
- Just have the BaseStrategy constructor take a scheduler parameter and call scheduler.step() after each epoch.
- Create some sort of wrapper that joins scheduler and optimizer, such that a call to its step() method would call optimizer.step() and, when required, scheduler.step(). This would be transparent with respect to BaseStrategy. Also, with a small modification to the reset_optimizer() method, this would make it easy to reset the optimizer learning rate before each experience. Problem: it's difficult to determine when to call scheduler.step() (a rough sketch of such a wrapper follows the list).
- Simply implement a separate StrategyPlugin to handle this (probably the best solution). The distinction between a StrategyPlugin that implements a CL algorithm and a StrategyPlugin that implements a simple utility should be made clear (either by creating a UtilityPlugin class or by grouping them under a certain import path).
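As an illustration of the second option, here is a minimal sketch of such a wrapper; the class and method names are hypothetical, and deciding where the strategy should call scheduler_step() is exactly the open problem mentioned above:

```python
import torch


class OptimizerWithScheduler:
    """Hypothetical wrapper: looks like a plain optimizer to BaseStrategy,
    but also owns a learning rate scheduler."""

    def __init__(self, optimizer: torch.optim.Optimizer, scheduler):
        self.optimizer = optimizer
        self.scheduler = scheduler

    def step(self):
        # Transparent to BaseStrategy: called wherever optimizer.step() is.
        self.optimizer.step()

    def zero_grad(self):
        self.optimizer.zero_grad()

    def scheduler_step(self):
        # Open problem: someone still has to decide when to call this
        # (after each epoch, after each experience, ...).
        self.scheduler.step()

    @property
    def param_groups(self):
        # Exposed so reset_optimizer()-style code can reset learning rates.
        return self.optimizer.param_groups
```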
If you think that any of these points is an actual issue, I'd like to submit a pull request to solve it.
Thanks @andrew-r96 for the great feedback and possible integrations. I guess @AntonioCarta is the best person to give you feedback on this. Please also refer to a related discussion on LR scheduling here: https://github.com/ContinualAI/avalanche/discussions/563
Thanks for the proposal @andrew-r96.
1. Experience reference
This is a good point. However, there are some caveats to keep in mind. In the more general setting, the scenario represents a stream. Most of these streams are created statically because this is what happens in the literature, and therefore you are able to know stream properties such as the number of classes. This is a simplification that exists in the literature but not the real world. We may have streams in the future where the data is not available in advance.
I think Avalanche should try to provide a model that is as general as possible, including not assuming that information is available when it shouldn't be (such as the number of classes). We already had some discussion on this exact topic with some other people, and they were strongly against any assumption of knowledge about future experiences. In fact, I think most people that gave feedback would like Avalanche to be much more stringent than it already is.
> Currently, to reproduce the code above it's necessary to either compute nb_cl and pass it as an argument to the StrategyPlugin constructor, or to access it inside another method such as before_training_exp(), using a condition to initialize it only once.
I think most of the time you should use before_training_exp. Have you seen the new IncrementalClassifier? You never need to know the number of classes in advance.
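For readers unfamiliar with the idea, here is a conceptual sketch of an incrementally growing classification head; this is not the Avalanche IncrementalClassifier implementation, only the underlying idea: the output layer is enlarged whenever unseen classes appear, so the total number of classes never has to be known in advance.

```python
import torch
import torch.nn as nn


class GrowingClassifier(nn.Module):
    """Conceptual sketch: a linear head that grows when new classes appear."""

    def __init__(self, in_features: int, initial_classes: int = 2):
        super().__init__()
        self.classifier = nn.Linear(in_features, initial_classes)

    def adapt(self, classes_in_this_experience):
        # Enlarge the output layer if this experience contains unseen classes.
        needed = max(classes_in_this_experience) + 1
        old = self.classifier
        if needed > old.out_features:
            new = nn.Linear(old.in_features, needed)
            with torch.no_grad():
                # Keep the weights already learned for the old classes.
                new.weight[:old.out_features] = old.weight
                new.bias[:old.out_features] = old.bias
            self.classifier = new
            # In practice the optimizer must also be updated to track
            # the new parameters.

    def forward(self, x):
        return self.classifier(x)
```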
2. Scenario reference
You are right, this is a really common use case. The scenario is used in many different strategies. If we initialize it early, this will probably also solve your problem 1.
3. before_training() behavior
That is exactly why you almost always want to call before_training_exp. I don't think many strategies really need to execute code in before_training; the callback is there mostly for metrics/evaluation.
> This could be regulated by a BaseStrategy argument.
This solution doesn't really work. If you implement a plugin with a specific behavior in mind, you can't let BaseStrategy change that behavior, or you will probably break (silently) any plugin using before_training.
4. Learning Rate Scheduling Support
See the link above and let me know if that solves your problem. ;-)
Thanks for the detailed answer! I agree that a good CL algorithm should avoid relying on information that wouldn't be available in a real-world scenario, and I'll try to remove those assumptions from my code. I feel that Avalanche's objective is to aid the development of CL algorithms, not to enforce the fairness of the scenario (as that's up to the reviewers). That said, putting effort into making it easier to access information that the algorithm shouldn't need could be a waste of time.
> This solution doesn't really work. If you implement a plugin with a specific behavior in mind, you can't let BaseStrategy change that behavior, or you will probably break (silently) any plugin using before_training.
I absolutely agree; I proposed that because it felt more explicit than the current behavior. Shouldn't the result be the same whether I execute
```python
for exp in scenario.train_stream:
    train_metrics = cl_strategy.train(exp, num_workers=4)
```
or
```python
train_metrics = cl_strategy.train(scenario.train_stream, num_workers=4)
```
?
> I feel that Avalanche's objective is to aid the development of CL algorithms, not to enforce the fairness of the scenario (as that's up to the reviewers). That said, putting effort into making it easier to access information that the algorithm shouldn't need could be a waste of time.
I agree with you on this point. My main concern is the usability of the library for its users. I think the best solution is to:
1. Provide users with the number of classes and other attributes, which may be helpful. I think we already do this.
2. Let users do whatever they want inside the strategy, including "cheating". This is often useful to simplify the implementation, or for some particular strategies (e.g. Cumulative) which have different assumptions.
3. Keep the strategies provided by Avalanche off-the-shelf as general as possible. This means that sometimes we need a little more work to support a large number of settings (multi-task & single-task, dynamic models, arbitrary mini-batch shapes, ...).
However, I think most of the time good libraries are easier to use because they force you to make good choices, even when they restrict you in ways that may not be obvious to the end user. The current plugin implementation gives you complete freedom, which means you have to make a lot of choices yourself. I think this is the main issue. That's why, instead of adding more options, I would like to drastically cut the complexity of the current API.
> Shouldn't the result be the same whether I execute ...?
It's almost never the same, because these callbacks modify the state. For example, stream-level metrics rely on before/after_training.
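As a toy illustration (a hypothetical counter, not an actual Avalanche metric, assuming the StrategyPlugin callbacks discussed above), the two calling patterns necessarily diverge for anything that accumulates state between before_training and after_training:

```python
from avalanche.training.plugins import StrategyPlugin


class StreamCounter(StrategyPlugin):
    """Hypothetical stream-level 'metric': counts experiences per train() call."""

    def before_training(self, strategy, **kwargs):
        self.count = 0  # reset once per call to cl_strategy.train(...)

    def before_training_exp(self, strategy, **kwargs):
        self.count += 1

    def after_training(self, strategy, **kwargs):
        # train(exp) in a loop   -> prints "1" after every experience
        # train(whole stream)    -> prints the stream length, once
        print(f"trained on {self.count} experience(s)")
```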
This is a big problem. The only solution I see is to completely remove the before_training and after_training callbacks. It seems that they are only causing problems; #493 is an example.