imitation icon indicating copy to clipboard operation
imitation copied to clipboard

Remove normalize_input_layer default

Open PavelCz opened this issue 2 years ago • 3 comments

Bug description

I'm trying to create a named config for a CnnRewardNet. AFAICT this is currently not possible for scripts that use the reward ingredient (such as train_preference_comparisons.py, due to this config hook.

I can only change the normalize_input_layer net_kwarg, not remove, it since it will be added back by the hook. The problem is that build_cnn in CnnRewardNet doesn't expect this net_kwarg.

I suggest either

  • removing this default value from the hook,
  • only applying it when net_cls is not image-based (?),
  • or making it a requirement, that every reward net that uses the net_kwargs accepts the normalize_input_layer net_kwarg. This should probably be documented somewhere and CnnRewardNet modified accordingly.

Steps to reproduce

This is what I'd like to put into my named config:

    reward = dict(
        net_cls=imitation.rewards.reward_nets.CnnRewardNet,
    )

However, this also doesn't work:

    reward = dict(
        net_cls=imitation.rewards.reward_nets.CnnRewardNet,
        net_kwargs={},
    )

This can be easily used in a wrapper around such as this:

import imitation.rewards.reward_nets
from imitation.scripts.config.train_preference_comparisons import (
    train_preference_comparisons_ex,
)
from imitation.scripts.train_preference_comparisons import main_console

@train_preference_comparisons_ex.named_config
def cnn():
    reward = dict(
        net_cls=imitation.rewards.reward_nets.CnnRewardNet,
        net_kwargs={},
    )
    locals()  # make flake8 happy

if __name__ == "__main__": 
    main_console()

Environment

Tested with newest commit to date 40a2a559706e50bf60d7cc388a2c36dd0d4e8619

PavelCz avatar Dec 19 '22 17:12 PavelCz

Thanks for the report. Always adding normalize_input_layer does seem like busted behavior, may not make sense for a variety of reward networks and input types.

I'd suggest modifying the config hook to only apply if the network is an instance of BasicRewardNet or BasicShapedRewardNet. It's still a bit of a gotcha for users (define a new reward network and normalization suddenly disappears), but having normalization turned on by default does seem desirable given how big of a difference it can make.

AdamGleave avatar Dec 21 '22 03:12 AdamGleave

One could also change the default here, though that wouldn't be visible in the sacred config if we remove the hook. Maybe non-trivial defaults like this should always be visible in the config?

PavelCz avatar Dec 21 '22 07:12 PavelCz

One could also change the default here, though that wouldn't be visible in the sacred config if we remove the hook. Maybe non-trivial defaults like this should always be visible in the config?

Yeah, I'd rather keep this visible in the config.

AdamGleave avatar Dec 21 '22 23:12 AdamGleave