
replace BatchNorm with LayerNorm

Open bkmi opened this issue 3 months ago • 2 comments

Is your feature request related to a problem? Please describe. BatchNorm breaks the iid assumption, because each sample's output depends on the statistics of the other samples in its batch. That assumption is at the core of all the objective functions we use in this library.

Describe the solution you'd like We should switch to using LayerNorm or GroupNorm wherever possible.

Describe alternatives you've considered Give users a choice, but this is a niche issue, and honestly I see no reason to keep BatchNorm except for legacy concerns.

Additional context Classifiers typically expose this option in the format `use_batch_norm: bool = False`; we should simply change that to `use_layer_norm`.
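To illustrate the proposed flag, here is a minimal sketch (not the actual sbi API; the class and parameter names are hypothetical) of a classifier block whose normalization is selected by a `use_layer_norm` flag, mirroring the existing `use_batch_norm` pattern:

```python
import torch
import torch.nn as nn

class MLPBlock(nn.Module):
    """Hypothetical MLP block with switchable normalization (illustration only)."""

    def __init__(self, dim: int, use_layer_norm: bool = True):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        # LayerNorm normalizes over the features of each sample independently,
        # so it does not couple samples within a batch the way BatchNorm does.
        self.norm = nn.LayerNorm(dim) if use_layer_norm else nn.Identity()
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(self.linear(x)))
```

With `use_layer_norm=True`, evaluating a single sample gives the same result as evaluating it inside any batch, which is exactly the property the objectives rely on.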

bkmi avatar Mar 19 '24 16:03 bkmi

@janfb wanted to see a side-by-side comparison of LayerNorm and BatchNorm before making the change.

bkmi avatar Mar 21 '24 15:03 bkmi

To comment on this issue: with BatchNorm one should be very careful to always feed batches whose "modes" appear in the same proportions as in the original distribution. For example, for a binary classifier, it is invalid to evaluate positive-only and negative-only batches separately during training, because it then becomes enough to identify the class of a single element in the batch to decide the class of the entire batch.
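The leakage described above can be demonstrated with a small NumPy sketch (simplified normalizations without learnable parameters, assumed here only for illustration): the same sample gets a different BatchNorm output depending on which other samples share its batch, while its LayerNorm output is unchanged.

```python
import numpy as np

def batch_norm(x):
    # Normalize each feature using statistics computed ACROSS the batch,
    # as BatchNorm does in training mode (eps for numerical stability).
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-5)

def layer_norm(x):
    # Normalize each sample using only its OWN feature statistics.
    return (x - x.mean(axis=1, keepdims=True)) / (x.std(axis=1, keepdims=True) + 1e-5)

rng = np.random.default_rng(0)
pos = rng.normal(loc=2.0, size=(4, 3))   # "positive-class" samples
neg = rng.normal(loc=-2.0, size=(4, 3))  # "negative-class" samples

# BatchNorm: the first positive sample's output changes when negatives
# join the batch -- class information leaks through the batch statistics.
bn_pos_only = batch_norm(pos)[0]
bn_mixed = batch_norm(np.vstack([pos, neg]))[0]
print(np.allclose(bn_pos_only, bn_mixed))  # False

# LayerNorm: the output for a sample is independent of its batch mates.
ln_pos_only = layer_norm(pos)[0]
ln_mixed = layer_norm(np.vstack([pos, neg]))[0]
print(np.allclose(ln_pos_only, ln_mixed))  # True
```

This is why a positive-only batch is trivially classifiable under BatchNorm: the batch statistics themselves encode the class.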

francois-rozet avatar Mar 22 '24 12:03 francois-rozet