adapter-bert
freezing "layer_norm" and "head"
Hi, could you confirm, for the adapter implementation, whether the layer norms of the original model should be unfrozen, or only the layer norms inside the adapters? What about the classifier head, does it need to stay frozen? Thanks.
Unfreeze all the layer norms (shouldn't matter too much) and the classifier head (will likely matter a lot). These correspond to a small number of parameters compared to the adapters.
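In case it helps, here is a minimal sketch of that parameter-freezing recipe. It assumes a PyTorch/Hugging Face-style BERT (not the repo's own TensorFlow code) and assumes adapter parameter names contain the substring "adapter"; the exact names depend on your adapter implementation.

```python
import torch
from transformers import BertForSequenceClassification

# Hypothetical model: a BERT classifier into which adapter modules have been
# inserted, with "adapter" appearing in their parameter names.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze everything, then unfreeze the adapters, all layer norms, and the
# classifier head, as suggested above.
for name, param in model.named_parameters():
    param.requires_grad = False
    if (
        "adapter" in name                    # adapter bottleneck weights (assumed naming)
        or "LayerNorm" in name               # layer norms, inside and outside the adapters
        or name.startswith("classifier")     # task-specific classification head
    ):
        param.requires_grad = True

# Sanity check: only a small fraction of parameters should be trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable} / {total} ({100 * trainable / total:.2f}%)")
```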
Hi Neil, thanks for the response. When I unfreeze the layer norms of the model (not the adapters), I get a 20 percent decrease in accuracy. I am not sure what the correct implementation is for the pre-norm language model case; I opened a separate issue for this. Any suggestion is appreciated. Thanks.