
Should model.norm be applied to hidden_states[early_exit_layer]?

Open • githubhyz opened this issue 2 years ago • 1 comment

https://github.com/voidism/DoLa/blob/dc88907406f9744f748f3c779f2353efd5bdc824/transformers-4.28.1/src/transformers/models/llama/modeling_llama.py#L703

I think model.norm should be applied to hidden_states[early_exit_layer], because currently only the last hidden state has model.norm applied to it. See https://github.com/voidism/DoLa/blob/dc88907406f9744f748f3c779f2353efd5bdc824/transformers-4.28.1/src/transformers/models/llama/modeling_llama.py#L594
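To make the suggestion concrete, here is a minimal, dependency-free sketch of what changes when an early-exit hidden state is normalized before the output projection. It is a toy under stated assumptions, not the repository's code: `rms_norm`, `lm_head`, `softmax`, and every number below are hypothetical stand-ins, and the norm weight is set to all ones for simplicity.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm (the normalization LLaMA uses): x / sqrt(mean(x^2) + eps) * weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]

def lm_head(x, W):
    """Project a hidden vector to vocabulary logits (one row of W per token)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy values: hidden size 4, vocabulary size 3.
norm_weight = [1.0, 1.0, 1.0, 1.0]
W = [[0.5, -0.2, 0.1, 0.0],
     [0.1, 0.3, -0.4, 0.2],
     [-0.3, 0.1, 0.2, 0.4]]

# Raw output of an intermediate layer; nothing has rescaled it yet.
early_hidden = [8.0, -6.0, 4.0, 2.0]

# Without the norm: logits come straight from the raw hidden state.
probs_raw = softmax(lm_head(early_hidden, W))

# With the norm: same projection, but on the rescaled hidden state.
probs_normed = softmax(lm_head(rms_norm(early_hidden, norm_weight), W))

print("without norm:", [round(p, 4) for p in probs_raw])
print("with norm:   ", [round(p, 4) for p in probs_normed])
```

With an all-ones weight the norm is only a positive rescaling, so the top token stays the same, but the distribution without the norm is much more peaked. Any comparison of next-token distributions across layers is therefore sensitive to whether the early layer was normalized. In the model code, the analogous change would presumably be to pass `model.norm(hidden_states[early_exit_layer])` to the output head instead of the raw hidden state, assuming the final layer's entry is already normalized as described above.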

githubhyz, Nov 16 '23 07:11

This makes sense, but when I apply your suggestion, accuracy on the GSM8K dataset goes down. I have no idea why.

pphuc25, Dec 21 '23 13:12