EAGLE
How to handle embedding layernorm
Some models apply a LayerNorm to the embedding output before passing it to the attention layers. When working with this type of model, do I need to add the embedding LayerNorm to EAGLE, or is there some other trick required to make EAGLE output the right tokens?

I also don't understand why the `-2` is needed when generating training data for Llama, or how to adapt that `-2` in my own ge_data script for a different model. So far I have tried generating data without the `-2`, and training EAGLE both with and without the embedding LayerNorm, but neither combination gives good results in parallel decoding. I'm confused. The model is BlueLM-7B-Chat. Thanks for helping me!
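For reference, here is a minimal sketch of what "adding the embedding LayerNorm to the draft model" could look like. This is an illustrative assumption, not code from the EAGLE repo: the class and parameter names (`DraftInputWithEmbedNorm`, `hidden_size`, `eps`) are hypothetical, and it only shows the input-fusion step where EAGLE concatenates the token embedding with the base model's hidden state.

```python
import torch
import torch.nn as nn

class DraftInputWithEmbedNorm(nn.Module):
    """Hypothetical sketch: apply the base model's post-embedding
    LayerNorm inside the draft model, so the draft sees the same
    normalized embeddings that the base model's attention layers see.
    All names here are illustrative, not taken from the EAGLE repo."""

    def __init__(self, vocab_size: int, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        # Mirror of the base model's embedding LayerNorm; in practice
        # its weights would be copied from the base checkpoint.
        self.embed_layernorm = nn.LayerNorm(hidden_size, eps=eps)
        # EAGLE-style fusion of token embedding and base hidden state.
        self.fc = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, input_ids: torch.Tensor,
                hidden_states: torch.Tensor) -> torch.Tensor:
        emb = self.embed_tokens(input_ids)
        emb = self.embed_layernorm(emb)  # normalize before fusing
        return self.fc(torch.cat([emb, hidden_states], dim=-1))
```

The idea is that if the base model normalizes embeddings before its first attention layer, the draft model trained on the base model's hidden states should probably consume embeddings in the same (normalized) space; whether that actually fixes the acceptance rate for BlueLM would need to be verified empirically.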