Rotary position embedding causes different outputs under different tensor parallel settings!
Thanks for your great work on LLMs. I have tried loading llama-13b with different mp (model parallel) sizes, e.g., 2 and 4. However, the output embeddings and the generated sentences change when the mp setting changes.
My question: is this normal?
(screenshot: output with mp size = 4)
(screenshot: output with mp size = 2)
With mp size = 4, the output embedding has mean -3.8359 and std 1.9458; both the mean and the std change when mp size = 2.
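Small numerical differences across tensor-parallel sizes are generally expected: changing the number of shards changes the order in which partial results are reduced, and floating-point addition is not associative. A minimal sketch (my own illustration, not code from this repo) of the effect, splitting one dot product into 2 vs. 4 partitions the way tensor parallelism splits the contraction dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4096).astype(np.float32)
w = rng.standard_normal(4096).astype(np.float32)

def sharded_dot(x, w, parts):
    # Split the contraction dimension into `parts` shards and sum the
    # partial dot products, mimicking the all-reduce over tensor-parallel
    # ranks. Different `parts` => different accumulation order.
    idx = np.array_split(np.arange(x.size), parts)
    return np.float32(sum(np.dot(x[i], w[i]) for i in idx))

d2 = sharded_dot(x, w, 2)
d4 = sharded_dot(x, w, 4)
# The two results are close but may differ in the last few bits,
# which compounds over many layers and can flip sampled tokens.
print(d2, d4, abs(float(d2) - float(d4)))
```

Per-layer discrepancies of this size are harmless on their own, but over dozens of transformer layers they can accumulate enough to change greedy or sampled generations, so differing outputs across mp sizes do not by themselves indicate a bug.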