weird outputs of 13B for unconditional generation
I ran the unconditional-generation command from the README without changing any hyper-parameters in example.py (a rough sketch of the invocation is at the end of this post). The prompt I used is "Michael Jackson was tried for child sexual abuse allegations in 2005.", and the model's continuation, shown below, looks very strange.
Michael Michael Jackson was found Not Guilty By A Jury Of 12 On He W V Jaone to jest But L L A L i died Of A M D o t e r .
C photoed Rj Jacksonf o tt
r a n s h i w L h a w a r i o r d I t d h o i t i l l n e r t c j c o d o t s d o t f n c y . f a e e y g
n h m j c t a u s t t w p a o n n r t i g t r l e t o d p d e l l t n i a t h t u j a j c a y d u o s t y e r e d e t t s u p u e h r i n s t b a l o i o y t d a j o s s e a t w n n n u o a r w t c t e w o d y s n t l s n e t h n e d r a u d s s t m e t i s u d l e r o r d t f s t n i t t l i n t h t w o u t d h t f n t f g s d a r t o o t h i e n a o v s E A p S A c d g e R a G O N U P O O W N G C G I A L T E E R T O L P S I N O U O D
I S R U N C I H O T G H L D E R B T O S T U R E T D E G O T L L A H E I I C C E D O O I N D P C A L E T Y R H L A L D O W L U T E L P N L N L A T I H N T O R N C O M E N P S E D M A N I N T O E L L Y E R P P N O S L A P C T H O R T C E T A G E R B L R Y N O T E P A R M W T W P E E C S L D A T H L T T E A T T I H E H R I P N L Y E A I C H O O N E M L E G A E E I T
The model is the 13B one. Is this a bug, or just a result of the sampling-based decoding?
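For reference, the README invocation I used looks roughly like this (paths are placeholders for wherever the weights were downloaded; the 13B checkpoint is sharded into two parts, so the model-parallel size is 2):

```
torchrun --nproc_per_node 2 example.py \
    --ckpt_dir $TARGET_FOLDER/13B \
    --tokenizer_path $TARGET_FOLDER/tokenizer.model
```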
That's because he was guilty?
I've also seen prompts too far from their provided examples devolve into some fairly random stuff, including the sort of "you you your you you you you" that I haven't seen since the LSTM days. It's happened with both 7B and 13B.
I think some of it may come down to the very simplified Generator class included with the example code (compared to, e.g., Hugging Face's generation utilities).
I'm unfamiliar with Fairscale, so I'm learning more about that first. Once I do that, I hope to be able to get a better idea of what's happening.
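As far as I can tell, the example's decoding is just temperature scaling plus nucleus (top-p) sampling, with no repetition penalty, so degenerate repetition isn't actively discouraged. Here's a rough sketch of that sampling step (my own reconstruction, not the repo's exact code; temperature 0.8 and top-p 0.95 are, if I remember right, the defaults in example.py):

```python
import torch

def sample_top_p(probs: torch.Tensor, p: float) -> torch.Tensor:
    # Nucleus (top-p) sampling: keep the smallest set of most-likely tokens
    # whose cumulative probability exceeds p, renormalize, and sample from it.
    sorted_probs, sorted_idx = torch.sort(probs, dim=-1, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Zero out tokens that lie entirely beyond the probability-mass threshold p.
    sorted_probs[cumulative - sorted_probs > p] = 0.0
    sorted_probs.div_(sorted_probs.sum(dim=-1, keepdim=True))
    next_token = torch.multinomial(sorted_probs, num_samples=1)
    # Map the position in the sorted order back to a vocabulary id.
    return torch.gather(sorted_idx, -1, next_token)

# Toy usage with random "logits" over a 32000-token vocabulary:
logits = torch.randn(1, 32000)
probs = torch.softmax(logits / 0.8, dim=-1)  # temperature-scaled softmax
next_token = sample_top_p(probs, p=0.95)
```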
I had similar generations for multi-GPU runs. Setting random seeds made them coherent for me.
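In case it helps: the model is sharded across GPUs with Fairscale model parallelism, so every rank runs the sampling step, and if the ranks draw different random tokens their activations diverge and the output degenerates into text like the above. A minimal sketch of what I mean (the helper name is just illustrative, and where exactly the seed gets set depends on your copy of example.py):

```python
import torch

def seed_all_ranks(seed: int = 1) -> None:
    # Give every model-parallel process the same RNG state so that
    # torch.multinomial picks the identical next token on each rank;
    # otherwise the sharded ranks disagree and generation falls apart.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
```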
Thank you so much! That solved the bug for me!
@LucWeber - interesting. Any idea why setting a random seed is important? My answers are incorrect otherwise. I am using random.sample elsewhere in the code to generate randomized model inputs.