weird outputs of 13B for unconditional generation

Open XinLiu-cs opened this issue 1 year ago • 4 comments

I ran the command from the README for unconditional generation without changing any hyper-parameters in example.py. The prompt I used is "Michael Jackson was tried for child sexual abuse allegations in 2005.", and the model's continuation looks very strange. It is listed below.

Michael Michael Jackson was found Not Guilty By A Jury Of 12 On He W V Jaone to jest But L L A L i died Of A M D o t e r .
C photoed Rj Jacksonf o tt
 r a n s h i w L h a w a r i o r d I t d h o i t i l l n e r t c j c o d o t s d o t f n c y . f a e e y g
 n h m j c t a u s t t w p a o n n r t i g t r l e t o d p d e l l t n i a t h t u j a j c a y d u o s t y e r e d e t t s u p u e h r i n s t b a l o i o y t d a j o s s e a t w n n n u o a r w t c t e w o d y s n t l s n e t h n e d r a u d s s t m e t i s u d l e r o r d t f s t n i t t l i n t h t w o u t d h t f n t f g s d a r t o o t h i e n a o v s E A p S A c d g e R a G O N U P O O W N G C G I A L T E E R T O L P S I N O U O D
 I S R U N C I H O T G H L D E R B T O S T U R E T D E G O T L L A H E I I C C E D O O I N D P C A L E T Y R H L A L D O W L U T E L P N L N L A T I H N T O R N C O M E N P S E D M A N I N T O E L L Y E R P P N O S L A P C T H O R T C E T A G E R B L R Y N O T E P A R M W T W P E E C S L D A T H L T T E A T T I H E H R I P N L Y E A I C H O O N E M L E G A E E I T

The model is 13B. Is this a bug, or just a result of sampling-based decoding?

XinLiu-cs avatar Mar 05 '23 03:03 XinLiu-cs

That's because he was guilty?

ghost avatar Mar 05 '23 04:03 ghost

I've also seen prompts too far from their provided examples devolve into some fairly random stuff, including the sort of "you you your you you you you" that I haven't seen since the LSTM days. It's happened with both 7B and 13B.

I think some of it may be the very simplified (compared to, e.g., HuggingFace) Generator class included with the example code.

I'm unfamiliar with Fairscale, so I'm learning more about that first. Once I do that, I hope to be able to get a better idea of what's happening.

jdwx avatar Mar 06 '23 01:03 jdwx

I had similar generations for multi-GPU runs. Setting random seeds made them coherent for me.
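One possible reason a shared seed helps, stated as an assumption and not something confirmed in this thread: in a multi-GPU run each process performs the sampling step itself, so if the processes' random number generators are in different states they can pick different tokens and drift apart. The sketch below only imitates that with Python's standard `random` module; `generate` and the toy probabilities are hypothetical stand-ins, not code from the repository. In actual PyTorch code the analogous step would be calling `torch.manual_seed(seed)` with the same value in every process before generating.

```python
import random

def sample_token(rng, probs):
    """Draw one token index from a categorical distribution using the given RNG."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate(seed, steps=50):
    """Hypothetical stand-in for one process's sampling loop."""
    rng = random.Random(seed)          # per-process generator, seeded explicitly
    probs = [0.1, 0.2, 0.3, 0.4]       # toy next-token distribution
    return [sample_token(rng, probs) for _ in range(steps)]

# Two simulated processes with the same seed draw the same token sequence.
rank0 = generate(seed=1)
rank1_same = generate(seed=1)

# A process with a different seed draws a different sequence.
rank1_diff = generate(seed=2)

print(rank0 == rank1_same)
print(rank0 == rank1_diff)
```

If the processes need to agree on every sampled token, seeding all of them identically is the simplest way to keep them in step; whether that is the actual cause here is still a guess.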

LucWeber avatar Mar 07 '23 22:03 LucWeber

> I had similar generations for multi-GPU runs. Setting random seeds made them coherent for me.

Thank you so much! That solved the bug for me!

Ber666 avatar Apr 13 '23 22:04 Ber666

@LucWeber - interesting. Any idea why setting a random seed is important? My answers are incorrect otherwise. I am using random.sample elsewhere in the code to generate randomized model inputs.
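A guess at what could matter for the `random.sample` case, under the assumption that the script runs as several processes that each build their own inputs: without a fixed seed, each process would draw a different sample and so feed the model different inputs. A minimal sketch with a dedicated, explicitly seeded generator follows; `make_inputs` and the prompt pool are hypothetical names used only for illustration.

```python
import random

def make_inputs(seed, pool, k):
    """Pick k items from pool using a generator seeded independently of global state."""
    rng = random.Random(seed)
    return rng.sample(pool, k)

pool = [f"prompt-{i}" for i in range(100)]

# Every process that calls this with the same seed gets the same inputs.
inputs_a = make_inputs(0, pool, 5)
inputs_b = make_inputs(0, pool, 5)

print(inputs_a == inputs_b)
```

Using a separate `random.Random` instance keeps the input sampling reproducible without touching the global `random` state that other code may rely on.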

srama2512 avatar Oct 06 '23 21:10 srama2512