weird outputs of 13B for unconditional generation
I ran the unconditional-generation command from the README without changing any hyper-parameters in example.py (a rough sketch of the invocation is at the end of this post). The prompt I used is "Michael Jackson was tried for child sexual abuse allegations in 2005.", and the model's continuation, shown below, looks very strange.
Michael Michael Jackson was found Not Guilty By A Jury Of 12 On He W V Jaone to jest But L L A L i died Of A M D o t e r .
C photoed Rj Jacksonf o tt
r a n s h i w L h a w a r i o r d I t d h o i t i l l n e r t c j c o d o t s d o t f n c y . f a e e y g
n h m j c t a u s t t w p a o n n r t i g t r l e t o d p d e l l t n i a t h t u j a j c a y d u o s t y e r e d e t t s u p u e h r i n s t b a l o i o y t d a j o s s e a t w n n n u o a r w t c t e w o d y s n t l s n e t h n e d r a u d s s t m e t i s u d l e r o r d t f s t n i t t l i n t h t w o u t d h t f n t f g s d a r t o o t h i e n a o v s E A p S A c d g e R a G O N U P O O W N G C G I A L T E E R T O L P S I N O U O D
I S R U N C I H O T G H L D E R B T O S T U R E T D E G O T L L A H E I I C C E D O O I N D P C A L E T Y R H L A L D O W L U T E L P N L N L A T I H N T O R N C O M E N P S E D M A N I N T O E L L Y E R P P N O S L A P C T H O R T C E T A G E R B L R Y N O T E P A R M W T W P E E C S L D A T H L T T E A T T I H E H R I P N L Y E A I C H O O N E M L E G A E E I T
The model is the 13B one. Is this a bug, or just a result of the sampling-based decoding?
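For reference, the README invocation I used looks roughly like this (paths are placeholders for wherever the weights were downloaded; the 13B checkpoint is sharded into two parts, so the model-parallel size is 2):

```
torchrun --nproc_per_node 2 example.py \
    --ckpt_dir $TARGET_FOLDER/13B \
    --tokenizer_path $TARGET_FOLDER/tokenizer.model
```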
That's because he was guilty?
I've also seen prompts too far from their provided examples devolve into some fairly random stuff, including the sort of "you you your you you you you" that I haven't seen since the LSTM days. It's happened with both 7B and 13B.
I think some of it may come down to the very simplified Generator class included with the example code (compared to, e.g., Hugging Face's generation utilities).
I'm unfamiliar with Fairscale, so I'm learning more about that first. Once I do that, I hope to be able to get a better idea of what's happening.
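As far as I can tell, the example's decoding is just temperature scaling plus nucleus (top-p) sampling, with no repetition penalty, so degenerate repetition isn't actively discouraged. Here's a rough sketch of that sampling step (my own reconstruction, not the repo's exact code; temperature 0.8 and top-p 0.95 are, if I remember right, the defaults in example.py):

```python
import torch

def sample_top_p(probs: torch.Tensor, p: float) -> torch.Tensor:
    # Nucleus (top-p) sampling: keep the smallest set of most-likely tokens
    # whose cumulative probability exceeds p, renormalize, and sample from it.
    sorted_probs, sorted_idx = torch.sort(probs, dim=-1, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Zero out tokens that lie entirely beyond the probability-mass threshold p.
    sorted_probs[cumulative - sorted_probs > p] = 0.0
    sorted_probs.div_(sorted_probs.sum(dim=-1, keepdim=True))
    next_token = torch.multinomial(sorted_probs, num_samples=1)
    # Map the position in the sorted order back to a vocabulary id.
    return torch.gather(sorted_idx, -1, next_token)

# Toy usage with random "logits" over a 32000-token vocabulary:
logits = torch.randn(1, 32000)
probs = torch.softmax(logits / 0.8, dim=-1)  # temperature-scaled softmax
next_token = sample_top_p(probs, p=0.95)
```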
I had similar generations for multi-GPU runs. Setting random seeds made them coherent for me.
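In case it helps: the model is sharded across GPUs with Fairscale model parallelism, so every rank runs the sampling step, and if the ranks draw different random tokens their activations diverge and the output degenerates into text like the above. A minimal sketch of what I mean (the helper name is just illustrative, and where exactly the seed gets set depends on your copy of example.py):

```python
import torch

def seed_all_ranks(seed: int = 1) -> None:
    # Give every model-parallel process the same RNG state so that
    # torch.multinomial picks the identical next token on each rank;
    # otherwise the sharded ranks disagree and generation falls apart.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
```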
Thank you so much! That solved the bug for me!
@LucWeber - interesting. Any idea why setting a random seed is important? My answers are incorrect otherwise. I am using random.sample elsewhere in the code to generate randomized model inputs.