Text generation with a miniature GPT: can't deal with newlines?

Open notnot opened this issue 1 year ago • 0 comments

I'm seeing poor results when using this code with text data other than the default IMDb dataset. It seems that the model can't deal with newlines properly: after outputting a newline, it gets stuck generating token 0, which is the empty string ''. In the IMDb case the lines are very long, which is probably why it works reasonably well. But even there, if you increase the number of words to generate to, say, 100, the transformer runs into a lot of problems, almost always getting stuck in repetitive loops.
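
One possible explanation, which I have not confirmed, is that each line of the input file becomes its own training sample and is padded with token 0 up to a fixed length. With short lines, most of every sample would then be padding, so the model would learn that a line ending is followed by 0s. The plain-Python sketch below only illustrates that effect on toy data. It does not use the Keras example's code, and `NEWLINE`, `encode_per_line`, `encode_as_stream` and `pad_fraction` are hypothetical names invented here. It compares per-line padding with joining all lines into one token stream that carries an explicit newline token.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the tokenization code used by the Keras example.
NEWLINE = "<nl>"  # assumed explicit newline token


def build_vocab(lines):
    # Index 0 is the padding token (empty string), index 1 is for unknown words.
    vocab = ["", "[UNK]", NEWLINE]
    for line in lines:
        for word in line.split():
            if word not in vocab:
                vocab.append(word)
    return vocab


def encode_per_line(lines, vocab, maxlen):
    # One sample per line, truncated to maxlen and right-padded with 0.
    index = {w: i for i, w in enumerate(vocab)}
    samples = []
    for line in lines:
        ids = [index.get(w, 1) for w in line.split()][:maxlen]
        samples.append(ids + [0] * (maxlen - len(ids)))
    return samples


def encode_as_stream(lines, vocab, maxlen):
    # Join all lines into one token stream, marking each line end with
    # NEWLINE, then cut the stream into full windows of maxlen tokens.
    index = {w: i for i, w in enumerate(vocab)}
    stream = []
    for line in lines:
        stream.extend(index.get(w, 1) for w in line.split())
        stream.append(index[NEWLINE])
    return [stream[i:i + maxlen] for i in range(0, len(stream) - maxlen + 1, maxlen)]


def pad_fraction(samples):
    # Share of all token positions that hold the padding token 0.
    total = sum(len(s) for s in samples)
    return sum(s.count(0) for s in samples) / total


lines = ["the cat sat", "on the mat", "and slept"]
vocab = build_vocab(lines)
per_line = encode_per_line(lines, vocab, maxlen=8)
stream = encode_as_stream(lines, vocab, maxlen=4)

print("padding share, one sample per line:", pad_fraction(per_line))
print("padding share, continuous stream:  ", pad_fraction(stream))
```

If this is indeed the cause, feeding the model windows cut from a continuous stream (or masking padded positions in the loss) might reduce the runs of token 0, but I have not tested that with the actual example.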

notnot, Feb 23 '23 00:02