
gpt-tfjs only repeats the last prompt token

Open • JulienVig opened this issue 1 year ago • 0 comments

Here are some issues with gpt-tfjs I noted while implementing tokenization:

  • [x] There is a memory leak in the training loop. Memory usage grows slowly (~0.01 MB per iteration), but the number of tensors keeps growing (+14 new tensors allocated per dataset batch)
  • [x] A trained model (e.g. on wikitext after more than 1000 iterations, with validation perplexity < 4) almost always repeats the last prompt token
  • [x] Create a test case for the wikitext task
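
For the tensor leak in the first item, a rough way to reproduce the "+14 tensors per batch" observation is to compare the live tensor count before and after each training step. The sketch below is only an illustration: `tensorGrowthPerStep` is a hypothetical helper, not part of gpt-tfjs or TensorFlow.js, and the counter is passed in as a function so the snippet stays self-contained. With TensorFlow.js, the counter would presumably be `() => tf.memory().numTensors`, and `step` is assumed to be one synchronous training step on a single batch.

```typescript
// Hypothetical helper (not part of gpt-tfjs or TensorFlow.js).
// `countTensors` is injected so the check does not depend on a backend;
// with TensorFlow.js one could pass `() => tf.memory().numTensors`.
function tensorGrowthPerStep(
  step: () => void,          // assumed: runs one training step on one batch
  countTensors: () => number, // returns the current number of live tensors
  iterations: number = 5,
): number[] {
  const deltas: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const before = countTensors();
    step();
    // Net number of tensors still alive after this step.
    deltas.push(countTensors() - before);
  }
  return deltas;
}

// Usage with a fake counter that mimics a step leaking 14 tensors.
let liveTensors = 0;
const leakyStep = (): void => {
  liveTensors += 14;
};
const growth = tensorGrowthPerStep(leakyStep, () => liveTensors, 3);
console.log(growth);
```

A leak-free step should report a delta of 0 once training has warmed up. The first iteration may legitimately allocate long-lived tensors (for example optimizer state), so it is probably worth ignoring the first delta when turning this into a test.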

JulienVig • Apr 03 '24 11:04