gpt-tfjs only repeats the last prompt token
Here are some issues with gpt-tfjs I noted while implementing tokenization:
- [x] There is a memory leak in the training loop. Memory usage grows slowly (~0.01 MB per iteration), but the number of tensors keeps growing (+14 new tensors allocated per dataset batch)
- [x] A trained model (e.g. on wikitext, after more than 1000 iterations and with validation perplexity below 4) almost always repeats the last prompt token during generation
- [x] Create a test case for the wikitext task
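
For reference, the first two items could be guarded by checks along these lines. This is only a sketch, not the gpt-tfjs code: `tensorsLeakedPerStep` and `repeatsLastPromptToken` are hypothetical helper names, and the tensor counter is passed in as a function so the snippet has no dependencies. With TensorFlow.js the counter would presumably be `() => tf.memory().numTensors`.

```typescript
// Hypothetical helper: average number of tensors left alive per call to `step`.
// `getTensorCount` is injected by the caller; with TensorFlow.js it would be
// `() => tf.memory().numTensors` (assumption: tfjs is the backend in use).
async function tensorsLeakedPerStep(
  step: () => void | Promise<void>,
  getTensorCount: () => number,
  iterations: number = 10
): Promise<number> {
  if (iterations <= 0) {
    throw new RangeError("iterations must be positive");
  }
  // Warm-up step so one-time allocations (weights, optimizer state) are not counted.
  await step();
  const before = getTensorCount();
  for (let i = 0; i < iterations; i++) {
    await step();
  }
  return (getTensorCount() - before) / iterations;
}

// Hypothetical check for the repetition symptom: true when every generated
// token id equals the last token id of the prompt.
function repeatsLastPromptToken(
  prompt: readonly number[],
  generated: readonly number[]
): boolean {
  if (prompt.length === 0 || generated.length === 0) {
    return false;
  }
  const last = prompt[prompt.length - 1];
  return generated.every((token) => token === last);
}
```

A test for the training loop could then assert that `tensorsLeakedPerStep` returns 0 for one batch step, and a test for the wikitext task could assert that `repeatsLastPromptToken` is false for a handful of prompts.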