Fix and rework GPT-TF.js
Addresses #654
- Fix weight initialization: use random uniform values instead of zeros
- Implement weight sharing between token embeddings and the language modeling head
- Improve generation with a top-k sampling option
- Add seed for deterministic runs
- Load text by byte chunk rather than by line, which removes the need to pad each line to the context length
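
For reviewers unfamiliar with seeded top-k sampling, here is a minimal standalone sketch of the idea in plain TypeScript. It is not the code in this PR: `mulberry32` and `sampleTopK` are hypothetical names, the PRNG is a stand-in for whatever seeding mechanism the implementation actually uses, and the logits are plain arrays rather than tensors.

```typescript
// Hypothetical seeded PRNG (mulberry32). Returns floats in [0, 1).
// Assumption: any deterministic generator would serve the same purpose.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: sample a token index from the k highest logits.
function sampleTopK(logits: number[], k: number, rand: () => number): number {
  if (k < 1 || logits.length === 0) {
    throw new Error("k must be >= 1 and logits must be non-empty");
  }

  // Keep only the k largest logits, remembering their original indices.
  const kept = logits
    .map((logit, index) => ({ logit, index }))
    .sort((x, y) => y.logit - x.logit)
    .slice(0, Math.min(k, logits.length));

  // Unnormalized softmax weights; subtracting the max avoids overflow.
  const max = Math.max(...kept.map((c) => c.logit));
  const candidates = kept.map((c) => ({
    index: c.index,
    weight: Math.exp(c.logit - max),
  }));
  const total = candidates.reduce((sum, c) => sum + c.weight, 0);

  // Walk the cumulative weights until the random draw is used up.
  let r = rand() * total;
  let last = -1;
  for (const c of candidates) {
    last = c.index;
    r -= c.weight;
    if (r < 0) return c.index;
  }
  return last; // Fallback for floating-point rounding.
}

// Usage: the same seed yields the same sequence of sampled tokens.
const rand = mulberry32(42);
const nextToken = sampleTopK([0.1, 2.0, -1.0, 1.5], 2, rand);
console.log(nextToken);
```

With `k = 1` this reduces to greedy decoding, and larger `k` trades determinism of the choice for diversity while still excluding the low-probability tail.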