Nathan Tam
**The bug** Some special tokens have IDs that fall outside the vocabulary size in transformers. This can happen with fine-tuned models that add extra special tokens to the original...
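A minimal sketch of how such a mismatch can be detected, assuming a hypothetical checkpoint name (`some-org/fine-tuned-model`); any model whose tokenizer gained special tokens after pretraining can exhibit it:

```python
from transformers import AutoConfig, AutoTokenizer

model_name = "some-org/fine-tuned-model"  # hypothetical checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
config = AutoConfig.from_pretrained(model_name)

# Special token IDs assigned past the end of the embedding table will
# index out of bounds at inference time unless the embeddings are resized.
out_of_range = {
    tok: tok_id
    for tok, tok_id in tokenizer.get_added_vocab().items()
    if tok_id >= config.vocab_size
}
print(out_of_range)  # non-empty dict means the bug can trigger
```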
### Your current environment

```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC...
```
1. Add two new data classes, CacheHistory and StepOutput, for storing the cache history along with the token history (see the sketch after this list)
2. Add the option to return the cache in "generate" and "stream_generate"...
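A minimal sketch of what the two proposed data classes might look like; the field names and types here are assumptions based on the description above, not the final API:

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class CacheHistory:
    """Key/value cache snapshots aligned with the generated token history."""
    cache: List[Any] = field(default_factory=list)       # per-layer KV cache objects (assumed shape)
    token_ids: List[int] = field(default_factory=list)   # tokens the cache entries correspond to


@dataclass
class StepOutput:
    """Per-step result yielded by generate/stream_generate when cache return is enabled."""
    token: int                                    # token sampled at this step
    cache_history: Optional[CacheHistory] = None  # populated only when cache return is requested
```

Keeping the cache snapshot and the token IDs in one object means a caller can resume or fork generation from any recorded step without re-running the prompt.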