jetstream-pytorch
Issues with prefill & generate
As reported by @tengomucho
Currently there are a few issues with the prefill / generate implementation:
- Prefill does not use `self._sample` to do sampling.
- Prefill returns a token, so the first call to generate should return the second generated token, but it currently returns the first token again. This behavior is historical but quite unintuitive.
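
The expected behavior described in the two points above can be sketched with a toy engine. This is only an illustration under stated assumptions, not the jetstream-pytorch implementation: `ToyEngine`, `DecodeState`, `toy_logits`, and `greedy` are hypothetical names invented for this example, and only the `_sample` attribute name is taken from the issue. The sketch routes both prefill and generate through the same `self._sample`, and has prefill record its token in the state so that the first generate call continues from it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


def toy_logits(tokens: List[int], vocab_size: int = 8) -> List[float]:
    """Hypothetical stand-in for a model forward pass.

    The favoured next token is always (last token + 1) mod vocab_size.
    """
    favoured = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == favoured else 0.0 for i in range(vocab_size)]


def greedy(logits: List[float]) -> int:
    """Pick the index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


@dataclass
class DecodeState:
    # Prompt plus every token sampled so far (hypothetical state layout).
    tokens: List[int]


class ToyEngine:
    """Hypothetical engine; not the real jetstream-pytorch Engine class."""

    def __init__(self, sample: Callable[[List[float]], int]):
        self._sample = sample

    def prefill(self, prompt: List[int]) -> Tuple[DecodeState, int]:
        # Prefill samples through the same self._sample as generate.
        first = self._sample(toy_logits(prompt))
        # The sampled token is stored, so generate continues after it.
        return DecodeState(tokens=list(prompt) + [first]), first

    def generate(self, state: DecodeState) -> Tuple[DecodeState, int]:
        nxt = self._sample(toy_logits(state.tokens))
        return DecodeState(tokens=state.tokens + [nxt]), nxt


engine = ToyEngine(sample=greedy)
state, first = engine.prefill([3])
state, second = engine.generate(state)
print(first, second)
```

With this layout the first generate call yields the second token of the sequence rather than repeating the token that prefill already returned, and swapping the sampling strategy only requires passing a different `sample` callable.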