gpt-fast

Mistake on line 191 of generate.py when is_speculative=True?

Open deafTim opened this issue 1 year ago • 1 comment

Am I right that there is a mistake on line 191 of generate.py? For batch > 1, next_token will have more than 1 element, so next_token.view(()) will raise an error.

if is_speculative:
    input_pos = input_pos.item()  # for speculative decoding easier to keep on host
    while input_pos < T_new - 1:
        cur_token = next_token.view(())

        next_tokens = speculative_decode(
            model, draft_model, cur_token, input_pos, speculate_k, **sampling_kwargs
        )
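
To illustrate the concern, here is a minimal standalone sketch (not taken from generate.py). It assumes next_token has shape (batch_size, 1), which is a guess on my part; the tensors and the to_scalar_token helper are hypothetical. It shows that view(()) only succeeds when the tensor holds exactly one element.

```python
import torch


def to_scalar_token(next_token: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: same result as next_token.view(()), with an explicit size check."""
    if next_token.numel() != 1:
        raise ValueError(
            f"expected exactly 1 token, got shape {tuple(next_token.shape)}"
        )
    return next_token.view(())


# Assumed shapes: (batch_size, 1). These values are made up for the example.
single = torch.tensor([[42]])        # batch size 1
batched = torch.tensor([[42], [7]])  # batch size 2

# One element: reshaping to a 0-dim tensor works.
print(single.view(()).shape)

# Two elements: view(()) raises a RuntimeError about an invalid shape.
view_failed = False
try:
    batched.view(())
except RuntimeError as err:
    view_failed = True
    print("view(()) failed:", err)
```

If that shape assumption holds, the speculative branch as written would only work for batch size 1.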

deafTim, Oct 23 '24 16:10