gpt-fast
                                
Possible mistake at line 191 of generate.py when is_speculative=True?
Am I right that there is a mistake at line 191 of generate.py? For batch size > 1, next_token will have more than one element, so next_token.view(()) will raise an error.
if is_speculative:
    input_pos = input_pos.item()  # for speculative decoding easier to keep on host
    while input_pos < T_new - 1:
        cur_token = next_token.view(())
        next_tokens = speculative_decode(
            model, draft_model, cur_token, input_pos, speculate_k, **sampling_kwargs
        )
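
Here is a minimal sketch of the behavior I mean. It does not use generate.py itself: the tensors below are hypothetical stand-ins for next_token, and the shape (batch, 1) is my assumption. The helper can_view_as_scalar is made up for this illustration.

```python
import torch


def can_view_as_scalar(t: torch.Tensor) -> bool:
    """Return True if t.view(()) succeeds, i.e. t holds exactly one element."""
    try:
        t.view(())  # reshape to a 0-dimensional tensor
        return True
    except RuntimeError:
        # view(()) is only valid when the tensor has exactly one element
        return False


# Hypothetical stand-ins for next_token (assumed shape: batch x 1)
single = torch.tensor([[5]])        # batch size 1
batched = torch.tensor([[5], [7]])  # batch size 2

print(can_view_as_scalar(single))
print(can_view_as_scalar(batched))
```

If this matches what happens at line 191, the speculative path as written would only work for batch size 1.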