
lookahead with do_sample=True does not take temperature, top_k, top_p

Open learning-chip opened this issue 1 year ago • 2 comments

Here lookahead_generation doesn't take logits_warper as input:

https://github.com/alipay/PainlessInferenceAcceleration/blob/8015f12f7fe32acc102bb3eb51c4f8b3a420e79c/pia/lookahead/common/pretrained_model_batch.py#L426-L439

In the original sample method, logits_warper is used to modify next_tokens_scores:

https://github.com/alipay/PainlessInferenceAcceleration/blob/8015f12f7fe32acc102bb3eb51c4f8b3a420e79c/pia/lookahead/common/pretrained_model_batch.py#L474-L486

The warper modifies the logits according to temperature, top_k, top_p, and similar settings:

        if generation_config.temperature is not None and generation_config.temperature != 1.0:
            warpers.append(TemperatureLogitsWarper(generation_config.temperature))
        if generation_config.top_k is not None and generation_config.top_k != 0:
            warpers.append(TopKLogitsWarper(top_k=generation_config.top_k, min_tokens_to_keep=min_tokens_to_keep))
        if generation_config.top_p is not None and generation_config.top_p < 1.0:
            warpers.append(TopPLogitsWarper(top_p=generation_config.top_p, min_tokens_to_keep=min_tokens_to_keep))

https://github.com/huggingface/transformers/blob/09f9f566de83eef1f13ee83b5a1bbeebde5c80c1/src/transformers/generation/utils.py#L728-L733

None of this is applied inside lookahead_generation, so with do_sample=True the temperature is effectively always 1.0, and top_k and top_p are ignored.
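To make the effect concrete, here is a minimal pure-Python sketch of what the three warpers do to a single row of logits before sampling. It is not the transformers or PIA implementation: the helper names (`apply_temperature`, `apply_top_k`, `apply_top_p`, `warp_scores`) are hypothetical, the top_p rule is a simplified "keep the smallest set reaching top_p" version, and min_tokens_to_keep is omitted. Only the gating conditions mirror the snippet quoted above.

```python
import math

NEG_INF = float("-inf")


def softmax(scores):
    """Numerically stable softmax; masked (-inf) entries get probability 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_temperature(scores, temperature):
    """Divide every logit by the temperature (values below 1 sharpen the distribution)."""
    return [s / temperature for s in scores]


def apply_top_k(scores, top_k):
    """Keep the top_k largest logits (ties included) and mask the rest with -inf."""
    top_k = min(top_k, len(scores))
    threshold = sorted(scores, reverse=True)[top_k - 1]
    return [s if s >= threshold else NEG_INF for s in scores]


def apply_top_p(scores, top_p):
    """Keep the smallest set of most likely tokens whose probability mass reaches top_p."""
    probs = softmax(scores)
    order = sorted(range(len(scores)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return [s if i in kept else NEG_INF for i, s in enumerate(scores)]


def warp_scores(scores, temperature=None, top_k=None, top_p=None):
    """Apply the warpers in order, using the same conditions as the quoted snippet."""
    if temperature is not None and temperature != 1.0:
        scores = apply_temperature(scores, temperature)
    if top_k is not None and top_k != 0:
        scores = apply_top_k(scores, top_k)
    if top_p is not None and top_p < 1.0:
        scores = apply_top_p(scores, top_p)
    return scores


# Toy logits for a 4-token vocabulary (made-up values for illustration).
logits = [2.0, 1.0, 0.0, -1.0]
warped = warp_scores(logits, temperature=0.5, top_k=3, top_p=0.9)
print(softmax(logits))  # distribution sampled from when no warper is applied
print(softmax(warped))  # distribution after temperature, top_k and top_p
```

If the sampling step skips this warping, it always draws from the first distribution regardless of what the generation config says, which is the behavior described above.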

learning-chip avatar Apr 23 '24 18:04 learning-chip

Thank you, we will fix it soon.

zheyishine avatar May 05 '24 08:05 zheyishine

@zheyishine Hi, has there been any progress on this?

snippetzero avatar Jun 05 '24 03:06 snippetzero