
StoppingCriteria for Repetition

Open Vipitis opened this issue 1 year ago • 3 comments

Feature request

Similar to repetition_penalty in the generation config, but as a stopping criterion.

Motivation

(Small?) models tend to generate endless loops of the same few tokens, or combinations where only a single digit increases each time. (I could not find any similar feature requests.)

I run into this quite a lot when doing evaluation runs (with greedy decoding) for code completion tasks. Here is a screenshot of multiple generations saved to a file; the blocks of repetition can easily be spotted. [screenshot]

Having a stopping criterion that detects such behaviour would massively speed up evaluation runs, since generation could stop early instead of running to the max_new_tokens limit. It might be helpful to expose a few parameters, for example the number of repetitions and the n-gram overlap. A rough sketch of what that could look like follows.
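A minimal sketch written against the public `StoppingCriteria` interface; the class name and both parameters are hypothetical, nothing like this ships with transformers yet. Note that recent transformers versions expect `__call__` to return one boolean per sequence in the batch, while older versions expected a plain `bool`.

```python
import torch
from transformers import StoppingCriteria


class RepeatedNgramStoppingCriteria(StoppingCriteria):
    """Stop once the last `max_repeats` n-grams of size `ngram_size` are identical."""

    def __init__(self, ngram_size: int = 8, max_repeats: int = 4):
        self.ngram_size = ngram_size
        self.max_repeats = max_repeats

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> torch.BoolTensor:
        batch_size, seq_len = input_ids.shape
        window = self.ngram_size * self.max_repeats
        done = torch.zeros(batch_size, dtype=torch.bool, device=input_ids.device)
        if seq_len < window:
            return done
        # View the tail as `max_repeats` consecutive n-grams and compare
        # each of them to the first one.
        tail = input_ids[:, -window:].reshape(batch_size, self.max_repeats, self.ngram_size)
        return (tail == tail[:, :1, :]).all(dim=2).all(dim=1)
```

It would be used like any other criterion (`model` and `inputs` are placeholders):

```python
from transformers import StoppingCriteriaList

stopping = StoppingCriteriaList([RepeatedNgramStoppingCriteria(ngram_size=8, max_repeats=4)])
out = model.generate(**inputs, max_new_tokens=512, do_sample=False, stopping_criteria=stopping)
```

This only catches loops whose period matches the chosen `ngram_size`; a real implementation would probably scan a range of n-gram sizes.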

Your contribution

I am happy to contribute a PR myself, but will not find the time to do so in the next ~6-8 weeks. It doesn't look straightforward, and I am also not too familiar with the deeper parts of the generation code, so it might take me a while.

Vipitis avatar Aug 20 '24 15:08 Vipitis

Hey @Vipitis! A small question: can we use repetition penalty alone to prevent this, instead of forcing generation to stop when an n-gram is repeated?

cc @gante

zucchini-nlp avatar Aug 21 '24 04:08 zucchini-nlp

> can we use only repetition penalty

Likely yes, in most practical settings. But when running generation for eval benchmarks, some benchmarks require greedy decoding. Using any of the generation config parameters changes the tokens you decode, and therefore adds variables to the experiment. A stopping criterion is just that: it stops early once the generation has already failed. There is a non-zero chance that the model somehow recovers and still completes a valid function, but I have not observed that yet.
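To make the distinction concrete, a short sketch (`model`, `inputs`, and `stopping` are placeholders from the example above): `repetition_penalty` rescales the logits, so even greedy decoding yields different tokens, while a stopping criterion leaves every decoded token untouched and only ends the run earlier.

```python
# Changes which tokens the greedy argmax picks:
out_penalized = model.generate(**inputs, do_sample=False, repetition_penalty=1.3)

# Same tokens as plain greedy decoding, just truncated on repetition:
out_stopped = model.generate(**inputs, do_sample=False, stopping_criteria=stopping)
```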

Vipitis avatar Aug 21 '24 07:08 Vipitis

Hey @Vipitis 👋 Thank you for opening this issue!

It does make sense to save compute cycles when we have high confidence that the output won't improve -- repetition is one of those cases. I'd gladly accept a PR that adds that StoppingCriteria :)

(suggestion: we can add a stop_at_repeated_ngram_size flag, or something similar, to GenerationConfig)
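For illustration only, usage of the suggested flag might look like this; the flag name is the suggestion above, not a real transformers option.

```python
from transformers import GenerationConfig

gen_config = GenerationConfig(
    max_new_tokens=512,
    do_sample=False,
    stop_at_repeated_ngram_size=8,  # hypothetical flag: stop once an 8-gram repeats
)
out = model.generate(**inputs, generation_config=gen_config)
```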

gante avatar Aug 21 '24 18:08 gante