
BART generate with min_new_tokens exceeds maximum length

Open vsocrates opened this issue 9 months ago • 0 comments

System Info

  • transformers version: 4.40.2
  • Platform: Linux-4.18.0-477.36.1.el8_8.x86_64-x86_64-with-glibc2.28
  • Python version: 3.10.14
  • Huggingface_hub version: 0.23.0
  • Safetensors version: 0.4.3
  • Accelerate version: 0.30.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.0 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

@ArthurZucker @younesbelkada @gante

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [X] My own task or dataset (give details below)

Reproduction

If I load a fine-tuned BartForConditionalGeneration model and then try to generate text with it, I run into the following warning: "This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (1024). Depending on the model, you may observe exceptions, performance degradation, or nothing at all."

Generation code:

outputs = model.generate(input_ids, attention_mask=attention_mask, num_beams=3, 
                         min_new_tokens=1500,
                         max_new_tokens=2500,
                         # stopping_criteria=stopping_criteria,
                         early_stopping=True)

I was under the impression that, since the BART decoder generates autoregressively, there was no limit on its generation length. Is that not the case?
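For context on why the length is capped even though decoding is autoregressive: BART uses *learned* absolute position embeddings, sized at training time by `max_position_embeddings` (1024 for the pretrained checkpoints), so each generation step looks up its position index in a fixed-size table. The sketch below is a plain-Python toy (hypothetical names, not the transformers implementation) illustrating why requesting `min_new_tokens=1500` can index past that table:

```python
# Toy illustration (not real transformers code): a learned position-embedding
# table is a fixed-size lookup, so positions beyond its length raise an error.
MAX_POSITIONS = 1024  # mirrors BART's max_position_embeddings

# Stand-in for the learned embedding matrix (one 4-dim vector per position).
position_table = [[0.0] * 4 for _ in range(MAX_POSITIONS)]

def embed_position(step):
    # Decoders with learned absolute positions do the equivalent of this
    # lookup at every generation step.
    return position_table[step]

embed_position(1023)      # fine: last valid position
try:
    embed_position(1500)  # forcing 1500+ new tokens walks past the table
except IndexError:
    print("position 1500 is outside the learned embedding table")
```

Models with relative or rotary position encodings can extrapolate further, but a learned absolute table cannot, which is why the warning fires here.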

Expected behavior

Generation of arbitrary length should complete without a CUDA or out-of-bounds indexing error.

vsocrates · May 11 '24 13:05