keras-io: MAX_INPUT_LENGTH is automatically set to 512 after fine-tuning even though I initially set it to 1024

Open seungjun-green opened this issue 2 years ago • 1 comments

Following the tutorial Abstractive Summarization with Hugging Face Transformers, I created a text summarization model by fine-tuning t5-small on a custom dataset, with MAX_INPUT_LENGTH = 1024.

But when I try the model like this:

from transformers import pipeline

summarizer = pipeline("summarization", model=model, tokenizer=tokenizer, framework="tf")

summarizer(
    raw_datasets["test"][0]["original"],
    min_length=MIN_TARGET_LENGTH,
    max_length=MAX_TARGET_LENGTH,
)

This is the result I got:

Token indices sequence length is longer than the specified maximum sequence length for this model (655 > 512). Running this sequence through the model will result in indexing errors
[{'summary_text': 'The Pembina Trail was a 19th century trail used by Métis and European settlers to travel between Fort Garry and Fort Pemmbina in what is now the Canadian province of Manitoba and U.S. state of North Dakota. It was part of the larger Red River Trail network and is now a new version of it is now called the Lord Selkirk and Pembinea Highways in Manitoba. It is important because it allowed people to travel to and from the Red River for social or political reasons.'}]

Why does the warning above say the maximum sequence length for this model is 512 when I initially set it to 1024?
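
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the warning compares the input against the limit stored on the tokenizer (`tokenizer.model_max_length`), which is separate from the `MAX_INPUT_LENGTH` constant used for truncation during preprocessing. The sketch below compares a text's token count against both limits. The helper `describe_length_limits` and the `WhitespaceTokenizer` stand-in are hypothetical names introduced only for illustration; the stand-in mimics the two tokenizer features the helper relies on (calling the tokenizer and reading `model_max_length`), so the sketch runs without downloading a model.

```python
from typing import Any, Dict


def describe_length_limits(tokenizer: Any, text: str, max_input_length: int) -> Dict[str, Any]:
    """Compare a text's token count with two separate limits.

    tokenizer.model_max_length is the limit stored on the tokenizer itself;
    max_input_length is the constant used for truncation during preprocessing.
    """
    # Tokenizing without truncation reveals the true length of the input.
    token_count = len(tokenizer(text)["input_ids"])
    tokenizer_limit = tokenizer.model_max_length
    return {
        "token_count": token_count,
        "tokenizer_limit": tokenizer_limit,
        "preprocessing_limit": max_input_length,
        "exceeds_tokenizer_limit": token_count > tokenizer_limit,
        "exceeds_preprocessing_limit": token_count > max_input_length,
    }


class WhitespaceTokenizer:
    """Hypothetical stand-in: one token per whitespace-separated word."""

    # Assumed value, chosen to match the limit quoted in the warning.
    model_max_length = 512

    def __call__(self, text: str) -> Dict[str, Any]:
        return {"input_ids": list(range(len(text.split())))}


# 655 words, matching the sequence length reported in the warning.
report = describe_length_limits(WhitespaceTokenizer(), "word " * 655, max_input_length=1024)
print(report)
```

With the real objects, passing the fine-tuned model's `tokenizer` and `raw_datasets["test"][0]["original"]` to the same helper should show whether the tokenizer's own limit is 512. If it is, the 1024 setting only affected the preprocessing step, and passing `truncation=True` in the `summarizer(...)` call may be enough to silence the warning, at the cost of truncating the input to the tokenizer's limit.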

Apr 06 '23 22:04 seungjun-green

What model and tokenizer are you using?

Dec 04 '23 12:12 poolkit

Hi @seungjun-green, thanks for reporting this.

Could you provide a reproducible Colab notebook that shows the error, so we can investigate this issue?

Oct 10 '25 03:10 dhantule

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

Oct 25 '25 02:10 github-actions[bot]

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

Nov 08 '25 02:11 github-actions[bot]

