Why is max_input_length = 128 instead of 512 in examples/translation.ipynb?

Open Majdoddin opened this issue 2 years ago • 0 comments

@sgugger @lewtun Why are the inputs truncated at 128 tokens when the model can take 512 tokens?

max_input_length = 128
model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

This is the code I used to check the model's maximum input length:

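from transformers import AutoModelForSeq2SeqLM

model_checkpoint = "Helsinki-NLP/opus-mt-en-ro"
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
# Printing the model shows its module tree, including the embedding layer.
print(model)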

Output:

MarianMTModel(
  (model): MarianModel(
    (shared): Embedding(59543, 512, padding_idx=59542)
    ...
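
One caveat on reading that output: in `Embedding(59543, 512, padding_idx=59542)` the first number is the vocabulary size and the second is the embedding width, so the 512 shown there is not by itself a sequence-length limit. A more direct check may be to read the model config, assuming it exposes `d_model` and `max_position_embeddings` as Marian configs normally do. The sketch below is only an illustration: `length_limits` and `report_for_checkpoint` are hypothetical helper names, and the demo values are made up to show that the two fields are independent.

```python
from types import SimpleNamespace


def length_limits(config):
    """Return the embedding width and position limit stored on a config object.

    Works on any object with these attributes; missing ones come back as None.
    """
    return {
        "d_model": getattr(config, "d_model", None),
        "max_position_embeddings": getattr(config, "max_position_embeddings", None),
    }


def report_for_checkpoint(model_checkpoint):
    """Fetch a checkpoint's config from the Hub and report the same two fields."""
    # Imported lazily so length_limits() can be used without transformers installed.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_checkpoint)  # needs network access
    return length_limits(config)


# Stand-in config with made-up values (not taken from any real checkpoint).
demo = SimpleNamespace(d_model=512, max_position_embeddings=1024)
print(length_limits(demo))
```

For the checkpoint in this issue, the call would be `report_for_checkpoint("Helsinki-NLP/opus-mt-en-ro")`; the values it returns are the ones to compare against `max_input_length`.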

Majdoddin, Apr 25 '23 09:04