Why is max_input_length = 128 instead of 512 in examples/translation.ipynb?
@sgugger @lewtun Why are the inputs truncated at 128 tokens when the model can take 512 tokens?
max_input_length = 128
model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)
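To make concrete what this setting costs, here is a minimal pure-Python sketch of truncation at a fixed length. It is only a simplified model of `truncation=True`, not the tokenizer's actual implementation: `truncate_ids` is a hypothetical helper, the token ids are made up, and `eos_id=0` is a stand-in value.

```python
def truncate_ids(token_ids, max_length, eos_id):
    """Return token_ids plus a trailing EOS, cut so the result has at most max_length items.

    Hypothetical helper: a simplified stand-in for tokenizer truncation.
    """
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(token_ids) + 1 <= max_length:
        # Short enough already: keep everything and append EOS.
        return list(token_ids) + [eos_id]
    # Too long: keep the first max_length - 1 ids, then append EOS.
    return list(token_ids[: max_length - 1]) + [eos_id]


long_input = list(range(1, 301))  # 300 made-up token ids
short_input = [5, 6, 7]

truncated = truncate_ids(long_input, max_length=128, eos_id=0)
kept = truncate_ids(short_input, max_length=128, eos_id=0)

print(len(truncated), len(kept))  # 128 4
```

Under this sketch, anything past the first 127 tokens of a long sentence never reaches the model, while short sentences are unaffected.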
This is the code I used to check the maximum input length of the model:
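from transformers import AutoModelForSeq2SeqLM

model_checkpoint = "Helsinki-NLP/opus-mt-en-ro"
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
# Printing the model shows its module structure, including layer shapes
print(model)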
output:
MarianMTModel(
  (model): MarianModel(
    (shared): Embedding(59543, 512, padding_idx=59542)
    ...
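
One caveat when reading this output: in PyTorch, `Embedding(59543, 512, ...)` lists the number of embeddings (the vocabulary size) followed by the embedding dimension, so the 512 here is the width of each token vector and not, by itself, a limit on sequence length. The positional limit is normally recorded in the model config (for Marian, the `max_position_embeddings` attribute), so that is probably the better place to confirm the 512-token figure. The toy sketch below uses made-up sizes and a hypothetical `embed` helper to show that an embedding table's shape does not constrain how many tokens are looked up.

```python
import random

VOCAB_SIZE = 10   # stand-in for 59543 (number of rows in the table)
HIDDEN_SIZE = 4   # stand-in for 512 (length of each token vector)

random.seed(0)
# One row of HIDDEN_SIZE floats per vocabulary entry.
table = [[random.random() for _ in range(HIDDEN_SIZE)] for _ in range(VOCAB_SIZE)]


def embed(token_ids):
    """Look up one vector per token id (hypothetical helper)."""
    return [table[i] for i in token_ids]


short = embed([1, 2, 3])
long = embed([i % VOCAB_SIZE for i in range(1000)])

print(len(short), len(short[0]))  # 3 4
print(len(long), len(long[0]))    # 1000 4
```

The lookup works for 3 tokens or 1000 tokens alike; only the vector width stays fixed at `HIDDEN_SIZE`.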