`resize_token_embeddings` in NLLB leading to empty outputs

Open bhavitvyamalik opened this issue 1 year ago • 0 comments

System Info

transformers version: 4.42.3
Platform: Linux-4.18.0-513.11.1.el8_9.x86_64-x86_64-with-glibc2.28
Python version: 3.10.14
Huggingface_hub version: 0.23.4
Safetensors version: 0.4.3
Accelerate version: 0.33.0
Accelerate config: not found
PyTorch version (GPU?): 2.3.1+cu121 (False)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?: yes

Who can help?

@arthur

Information

[X] The official example scripts
[ ] My own modified scripts

Tasks

[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
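# Pass 18 extra special tokens (code_0 ... code_17) when loading the tokenizer.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", additional_special_tokens=[f"code_{i}" for i in range(18)], use_fast=True)
# Resize the model's embedding matrix to match the enlarged tokenizer.
model.resize_token_embeddings(len(tokenizer))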

After resizing, I run generation with an official example:

article = "Şeful ONU spune că nu există o soluţie militară în Siria"
inputs = tokenizer(article, return_tensors="pt")

translated_tokens = model.generate(
    **inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"), max_length=30
)
tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]

Output is: 't t t t t t t t t t'

Expected behavior

Generation should work without errors and return a real translation rather than repeated tokens. Interestingly, if I add only 2 new tokens, generation works fine.
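
One thing that may be worth checking, on the assumption (not confirmed anywhere in this report) that the freshly added embedding rows are involved: overwrite the new rows with the mean of the pretrained rows and see whether generation changes. The sketch below shows only that initialization step on a toy tensor. `init_new_rows_to_mean` is a hypothetical helper written for this illustration, not a transformers API.

```python
import torch


def init_new_rows_to_mean(weight: torch.Tensor, num_old: int) -> None:
    """Overwrite rows [num_old:] of `weight` in place with the mean of rows [:num_old]."""
    if not 0 < num_old <= weight.shape[0]:
        raise ValueError("num_old must be between 1 and weight.shape[0]")
    with torch.no_grad():
        # Mean over the pretrained rows, kept as shape (1, dim) so it broadcasts.
        mean_row = weight[:num_old].mean(dim=0, keepdim=True)
        weight[num_old:] = mean_row


# Toy stand-in for a resized embedding matrix: 5 "pretrained" rows plus 3 new ones.
torch.manual_seed(0)
old_vocab, new_tokens, dim = 5, 3, 4
weight = torch.randn(old_vocab + new_tokens, dim)
expected = weight[:old_vocab].mean(dim=0).clone()

init_new_rows_to_mean(weight, old_vocab)
```

With the model from the reproduction, the same helper would presumably be applied to `model.get_input_embeddings().weight`, with `num_old` set to the number of embedding rows before the call to `resize_token_embeddings`. If the output projection does not share weights with the input embeddings, it would need the same treatment.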

Aug 22 '24 19:08 bhavitvyamalik