transformers
transformers copied to clipboard
`resize_token_embeddings` in NLLB leading to empty outputs
System Info
-
transformersversion: 4.42.3 - Platform: Linux-4.18.0-513.11.1.el8_9.x86_64-x86_64-with-glibc2.28
- Python version: 3.10.14
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.3
- Accelerate version: 0.33.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.3.1+cu121 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: yes
Who can help?
@arthur
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the
examplesfolder (such as GLUE/SQuAD, ...) - [ ] My own task or dataset (give details below)
Reproduction
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", additional_special_tokens=[f"code_{i}" for i in range(18)], use_fast=True)
model.resize_token_embeddings(len(tokenizer))
After resizing, generation using an official example:
article = "Şeful ONU spune că nu există o soluţie militară în Siria"
inputs = tokenizer(article, return_tensors="pt")
translated_tokens = model.generate(
**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"), max_length=30
)
tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
Output is: 't t t t t t t t t t'
Expected behavior
Generation should work without any errors. One interesting thing to note here is if I add just 2 new tokens, it works fine.