
BlenderbotSmall incorrect usage of start and end tokens

Open · xenova opened this issue 2 years ago · 0 comments

System Info

  • transformers version: 4.27.2
  • Platform: Windows-10-10.0.19041-SP0
  • Python version: 3.8.3
  • Huggingface_hub version: 0.12.0
  • PyTorch version (GPU?): 1.13.0+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help?

@ArthurZucker @younesbelkada @Narsil

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

As stated in the documentation (https://huggingface.co/docs/transformers/model_doc/blenderbot-small#transformers.BlenderbotSmallForConditionalGeneration.forward.example), the model should use </s> and <s> to separate the user input and the model's response:

from transformers import AutoTokenizer, BlenderbotSmallForConditionalGeneration

mname = "facebook/blenderbot_small-90M"
model = BlenderbotSmallForConditionalGeneration.from_pretrained(mname)
tokenizer = AutoTokenizer.from_pretrained(mname)
UTTERANCE = "My friends are cool but they eat too many carbs."
print("Human: ", UTTERANCE)

inputs = tokenizer([UTTERANCE], return_tensors="pt")
reply_ids = model.generate(**inputs)
print("Bot: ", tokenizer.batch_decode(reply_ids, skip_special_tokens=True)[0])

REPLY = "I'm not sure"
print("Human: ", REPLY)

NEXT_UTTERANCE = (
    "My friends are cool but they eat too many carbs.</s> <s>what kind of carbs do they eat? "
    "i don't know much about carbs</s> "
    "<s> I'm not sure."
)
inputs = tokenizer([NEXT_UTTERANCE], return_tensors="pt")
next_reply_ids = model.generate(**inputs)
print("Bot: ", tokenizer.batch_decode(next_reply_ids, skip_special_tokens=True)[0])

However, neither of these tokens is present in the tokenizer's vocabulary or its special tokens map.
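This can be checked directly by inspecting the tokenizer (a minimal sketch; it downloads the facebook/blenderbot_small-90M checkpoint from the Hub, the same model used in the example above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/blenderbot_small-90M")

# Special tokens the tokenizer actually defines (bos/eos/unk/pad etc.)
print(tokenizer.special_tokens_map)

# Check whether the documented separators exist in the vocabulary
vocab = tokenizer.get_vocab()
print("</s> in vocab:", "</s>" in vocab)
print("<s> in vocab:", "<s>" in vocab)
print("__end__ in vocab:", "__end__" in vocab)
print("__start__ in vocab:", "__start__" in vocab)
```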

I assume they should be replaced with __start__ and __end__?
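If that assumption holds, the multi-turn prompt from the example would presumably be rewritten along these lines (a hypothetical sketch only; I have not verified that this matches the format used at training time):

```python
# Hypothetical corrected history, substituting the tokenizer's actual
# bos/eos-style tokens for the documented <s> and </s> separators
NEXT_UTTERANCE = (
    "My friends are cool but they eat too many carbs.__end__ __start__what kind of carbs do they eat? "
    "i don't know much about carbs__end__ "
    "__start__ I'm not sure."
)
```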


I have also tried using the ConversationalPipeline, following the steps outlined here, but I always get nonsensical results.

Even when trying the hosted inference API for the model (https://huggingface.co/facebook/blenderbot_small-90M), it either repeats itself or fails to follow the conversation.

Expected behavior

The documented tokens should match the tokenizer's actual special tokens, and the chatbot should engage in more coherent conversation.

xenova · Mar 21 '23 19:03