BioGPT
Tokenizer encodes special tokens as a two-element list
Using the Hugging Face implementation of the BioGPT tokenizer, I expected encoding a single special token to return one element, but I got two.
from transformers import BioGptTokenizer
tokenizer = BioGptTokenizer.from_pretrained('microsoft/biogpt')
tokenizer.encode(tokenizer.eos_token)
output: [2, 2]
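A possible explanation, not confirmed in this report: `encode` adds the model's special tokens by default (`add_special_tokens=True`), so the extra `2` may be a separator/EOS id placed in front of the id of the token that was passed in. The sketch below shows two ways to check this. The helper names `encode_plain` and `special_token_id` are hypothetical and only wrap existing tokenizer methods.

```python
def encode_plain(tokenizer, text):
    """Encode text without the special tokens that encode() adds by default.

    Assumption: the second id in the report comes from add_special_tokens=True.
    """
    return tokenizer.encode(text, add_special_tokens=False)


def special_token_id(tokenizer, token):
    """Look up a single token's id directly in the vocabulary."""
    return tokenizer.convert_tokens_to_ids(token)
```

If the assumption holds, calling `encode_plain(tokenizer, tokenizer.eos_token)` with the tokenizer from the snippet above should return a one-element list, and `special_token_id(tokenizer, tokenizer.eos_token)` should return that same id as a plain integer.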