biobert-pytorch
dmis-lab/biobert-base-cased-v1.1 Tokenizer lower cases the input
Hi,
Thank you for releasing this codebase.
I noticed that when we load dmis-lab/biobert-base-cased-v1.1 from HF Models with BertTokenizer.from_pretrained, the tokenizer defaults to do_lower_case=True, even though this is a cased model. The lack of a tokenizer_config.json here compared to this could be the reason.
Is this behavior intended? This is unexpected for a user unless they probe for it.
I'm using transformers==4.12.5.
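To illustrate the default I mean, here is a minimal sketch that needs no download. It assumes a transformers 4.x install; the toy vocabulary and the word "Aspirin" are made up for illustration and are not taken from the BioBERT vocab.

```python
import os
import tempfile

from transformers import BertTokenizer

# Hypothetical toy vocabulary (one token per line), not the real BioBERT vocab.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "Aspirin", "aspirin"]

with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.txt")
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(vocab) + "\n")

    # No tokenizer_config.json is present, so the constructor default
    # (do_lower_case=True) is what gets used.
    default_tok = BertTokenizer(vocab_path)

    # Explicit override that preserves case.
    cased_tok = BertTokenizer(vocab_path, do_lower_case=False)

    lowercased = default_tok.tokenize("Aspirin")
    cased = cased_tok.tokenize("Aspirin")

print("default:", default_tok.do_lower_case, lowercased)
print("override:", cased_tok.do_lower_case, cased)
```

As a workaround for the Hub model, passing do_lower_case=False to BertTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1", ...) should have the same effect, assuming the keyword is forwarded to the tokenizer constructor as it is for other init arguments.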
Thanks, Saad