
New issue with bert tokenizer

Open · aubluce opened this issue 4 years ago · 1 comment

```
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
     10                           multi_gpu=False,
     11                           multi_label=True,
---> 12                           model_type='bert')

4 frames
/usr/local/lib/python3.6/dist-packages/fast_bert/data_cls.py in __init__(self, data_dir, label_dir, tokenizer, train_file, val_file, test_data, label_file, text_col, label_col, batch_size_per_gpu, max_seq_length, multi_gpu, multi_label, backend, model_type, logger, clear_cache, no_cache)
    365         if isinstance(tokenizer, str):
    366             # instantiate the new tokeniser object using the tokeniser name
--> 367             tokenizer = AutoTokenizer.from_pretrained(tokenizer, use_fast=True)
    368
    369         self.tokenizer = tokenizer

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    193             if isinstance(config, config_class):
    194                 if tokenizer_class_fast and use_fast:
--> 195                     return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
    196                 else:
    197                     return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py in from_pretrained(cls, *inputs, **kwargs)
    391
    392         """
--> 393         return cls._from_pretrained(*inputs, **kwargs)
    394
    395     @classmethod

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py in _from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
    542         # Instantiate tokenizer.
    543         try:
--> 544             tokenizer = cls(*init_inputs, **init_kwargs)
    545         except OSError:
    546             raise OSError(

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_bert.py in __init__(self, vocab_file, do_lower_case, do_basic_tokenize, never_split, unk_token, sep_token, pad_token, cls_token, mask_token, clean_text, tokenize_chinese_chars, add_special_tokens, strip_accents, wordpieces_prefix, **kwargs)
    618                 strip_accents=strip_accents,
    619                 lowercase=do_lower_case,
--> 620                 wordpieces_prefix=wordpieces_prefix,
    621             ),
    622             unk_token=unk_token,

TypeError: __init__() got an unexpected keyword argument 'add_special_tokens'
```
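The traceback suggests a version mismatch: `BertTokenizerFast.__init__` forwards keyword arguments (here `add_special_tokens`) that the underlying tokenizer class in the installed `tokenizers`/`transformers` combination no longer accepts. The failure mode, and one defensive fix (dropping keywords the target `__init__` does not declare), can be sketched without the libraries involved; `Tokenizer` and `filtered_init` below are hypothetical names for illustration only, not fast-bert or transformers APIs:

```python
import inspect

class Tokenizer:
    # Stand-in for a tokenizer __init__ that does NOT accept
    # add_special_tokens (mirrors the signature clash in the traceback).
    def __init__(self, vocab_file, lowercase=True):
        self.vocab_file = vocab_file
        self.lowercase = lowercase

def filtered_init(cls, **kwargs):
    # Keep only keyword arguments that the target __init__ declares,
    # silently dropping the rest.
    accepted = set(inspect.signature(cls.__init__).parameters)
    return cls(**{k: v for k, v in kwargs.items() if k in accepted})

# Direct call reproduces the failure:
try:
    Tokenizer(vocab_file="vocab.txt", add_special_tokens=True)
except TypeError as exc:
    print(exc)  # message names the unexpected keyword argument

# Filtering the kwargs first avoids it:
tok = filtered_init(Tokenizer, vocab_file="vocab.txt", add_special_tokens=True)
print(tok.vocab_file)
```

In practice, since `data_cls.py` only calls `AutoTokenizer.from_pretrained` when `tokenizer` is a string (see frame above), a likely workaround is to construct the tokenizer object yourself and pass it to `BertDataBunch` directly, or to pin `transformers` to a version fast-bert was tested against.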

aubluce avatar Apr 28 '20 03:04 aubluce

This is not enough information. Kindly provide steps to reproduce the issue.

aaronbriel avatar May 08 '20 18:05 aaronbriel