bunkai
AttributeError: 'JanomeSubwordsTokenizer' object has no attribute 'vocab'
I get the following error while following the README. I hope the steps and environment shown below make it clear what I am doing; let me know if you need more information.
```
Python 3.11.6 (main, Nov  2 2023, 04:39:40) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from bunkai import Bunkai
>>> from pathlib import Path
>>> bunkai = Bunkai(path_model=Path('bunkai-model-directory'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/lib/python3.11/site-packages/bunkai/algorithm/bunkai_sbd/bunkai_sbd.py", line 78, in __init__
    _annotators.insert(_idxs[0] + 1, LinebreakAnnotator(path_model=path_model))
  File "/opt/homebrew/lib/python3.11/site-packages/bunkai/algorithm/bunkai_sbd/annotator/linebreak_annotator.py", line 16, in __init__
    self.linebreak_detector = Predictor(modelpath=path_model)
  File "/opt/homebrew/lib/python3.11/site-packages/bunkai/algorithm/lbd/predict.py", line 43, in __init__
    self.tokenizer = JanomeSubwordsTokenizer(self.path_tokenizer_model)
  File "/opt/homebrew/lib/python3.11/site-packages/bunkai/algorithm/lbd/custom_tokenizers.py", line 136, in __init__
    super(BertTokenizer, self).__init__(
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/models/bert/tokenization_bert.py", line 240, in get_vocab
    return dict(self.vocab, **self.added_tokens_encoder)
AttributeError: 'JanomeSubwordsTokenizer' object has no attribute 'vocab'
```
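For what it's worth, the traceback looks like an initialization-order issue: the installed `transformers` base `__init__` calls `_add_tokens` → `get_vocab()` (which reads `self.vocab`) before the subclass has had a chance to set that attribute. A minimal, library-free sketch of the same failure pattern (the class names here are illustrative, not bunkai's or transformers' actual code):

```python
class BaseTokenizer:
    """Stands in for the transformers base tokenizer class."""

    def __init__(self):
        # Newer base-class versions touch the vocab during __init__ ...
        self._added_tokens = dict(self.get_vocab())

    def get_vocab(self):
        # ... via a method that reads an attribute the subclass
        # only sets *after* calling super().__init__().
        return dict(self.vocab)


class SubwordsTokenizer(BaseTokenizer):
    """Stands in for a subclass like JanomeSubwordsTokenizer."""

    def __init__(self, vocab):
        super().__init__()   # raises: base reads self.vocab here
        self.vocab = vocab   # too late, base __init__ already ran


try:
    SubwordsTokenizer({"[UNK]": 0})
except AttributeError as e:
    print(e)  # e.g. "'SubwordsTokenizer' object has no attribute 'vocab'"
```

If that reading is right, a `transformers` version that predates the base-class change (or a bunkai release built against the newer API) would avoid the crash, but I have not verified which versions are compatible.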