BioGPT
README example fails: 'NoneType' object has no attribute 'tokenizer'
Hi there, I followed the README example setup script exactly and got the following error. Could you please help me resolve it?
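For reference, the call that triggers the failure is reproduced below (reconstructed from the traceback that follows; the import path is the usual fairseq location for TransformerLanguageModel and may differ slightly from the README):

```python
from fairseq.models.transformer_lm import TransformerLanguageModel

# Load the pre-trained BioGPT checkpoint as in the README example.
m = TransformerLanguageModel.from_pretrained(
    "checkpoints/Pre-trained-BioGPT",
    "checkpoint.pt",
    "data",
    tokenizer='moses',
    bpe='fastbpe',
    bpe_codes="data/bpecodes",
    min_len=100,
    max_len_b=1024,
)
```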
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[2], line 1
----> 1 m = TransformerLanguageModel.from_pretrained(
2 "checkpoints/Pre-trained-BioGPT",
3 "checkpoint.pt",
4 "data",
5 tokenizer='moses',
6 bpe='fastbpe',
7 bpe_codes="data/bpecodes",
8 min_len=100,
9 max_len_b=1024)
File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/models/fairseq_model.py:261, in BaseFairseqModel.from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
238 """
239 Load a :class:`~fairseq.models.FairseqModel` from a pre-trained model
240 file. Downloads and caches the pre-trained model file if needed.
(...)
257 model archive path.
258 """
259 from fairseq import hub_utils
--> 261 x = hub_utils.from_pretrained(
262 model_name_or_path,
263 checkpoint_file,
264 data_name_or_path,
265 archive_map=cls.hub_models(),
266 **kwargs,
267 )
269 cls.upgrade_args(x["args"])
271 logger.info(x["args"])
File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/hub_utils.py:70, in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
67 if "user_dir" in kwargs:
68 utils.import_user_module(argparse.Namespace(user_dir=kwargs["user_dir"]))
---> 70 models, args, task = checkpoint_utils.load_model_ensemble_and_task(
71 [os.path.join(model_path, cpt) for cpt in checkpoint_file.split(os.pathsep)],
72 arg_overrides=kwargs,
73 )
75 return {
76 "args": args,
77 "task": task,
78 "models": models,
79 }
File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/checkpoint_utils.py:279, in load_model_ensemble_and_task(filenames, arg_overrides, task, strict, suffix, num_shards)
277 if not PathManager.exists(filename):
278 raise IOError("Model file not found: {}".format(filename))
--> 279 state = load_checkpoint_to_cpu(filename, arg_overrides)
280 if shard_idx == 0:
281 args = state["args"]
File ~/miniforge3/envs/test_env/lib/python3.10/site-packages/fairseq/checkpoint_utils.py:231, in load_checkpoint_to_cpu(path, arg_overrides)
229 if arg_overrides is not None:
230 for arg_name, arg_val in arg_overrides.items():
--> 231 setattr(args, arg_name, arg_val)
232 state = _upgrade_state_dict(state)
233 return state
AttributeError: 'NoneType' object has no attribute 'tokenizer'
My environment variables are set as:
(test_env) an583@PHS030015 project_dir % conda env config vars list
MOSES = /Users/ayush/project_dir/mosesdecoder
FASTBPE = /Users/ayush/project_dir/fastBPE
I would be grateful for any assistance in resolving this. Thanks!
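From the traceback, the failure happens in load_checkpoint_to_cpu when fairseq tries to setattr the overrides (tokenizer, bpe, ...) onto state["args"], which is evidently None for this checkpoint. A quick way to confirm what the checkpoint actually contains is a sketch like the following (the path assumes the README layout used above):

```python
import torch

# Peek at the raw checkpoint. Newer fairseq releases save the model config
# under "cfg" (OmegaConf) rather than "args"; if "args" is None here, an old
# fairseq (e.g. 0.10.0) cannot apply the arg_overrides and fails with the
# AttributeError shown in the traceback.
state = torch.load("checkpoints/Pre-trained-BioGPT/checkpoint.pt", map_location="cpu")
print(sorted(state.keys()))
print("args:", state.get("args"))
print("cfg present:", "cfg" in state)
```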
I fixed this by reinstalling fairseq 0.12.0. The package had been downgraded to version 0.10.0 while installing sacremoses.
fairseq version == 0.12.0
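Since installing sacremoses can apparently replace fairseq with an older release, it is worth double-checking which version actually ends up importable. A minimal check (assuming fairseq exposes __version__, which recent releases do):

```python
import fairseq

# Installing sacremoses reportedly downgraded fairseq to 0.10.0; verify the
# version that is actually importable. 0.12.0 is the one that worked here.
print(fairseq.__version__)  # expect '0.12.0' (or newer)
```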